NVIDIA Vera Rubin NVL4: specs, availability, and what changes for HPC buyers

NVIDIA's Vera Rubin HPC launch brings Rubin GPUs and Vera CPUs into NVL4 systems for scientific computing, with Q4 2026 availability, 7 exaflops of AI-for-science performance, and 5 petaflops of native FP64 support per rack-scale configuration. This overview separates confirmed specs from the still-undisclosed chip details buyers should wait for.

Sources:...
NVIDIA New Chip Tracker
June 22, 2026 · 10:27 PM
NVIDIA's June 22 release is an HPC launch, not a gaming-card launch: Vera Rubin NVL4 systems are now the named vehicle for putting Rubin GPUs and Vera CPUs into science-focused racks, with vendor systems expected in Q4 2026. NVIDIA is claiming more than 7 exaflops of AI-for-science performance and 5 petaflops of native FP64 support in a rack-scale configuration with up to 144 GPUs.1
The important caveat: NVIDIA still has not disclosed a conventional standalone Rubin GPU spec sheet with die size, CUDA core count, Tensor Core count, RT core count, PCIe card TDP, or MSRP. This launch is therefore best read as a rack and module launch around Rubin, not a retail accelerator-card launch.

What launched

The June ISC announcement puts Vera Rubin into high-performance computing and scientific AI deployments. NVIDIA said the platform is aimed at workloads such as climate modeling, computational fluid dynamics, quantum chemistry, energy exploration, simulation, AI training, inference, streaming data from instruments, and coupled simulation plus real-time analytics.1
The SKU that matters for this channel is Vera Rubin NVL4. NVIDIA describes it as a liquid-cooled, MGX-compatible configuration that connects four Rubin GPUs to two Vera CPUs over NVLink-C2C and targets scientific simulation, AI-for-science training, and AI-for-science inference.2
Image: Rubin GPU render from NVIDIA's Vera Rubin NVL72 product page. NVIDIA has disclosed rack and GPU performance figures, but not CUDA core count or die-size details for the standalone GPU.3

Spec sheet: disclosed versus still missing

Item | Disclosed value | Notes
Rubin GPU compute, NVFP4 inference | 50 PFLOPS per GPU | NVIDIA's NVL72 product page lists 50 PFLOPS for one Rubin GPU and 100 PFLOPS for one Vera Rubin Superchip.3
Rubin GPU compute, FP64 | 33 TFLOPS per GPU | NVIDIA also lists 67 TFLOPS per Vera Rubin Superchip and 2,400 TFLOPS per NVL72 rack.3
Rubin GPU memory | 288 GB HBM4 per GPU | The same spec table gives 576 GB HBM4 per Vera Rubin Superchip and 20.7 TB HBM4 per NVL72 rack.3
Rubin GPU memory bandwidth | 22 TB/s per GPU | NVIDIA lists 44 TB/s per Vera Rubin Superchip and 1,580 TB/s per NVL72 rack.3
NVLink generation | Sixth generation | NVLink bandwidth is listed at 3.6 TB/s per GPU, 7.2 TB/s per superchip, and 260 TB/s of NVLink 6 switch bandwidth per NVL72 rack.3
Vera CPU cores | 88 custom NVIDIA Olympus cores per CPU | One Vera Rubin Superchip pairs two Rubin GPUs with one Vera CPU; one NVL72 rack combines 72 Rubin GPUs with 36 Vera CPUs.3
Vera CPU memory | Up to 1.5 TB LPDDR5X per CPU | NVIDIA's Vera CPU page also states up to 1.2 TB/s LPDDR5X bandwidth and 176 threads from 88 Olympus cores.4
NVL4 composition | Four Rubin GPUs plus two Vera CPUs | NVIDIA says NVL4 connects the four GPUs and two CPUs over NVLink-C2C and fits liquid-cooled MGX modular servers.2
NVL4 availability | Q4 2026 from global system manufacturers | The June 22 release names Bull, Dell Technologies, GIGABYTE, HPE, and Supermicro as system manufacturers for NVL4-based racks.1
Official price | Not disclosed | NVIDIA did not list an MSRP or rack price in the announcement; the press release says features, pricing, availability, and specifications are subject to change.1
For readers used to GeForce or RTX PRO launch coverage, the omissions are just as important as the published numbers. NVIDIA has disclosed throughput, memory, interconnect, rack composition, and availability windows. It has not disclosed the low-level silicon inventory that would make a clean chip-to-chip table against H100, H200, B200, or GB200 possible.
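One quick way to read the spec table is to check whether the superchip and rack figures are simply the per-GPU figures multiplied out. The sketch below does that arithmetic in Python. The dictionary keys and the `scale` helper are names made up for this illustration, and linear scaling is an assumption on our part, not something NVIDIA states.

```python
# Per-GPU figures as quoted in the spec table (NVIDIA NVL72 product page).
PER_GPU = {
    "nvfp4_pflops": 50,
    "fp64_tflops": 33,
    "hbm4_gb": 288,
    "hbm_bw_tbs": 22,
    "nvlink_tbs": 3.6,
}

GPUS_PER_SUPERCHIP = 2   # one Vera Rubin Superchip pairs two Rubin GPUs
GPUS_PER_NVL72 = 72      # one NVL72 rack combines 72 Rubin GPUs


def scale(per_gpu, gpu_count):
    """Multiply each per-GPU figure by a GPU count (assumes linear scaling)."""
    return {name: value * gpu_count for name, value in per_gpu.items()}


superchip = scale(PER_GPU, GPUS_PER_SUPERCHIP)
nvl72 = scale(PER_GPU, GPUS_PER_NVL72)

# Published NVL72 aggregates from the same table (20.7 TB taken as 20,700 GB).
PUBLISHED_NVL72 = {
    "fp64_tflops": 2400,
    "hbm4_gb": 20700,
    "hbm_bw_tbs": 1580,
    "nvlink_tbs": 260,
}

for name, published in PUBLISHED_NVL72.items():
    derived = nvl72[name]
    gap_pct = abs(derived - published) / published * 100
    print(f"{name}: derived {derived:g}, published {published:g}, gap {gap_pct:.1f}%")
```

Under that assumption the derived rack values land within about 1 percent of the published ones, which suggests the published aggregates are rounded multiples of the per-GPU numbers rather than independently measured figures.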

Positioning in NVIDIA's lineup

Vera Rubin sits above the Grace Hopper and Grace Blackwell generations as NVIDIA's next data-center architecture for both AI factories and HPC. NVIDIA's March platform release framed Vera Rubin as seven new chips across compute, networking, and storage: Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, Spectrum-6 Ethernet switch, and Groq 3 LPU.5
The June 22 HPC announcement narrows that architecture to science buyers. Instead of talking mainly about token cost, context windows, and agentic inference, it leads with FP64, CUDA-X libraries, direct liquid cooling, and deployments at supercomputing centers.1
Image: NVIDIA Vera Rubin NVL4 render. NVIDIA positions Vera Rubin NVL4 as the science-oriented module: four Rubin GPUs, two Vera CPUs, NVLink-C2C, and liquid-cooled MGX compatibility.2
The practical buyer split looks like this:
Buyer type | Likely Vera Rubin form | Why it matters
Hyperscale AI labs and cloud providers | Vera Rubin NVL72 and larger POD-scale designs | NVL72 combines 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9, BlueField-4, and NVLink 6 for large-scale AI training and inference.3
National labs and HPC centers | Vera Rubin NVL4 racks | NVIDIA's June 22 release specifically names NVL4 for direct liquid-cooled AI and HPC racks, with Q4 availability from major system vendors.1
CPU-heavy agentic workloads | Vera CPU racks | A Vera CPU rack integrates up to 256 Vera CPUs and supports more than 22,500 concurrent sandbox environments, according to NVIDIA.2
Storage and context-memory infrastructure | Vera BlueField-4 STX | NVIDIA positions BlueField-4 STX as AI-native storage for KV cache and data movement inside Vera Rubin POD-scale systems.2
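The Vera CPU rack row can be sanity-checked against the core counts in the spec table. The sketch below multiplies out the rack's cores and threads and compares them with the sandbox claim. Reading the claim as roughly one sandbox per core is our guess, not an NVIDIA statement, and the variable names are illustrative only.

```python
CPUS_PER_RACK = 256       # "up to 256 Vera CPUs" per Vera CPU rack
CORES_PER_CPU = 88        # Olympus cores per Vera CPU
THREADS_PER_CORE = 2      # 176 threads from 88 cores
SANDBOX_CLAIM = 22_500    # "more than 22,500 concurrent sandbox environments"

total_cores = CPUS_PER_RACK * CORES_PER_CPU
total_threads = total_cores * THREADS_PER_CORE
sandboxes_per_cpu = SANDBOX_CLAIM / CPUS_PER_RACK

print(f"cores per rack:    {total_cores}")
print(f"threads per rack:  {total_threads}")
print(f"sandboxes per CPU: {sandboxes_per_cpu:.1f} (claim / CPU count)")

# Assumption check: does the claim fit within one sandbox per physical core?
print("fits one-per-core:", SANDBOX_CLAIM <= total_cores)
```

A fully populated rack works out to 22,528 cores, just above the 22,500 figure, so the claim is at least consistent with one sandbox per core. Buyers sizing agentic workloads should still ask vendors how sandboxes are actually scheduled.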

What changes versus Grace Hopper and Blackwell

NVIDIA's public comparisons use two different baselines. For HPC, the NVL4 claim is against Grace Hopper: NVIDIA says NVL4 delivers up to 4x performance for scientific simulations, 6x for AI-for-science training, and 8x for AI-for-science inference versus Grace Hopper.2
For AI-factory economics, the NVL72 comparison is against Blackwell. NVIDIA says Vera Rubin NVL72 can train mixture-of-experts models with one-fourth the GPUs compared with GB200 NVL72, deliver up to 10x more tokens per megawatt, and reach one-tenth the cost per million tokens for a specified deep-reasoning workload.3
The engineering change is also physical. NVIDIA's developer deep dive says Vera Rubin NVL72 moves to a redesigned compute tray with a PCB midplane, cable-free and hose-free service access, and an assembly-time reduction from nearly two hours to about five minutes for the compute tray.6
That matters because a rack-scale GPU is no longer just a board-level accelerator. NVIDIA says one Vera Rubin NVL72 rack contains 18 compute trays, nine NVLink switch trays, around 1.3 million individual components, nearly 1,300 chips, 5,000 copper cables in the NVLink spine, and a rack weight of roughly 4,000 pounds.6

Confirmed deployments and availability

The June 22 release names three major HPC adoption tracks. LRZ's Blue Lion will pair Vera Rubin with second-generation exascale-class HPE Cray supercomputing; NVIDIA says it should deliver about 30x the compute power of LRZ's current system and come online in 2027.1
NERSC's Doudna, a U.S. Department of Energy supercomputer at Lawrence Berkeley National Laboratory, will be a Dell Technologies system powered by Vera Rubin and connected to DOE instruments through ESnet.1 Los Alamos has selected Vera Rubin, Vera CPU, and Quantum-X800 InfiniBand for Mission, Vision, and Veritas systems, with HPE building the machines.1
NVIDIA's May production update gives the supply-chain context: Vera Rubin is ramping through hundreds of ecosystem partners, including 150 in Taiwan, across more than 350 factories and 30 countries. Production shipments are set to begin in the fall, while the June 22 HPC release says NVL4-based systems are expected in Q4 2026.7
Image: Vera Rubin science supercomputer rack. NVIDIA's June ISC image frames Vera Rubin as a rack-scale science system, with the release citing 7 exaflops of AI-for-science performance and 5 petaflops of native FP64 support per rack-scale configuration.1
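The headline rack figures line up with the per-GPU numbers if the "up to 144 GPUs" configuration is treated as 144 Rubin GPUs at the published per-GPU rates. The sketch below shows that arithmetic. Two assumptions are ours: that "AI-for-science performance" refers to NVFP4 throughput, and that the rack is built entirely from four-GPU NVL4 modules. Neither is confirmed in the sources reviewed here.

```python
GPUS_IN_RACK = 144            # "up to 144 GPUs" in the June 22 release
GPUS_PER_NVL4 = 4             # four Rubin GPUs per NVL4 module
NVFP4_PFLOPS_PER_GPU = 50     # per-GPU figure from the NVL72 product page
FP64_TFLOPS_PER_GPU = 33      # per-GPU figure from the NVL72 product page

# Assumption: the rack is populated only with NVL4 modules.
nvl4_modules = GPUS_IN_RACK // GPUS_PER_NVL4

# Assumption: the "AI-for-science" figure is NVFP4 throughput summed over GPUs.
ai_exaflops = GPUS_IN_RACK * NVFP4_PFLOPS_PER_GPU / 1000
fp64_petaflops = GPUS_IN_RACK * FP64_TFLOPS_PER_GPU / 1000

print(f"NVL4 modules per rack: {nvl4_modules}")
print(f"AI throughput:  {ai_exaflops:.1f} exaflops (claim: more than 7)")
print(f"FP64 throughput: {fp64_petaflops:.2f} petaflops (claim: 5)")
```

On these assumptions the rack comes to about 7.2 exaflops and 4.75 petaflops, so the 5-petaflop FP64 figure looks like a rounded-up total rather than a separately measured number.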

Buyer's read

For data-center engineers, the cleanest reading is: Rubin has moved from architecture slide to a named, vendor-backed HPC platform. The numbers NVIDIA is willing to publish are rack-level and GPU-throughput numbers: 288 GB HBM4, 22 TB/s HBM bandwidth, 33 TFLOPS FP64, and 50 PFLOPS NVFP4 per Rubin GPU; 4-GPU NVL4 modules for HPC; 72-GPU NVL72 racks for AI factories.3
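This piece quotes no module-level totals for NVL4, so a rough estimate has to be built from the per-GPU and per-CPU figures. The sketch below does that for a module of four Rubin GPUs and two Vera CPUs. `ModuleEstimate` and `estimate_module` are hypothetical names for this illustration, and the totals are back-of-envelope products, not NVIDIA-published NVL4 specifications.

```python
from dataclasses import dataclass

# Per-device figures quoted in this article (NVIDIA product pages).
GPU_HBM4_GB = 288
GPU_HBM_BW_TBS = 22
GPU_FP64_TFLOPS = 33
GPU_NVFP4_PFLOPS = 50
CPU_CORES = 88
CPU_LPDDR5X_TB_MAX = 1.5


@dataclass(frozen=True)
class ModuleEstimate:
    hbm4_gb: int
    hbm_bw_tbs: int
    fp64_tflops: int
    nvfp4_pflops: int
    cpu_cores: int
    lpddr5x_tb_max: float


def estimate_module(gpus: int = 4, cpus: int = 2) -> ModuleEstimate:
    """Estimate module totals by simple multiplication (an assumption)."""
    return ModuleEstimate(
        hbm4_gb=gpus * GPU_HBM4_GB,
        hbm_bw_tbs=gpus * GPU_HBM_BW_TBS,
        fp64_tflops=gpus * GPU_FP64_TFLOPS,
        nvfp4_pflops=gpus * GPU_NVFP4_PFLOPS,
        cpu_cores=cpus * CPU_CORES,
        lpddr5x_tb_max=cpus * CPU_LPDDR5X_TB_MAX,
    )


nvl4 = estimate_module()
print(nvl4)
```

Treat the result (for example, 1,152 GB of HBM4 and 132 TFLOPS of FP64 per module) as a planning placeholder until system vendors publish their own NVL4 configurations.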
For procurement teams, the missing fields remain significant. There is no official rack price, no public street pricing, no standalone Rubin PCIe or SXM board detail, and no disclosed power envelope in the sources reviewed here. Treat Q4 2026 NVL4 availability as the launch signal, then wait for Bull, Dell, GIGABYTE, HPE, and Supermicro configurations before comparing total cost, cooling requirements, and cluster topology.
