NVIDIA Vera Rubin NVL4: specs, availability, and what changes for HPC buyers

NVIDIA's June 22 release is an HPC launch, not a gaming-card launch: Vera Rubin NVL4 systems are now the named vehicle for putting Rubin GPUs and Vera CPUs into science-focused racks, with vendor systems expected in Q4 2026. NVIDIA is claiming more than 7 exaflops of AI-for-science performance and 5 petaflops of native FP64 support in a rack-scale configuration with up to 144 GPUs.1

The important caveat: NVIDIA still has not disclosed a conventional standalone Rubin GPU spec sheet with die size, CUDA core count, Tensor Core count, RT core count, PCIe card TDP, or MSRP. This launch is therefore best read as a rack and module launch around Rubin, not a retail accelerator-card launch.

What launched

The June ISC announcement puts Vera Rubin into high-performance computing and scientific AI deployments. NVIDIA said the platform is aimed at workloads such as climate modeling, computational fluid dynamics, quantum chemistry, energy exploration, simulation, AI training, inference, streaming data from instruments, and coupled simulation plus real-time analytics.1

The SKU that matters for this channel is Vera Rubin NVL4. NVIDIA describes it as a liquid-cooled, MGX-compatible configuration that connects four Rubin GPUs to two Vera CPUs over NVLink-C2C and targets scientific simulation, AI-for-science training, and AI-for-science inference.2

Image: Rubin GPU render from NVIDIA's Vera Rubin NVL72 product page. NVIDIA has disclosed rack and GPU performance figures, but not CUDA core count or die-size details for the standalone GPU.3

Spec sheet: disclosed versus still missing

Item	Disclosed value	Notes
Rubin GPU compute, NVFP4 inference	50 PFLOPS per GPU	NVIDIA's NVL72 product page lists 50 PFLOPS for one Rubin GPU and 100 PFLOPS for one Vera Rubin Superchip.3
Rubin GPU compute, FP64	33 TFLOPS per GPU	NVIDIA also lists 67 TFLOPS per Vera Rubin Superchip and 2,400 TFLOPS per NVL72 rack.3
Rubin GPU memory	288 GB HBM4 per GPU	The same spec table gives 576 GB HBM4 per Vera Rubin Superchip and 20.7 TB HBM4 per NVL72 rack.3
Rubin GPU memory bandwidth	22 TB/s per GPU	NVIDIA lists 44 TB/s per Vera Rubin Superchip and 1,580 TB/s per NVL72 rack.3
NVLink generation	Sixth generation	NVLink bandwidth is listed at 3.6 TB/s per GPU, 7.2 TB/s per superchip, and 260 TB/s of NVLink 6 switch bandwidth per NVL72 rack.3
Vera CPU cores	88 custom NVIDIA Olympus cores per CPU	One Vera Rubin Superchip pairs two Rubin GPUs with one Vera CPU; one NVL72 rack combines 72 Rubin GPUs with 36 Vera CPUs.3
Vera CPU memory	Up to 1.5 TB LPDDR5X per CPU	NVIDIA's Vera CPU page also states up to 1.2 TB/s LPDDR5X bandwidth and 176 threads from 88 Olympus cores.4
NVL4 composition	Four Rubin GPUs plus two Vera CPUs	NVIDIA says NVL4 connects the four GPUs and two CPUs over NVLink-C2C and fits liquid-cooled MGX modular servers.2
NVL4 availability	Q4 2026 from global system manufacturers	The June 22 release names Bull, Dell Technologies, GIGABYTE, HPE, and Supermicro as system manufacturers for NVL4-based racks.1
Official price	Not disclosed	NVIDIA did not list an MSRP or rack price in the announcement; the press release says features, pricing, availability, and specifications are subject to change.1
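
The per-GPU and per-rack figures in the table can be cross-checked with simple multiplication. The sketch below is a sanity check, not NVIDIA's method: it assumes naive linear scaling across 72 GPUs and decimal units (1 TB = 1,000 GB), and the names `PER_GPU`, `NVL72_LISTED`, `scaled`, and `relative_gap` are illustrative only.

```python
# Per-GPU figures as listed in the table above (NVIDIA NVL72 product page).
PER_GPU = {
    "fp64_tflops": 33,
    "hbm4_gb": 288,
    "hbm_bw_tbs": 22,
    "nvlink_bw_tbs": 3.6,
}

# Rack-level figures from the same table, converted to the same units.
NVL72_LISTED = {
    "fp64_tflops": 2400,
    "hbm4_gb": 20700,      # 20.7 TB, assuming 1 TB = 1000 GB
    "hbm_bw_tbs": 1580,
    "nvlink_bw_tbs": 260,
}

GPUS_PER_RACK = 72


def scaled(per_gpu: dict, gpus: int) -> dict:
    """Multiply every per-GPU figure by a GPU count (naive linear scaling)."""
    return {key: value * gpus for key, value in per_gpu.items()}


def relative_gap(computed: float, listed: float) -> float:
    """Fractional difference between a computed and a listed figure."""
    return abs(computed - listed) / listed


rack = scaled(PER_GPU, GPUS_PER_RACK)
for key, listed in NVL72_LISTED.items():
    gap = relative_gap(rack[key], listed)
    print(f"{key}: computed {rack[key]:.1f}, listed {listed}, gap {gap:.1%}")
```

Every gap comes out at about 1% or less, which is consistent with the rack figures being rounded products of the per-GPU figures. It says nothing about sustained performance on real workloads.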

For readers used to GeForce or RTX PRO launch coverage, the omissions are just as important as the published numbers. NVIDIA has disclosed throughput, memory, interconnect, rack composition, and availability windows. It has not disclosed the low-level silicon inventory that would make a clean chip-to-chip table against H100, H200, B200, or GB200 possible.

Positioning in NVIDIA's lineup

Vera Rubin sits above the Grace Hopper and Grace Blackwell generations as NVIDIA's next data-center architecture for both AI factories and HPC. NVIDIA's March platform release framed Vera Rubin as seven new chips across compute, networking, and storage: Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, Spectrum-6 Ethernet switch, and Groq 3 LPU.5

The June 22 HPC announcement narrows that architecture to science buyers. Instead of talking mainly about token cost, context windows, and agentic inference, it leads with FP64, CUDA-X libraries, direct liquid cooling, and deployments at supercomputing centers.1

Image: NVIDIA Vera Rubin NVL4 render. NVIDIA positions Vera Rubin NVL4 as the science-oriented module: four Rubin GPUs, two Vera CPUs, NVLink-C2C, and liquid-cooled MGX compatibility.2

The practical buyer split looks like this:

Buyer type	Likely Vera Rubin form	Why it matters
Hyperscale AI labs and cloud providers	Vera Rubin NVL72 and larger POD-scale designs	NVL72 combines 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9, BlueField-4, and NVLink 6 for large-scale AI training and inference.3
National labs and HPC centers	Vera Rubin NVL4 racks	NVIDIA's June 22 release specifically names NVL4 for direct liquid-cooled AI and HPC racks, with Q4 2026 availability from major system vendors.1
CPU-heavy agentic workloads	Vera CPU racks	A Vera CPU rack integrates up to 256 Vera CPUs and supports more than 22,500 concurrent sandbox environments, according to NVIDIA.2
Storage and context-memory infrastructure	Vera BlueField-4 STX	NVIDIA positions BlueField-4 STX as AI-native storage for KV cache and data movement inside Vera Rubin POD-scale systems.2

What changes versus Grace Hopper and Blackwell

NVIDIA's public comparisons use two different baselines. For HPC, the NVL4 claim is against Grace Hopper: NVIDIA says NVL4 delivers up to 4x performance for scientific simulations, 6x for AI-for-science training, and 8x for AI-for-science inference versus Grace Hopper.2

For AI-factory economics, the NVL72 comparison is against Blackwell. NVIDIA says Vera Rubin NVL72 can train mixture-of-experts models with one-fourth the GPUs compared with GB200 NVL72, deliver up to 10x more tokens per megawatt, and reach one-tenth the cost per million tokens for a specified deep-reasoning workload.3

The engineering change is also physical. NVIDIA's developer deep dive says Vera Rubin NVL72 moves to a redesigned compute tray with a PCB midplane, cable-free and hose-free service access, and an assembly-time reduction from nearly two hours to about five minutes for the compute tray.6

That matters because a rack-scale GPU is no longer just a board-level accelerator. NVIDIA says one Vera Rubin NVL72 rack contains 18 compute trays, nine NVLink switch trays, around 1.3 million individual components, nearly 1,300 chips, 5,000 copper cables in the NVLink spine, and a rack weight of roughly 4,000 pounds.6
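
The rack figures above imply a per-tray composition that NVIDIA's quoted numbers do not state directly. The sketch below derives it under one loud assumption: that the 72 GPUs and 36 CPUs are split evenly across the 18 compute trays. The helper name `per_tray` is illustrative, and the kilogram figure is only a unit conversion of the "roughly 4,000 pounds" quoted above.

```python
# Rack composition figures quoted in this article for Vera Rubin NVL72.
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36
COMPUTE_TRAYS = 18
RACK_WEIGHT_LB = 4000  # "roughly 4,000 pounds"

LB_TO_KG = 0.45359237  # exact definition of the international pound


def per_tray(total: int, trays: int) -> int:
    """Split a rack total evenly across trays (assumption: identical trays).

    Raises ValueError if the total does not divide evenly.
    """
    if total % trays != 0:
        raise ValueError(f"{total} does not divide evenly across {trays} trays")
    return total // trays


gpus_per_tray = per_tray(GPUS_PER_RACK, COMPUTE_TRAYS)
cpus_per_tray = per_tray(CPUS_PER_RACK, COMPUTE_TRAYS)
rack_weight_kg = RACK_WEIGHT_LB * LB_TO_KG

print(f"GPUs per compute tray: {gpus_per_tray}")
print(f"CPUs per compute tray: {cpus_per_tray}")
print(f"Rack weight: about {rack_weight_kg:.0f} kg")
```

Under that assumption each compute tray carries four GPUs and two CPUs, the same ratio as an NVL4 module. The sources reviewed here do not say the two are the same hardware, so treat the match as a ratio, not a part number.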

Confirmed deployments and availability

The June 22 release names three major HPC adoption tracks. LRZ's Blue Lion will use Vera Rubin and second-generation exascale-class HPE Cray supercomputing; NVIDIA says it should deliver about 30x the compute power of LRZ's current system and come online in 2027.1

NERSC's Doudna, a U.S. Department of Energy supercomputer at Lawrence Berkeley National Laboratory, will be a Dell Technologies system powered by Vera Rubin and connected to DOE instruments through ESnet.1 Los Alamos has selected Vera Rubin, Vera CPU, and Quantum-X800 InfiniBand for Mission, Vision, and Veritas systems, with HPE building the machines.1

NVIDIA's May production update gives the supply-chain context: Vera Rubin is ramping through hundreds of ecosystem partners, including 150 in Taiwan, across more than 350 factories and 30 countries. Production shipments are set to begin in the fall, while the June 22 HPC release says NVL4-based systems are expected in Q4 2026.7

Image: Vera Rubin science supercomputer rack. NVIDIA's June ISC image frames Vera Rubin as a rack-scale science system, with the release citing more than 7 exaflops of AI-for-science performance and 5 petaflops of native FP64 support in a rack-scale configuration with up to 144 GPUs.1

Buyer's read

For data-center engineers, the cleanest reading is that Rubin has moved from architecture slide to a named, vendor-backed HPC platform. What NVIDIA is willing to publish are per-GPU throughput figures and rack-level configurations: 288 GB of HBM4, 22 TB/s of HBM bandwidth, 33 TFLOPS of FP64, and 50 PFLOPS of NVFP4 per Rubin GPU, packaged as 4-GPU NVL4 modules for HPC and 72-GPU NVL72 racks for AI factories.3
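
The headline figures from the June 22 release can be reconciled with the per-GPU numbers by straight multiplication. The sketch below assumes the "7 exaflops of AI-for-science performance" figure refers to NVFP4 throughput. That is an inference, since the release as quoted here does not name the precision, and the variable names are illustrative.

```python
GPUS = 144                 # "up to 144 GPUs" in the June 22 release
NVFP4_PFLOPS_PER_GPU = 50  # per-GPU figure from the NVL72 product page
FP64_TFLOPS_PER_GPU = 33   # per-GPU figure from the NVL72 product page

# Unit conversions: 1 exaflop = 1000 petaflops, 1 petaflop = 1000 teraflops.
# Assumption: the headline AI figure is NVFP4 and scales linearly with GPUs.
nvfp4_exaflops = GPUS * NVFP4_PFLOPS_PER_GPU / 1000
fp64_petaflops = GPUS * FP64_TFLOPS_PER_GPU / 1000

print(f"NVFP4: {nvfp4_exaflops:.2f} exaflops")
print(f"FP64:  {fp64_petaflops:.2f} petaflops")
```

The results, 7.2 exaflops and roughly 4.75 petaflops, line up with "more than 7 exaflops" and a rounded "5 petaflops". That suggests the rack-scale headline is peak per-GPU throughput multiplied by GPU count, not a measured application result.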

For procurement teams, the missing fields remain significant. There is no official rack price, no public street pricing, no standalone Rubin PCIe or SXM board detail, and no disclosed power envelope in the sources reviewed here. Treat Q4 2026 NVL4 availability as the launch signal, then wait for Bull, Dell, GIGABYTE, HPE, and Supermicro configurations before comparing total cost, cooling requirements, and cluster topology.