
Rubin GPUs can deliver 50 petaflops of inference in the NVFP4 data format, five times faster than Blackwell, and 35 petaflops of NVFP4 training, 3.5 times faster than Blackwell. HBM4 memory offers 22 TB/s of bandwidth, 2.8 times Blackwell’s, and NVLink bandwidth per GPU is 3.6 TB/s, double Blackwell’s.
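As a rough sanity check on those multipliers, the sketch below backs out the Blackwell figure each one implies. The dictionary keys and the `implied_baseline` helper are hypothetical names chosen for illustration, and the results are derived only from the rounded numbers quoted above, so they should not be read as official Blackwell specifications.

```python
# Each entry: (Rubin figure, quoted multiplier over Blackwell).
# Values are taken from the text above, in the units the text uses.
rubin_vs_blackwell = {
    "nvfp4_inference_pflops": (50.0, 5.0),
    "nvfp4_training_pflops": (35.0, 3.5),
    "hbm4_bandwidth": (22.0, 2.8),
    "nvlink_bandwidth_per_gpu": (3.6, 2.0),
}


def implied_baseline(new_value, multiplier):
    """Back out the older figure implied by 'new = multiplier x old'."""
    return new_value / multiplier


# Implied previous-generation figures (illustrative, not official specs).
implied = {
    name: implied_baseline(new, mult)
    for name, (new, mult) in rubin_vs_blackwell.items()
}

for name, value in implied.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, both compute multipliers point back to roughly 10 petaflops per Blackwell GPU, which suggests the inference and training comparisons share the same baseline.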
Networking is enhanced with the liquid-cooled NVLink 6 Switch, which offers 400G SerDes, 3.6 TB/s of bandwidth per GPU, 28.8 TB/s of total switching bandwidth, and 14.4 teraflops of FP8 in-network compute.
Together, these components give the Vera Rubin NVL72 platform up to 3.6 exaflops of NVFP4 inference, five times faster than the previous-generation platform, and up to 2.5 exaflops of NVFP4 training, 3.5 times higher than the previous generation.
Vera Rubin NVL72 includes 54 TB of LPDDR5x capacity (2.5x Blackwell), 20.7 TB of HBM4 (50% more), 1.6 PB/s of HBM4 bandwidth (a 2.8x increase), and 260 TB/s of scale-up bandwidth (double that of Blackwell NVL72). “That’s more bandwidth than the entire global Internet,” said Harris.
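The rack-level numbers follow from the per-GPU figures if the "72" in NVL72 is read as the GPU count, which is an assumption made here for illustration. The minimal sketch below multiplies the per-GPU values quoted earlier by 72. The names `GPUS_PER_RACK`, `per_gpu`, and `rack_totals` are hypothetical, and bandwidth values are left in the units the article uses.

```python
# Assumption: one NVL72 rack holds 72 Rubin GPUs (inferred from the name).
GPUS_PER_RACK = 72

# Per-GPU figures as quoted in the article.
per_gpu = {
    "nvfp4_inference_pflops": 50.0,
    "nvfp4_training_pflops": 35.0,
    "hbm4_bandwidth": 22.0,
    "nvlink_bandwidth": 3.6,
}


def rack_totals(figures, gpus=GPUS_PER_RACK):
    """Scale each per-GPU figure linearly to a full rack."""
    return {name: value * gpus for name, value in figures.items()}


totals = rack_totals(per_gpu)

# Compute totals are in petaflops; divide by 1,000 to get exaflops.
print(f"inference: {totals['nvfp4_inference_pflops'] / 1000:.2f} exaflops")
print(f"training:  {totals['nvfp4_training_pflops'] / 1000:.2f} exaflops")
print(f"HBM4 bandwidth (rack):   {totals['hbm4_bandwidth']:.0f}")
print(f"NVLink bandwidth (rack): {totals['nvlink_bandwidth']:.1f}")
```

On this reading, the products come to 3.6 exaflops, about 2.52 exaflops, 1,584, and 259.2, which match the quoted 3.6 exaflops, 2.5 exaflops, 1.6 (peta-scale) and 260 once rounded.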
Nvidia also redesigned the rack, announcing its Third-Gen NVL72 Rack Resiliency. Features include a cable-free modular tray design that enables assembly and servicing 18 times faster than the previous generation.
The NVLink Intelligent Resiliency feature supports server maintenance with “zero downtime,” keeping racks operational even during component swaps or partial population. The second-generation RAS Engine allows for GPU diagnostics without taking the rack offline.
