Supermicro announced on January 5, 2026, that it will manufacture and deploy systems based on NVIDIA’s new Vera Rubin NVL72 and HGX Rubin NVL8 platforms, with expanded liquid-cooling infrastructure to support rack-scale AI deployments. The move positions Supermicro as a first-to-market provider for enterprises and hyperscalers adopting NVIDIA’s latest GPU architecture announced at CES 2026.
What’s New
Supermicro is rolling out two flagship products optimized for the NVIDIA Rubin platform. The NVIDIA Vera Rubin NVL72 SuperCluster unifies 72 Rubin GPUs and 36 NVIDIA Vera CPUs into a single rack-scale system delivering 3.6 exaflops of NVFP4 inference performance and 75 TB of fast memory. The compact 2U liquid-cooled NVIDIA HGX Rubin NVL8 system packs 8 GPUs and delivers 400 petaflops of NVFP4 performance for enterprise AI workloads.
The announcement confirms Supermicro has expanded U.S.-based manufacturing capacity specifically for liquid-cooled AI infrastructure. Charles Liang, Supermicro’s CEO, stated the company’s Data Center Building Block Solutions enable faster deployment than competitors through modular design and in-house manufacturing.
NVIDIA officially introduced the Rubin platform at CES 2026 (January 4-6), announcing that the chips have entered full production. Systems are expected to ship in the second half of 2026.
Why It Matters
The Vera Rubin NVL72 represents a 5x jump in FP4 inference speed over previous-generation platforms, enabling enterprises to train massive mixture-of-experts models and run long-context AI workloads. With 1.4 PB/s of HBM4 memory bandwidth and 260 TB/s of NVLink bandwidth per rack, the platform moves more data internally each second than estimates of the entire internet's aggregate traffic.
Liquid cooling becomes essential at this scale. Supermicro’s direct liquid cooling (DLC) technology with in-row Coolant Distribution Units enables warm-water cooling that cuts energy consumption and water usage while maximizing GPU density. This addresses the critical challenge facing hyperscalers: deploying AI compute without expanding power infrastructure.
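A quick back-of-envelope calculation makes the point. The sketch below uses the ~24 kW figure quoted later in this article for a single 2U HGX Rubin NVL8 node, plus an assumed (not vendor-stated) ~20 kW budget for a typical air-cooled rack:

```python
# Rough rack power-density arithmetic. The 24 kW per-node figure comes from the
# HGX Rubin NVL8 spec quoted later in this article; the 42U rack height and the
# ~20 kW air-cooled budget are generic assumptions, not vendor numbers.

NODE_POWER_KW = 24          # ~24 kW per 2U NVL8 system
NODE_HEIGHT_U = 2
RACK_HEIGHT_U = 42          # assumed standard rack
AIR_COOLED_BUDGET_KW = 20   # assumed typical air-cooled rack power/heat budget

nodes_per_rack = RACK_HEIGHT_U // NODE_HEIGHT_U           # 21 nodes if fully populated
full_rack_power_kw = nodes_per_rack * NODE_POWER_KW       # ~504 kW of heat to reject
air_cooled_nodes = AIR_COOLED_BUDGET_KW // NODE_POWER_KW  # 0: not even one node fits

print(f"Fully populated rack: ~{full_rack_power_kw} kW of heat to remove")
print(f"Nodes a ~{AIR_COOLED_BUDGET_KW} kW air-cooled rack could host: {air_cooled_nodes}")
```

Even a single node overruns a typical air-cooled rack budget, which is why liquid cooling with CDU infrastructure is listed as a requirement rather than an option for these deployments.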
For IT buyers, Supermicro’s manufacturing expansion means shorter lead times for Rubin-based systems. The modular DCBBS approach allows configuration flexibility with Intel Xeon or AMD EPYC processors, giving enterprises deployment options without vendor lock-in.
Technical Specifications
NVIDIA Vera Rubin NVL72 SuperCluster:
- 72 NVIDIA Rubin GPUs + 36 NVIDIA Vera CPUs
- 20.7 TB total GPU memory (HBM4)
- 3,600 petaflops NVFP4 inference / 2,520 petaflops NVFP4 training
- 75 TB fast memory capacity
- Built on 3rd-gen NVIDIA MGX rack architecture
- Requires liquid cooling with CDU infrastructure
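Taken together with the 1.4 PB/s HBM4 and 260 TB/s NVLink rack figures quoted earlier, these numbers break down into per-GPU values. The following sketch is simple division over the article's own figures, not an official per-GPU datasheet:

```python
# Per-GPU breakdown of the Vera Rubin NVL72 rack figures quoted in this article.
# Plain arithmetic on the published totals; treat the results as approximations.

GPUS = 72
RACK_NVFP4_INFERENCE_PF = 3600   # petaflops, NVFP4 inference
RACK_NVFP4_TRAINING_PF = 2520    # petaflops, NVFP4 training
RACK_HBM4_TB = 20.7              # total GPU memory (HBM4), TB
RACK_HBM4_BW_PBS = 1.4           # HBM4 bandwidth, PB/s
RACK_NVLINK_BW_TBS = 260         # NVLink bandwidth, TB/s

print(f"Inference per GPU: {RACK_NVFP4_INFERENCE_PF / GPUS:.0f} PFLOPS")   # ~50
print(f"Training per GPU:  {RACK_NVFP4_TRAINING_PF / GPUS:.0f} PFLOPS")    # ~35
print(f"HBM4 per GPU:      {RACK_HBM4_TB * 1000 / GPUS:.1f} GB")           # ~287.5 (288 GB-class)
print(f"HBM4 BW per GPU:   {RACK_HBM4_BW_PBS * 1000 / GPUS:.1f} TB/s")     # ~19.4
print(f"NVLink per GPU:    {RACK_NVLINK_BW_TBS / GPUS:.1f} TB/s")          # ~3.6
```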
NVIDIA HGX Rubin NVL8 System:
- 8 NVIDIA Rubin GPUs in 2U form factor
- 2.3 TB GPU memory / 160 TB/s HBM4 bandwidth
- 400 petaflops NVFP4 inference
- 28.8 TB/s NVLink bandwidth
- ~24 kW system power usage
- Supports x86 CPUs (Intel Xeon / AMD EPYC)
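The same division applied to the 8-GPU system gives per-GPU figures that line up with the NVL72 rack above; again, this is just arithmetic over the quoted totals:

```python
# Per-GPU breakdown of the HGX Rubin NVL8 figures quoted above, for a rough
# consistency check against the NVL72 rack. Arithmetic only, no external specs.

GPUS = 8
GPU_MEMORY_TB = 2.3          # total HBM4 across 8 GPUs
HBM4_BW_TBS = 160            # aggregate HBM4 bandwidth, TB/s
NVFP4_INFERENCE_PF = 400     # petaflops, NVFP4 inference
NVLINK_BW_TBS = 28.8         # aggregate NVLink bandwidth, TB/s

print(f"HBM4 per GPU:      {GPU_MEMORY_TB * 1000 / GPUS:.1f} GB")      # ~287.5, matches NVL72
print(f"HBM4 BW per GPU:   {HBM4_BW_TBS / GPUS:.0f} TB/s")             # 20, in line with NVL72's ~19.4
print(f"Inference per GPU: {NVFP4_INFERENCE_PF / GPUS:.0f} PFLOPS")    # 50, matches NVL72
print(f"NVLink per GPU:    {NVLINK_BW_TBS / GPUS:.1f} TB/s")           # 3.6, matches NVLink 6 below
```

Scaling the 3.6 TB/s per-GPU NVLink figure back up by 72 GPUs gives roughly 259 TB/s, which recovers the 260 TB/s per-rack bandwidth cited earlier.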
Key Platform Features
The NVIDIA Vera Rubin platform introduces sixth-generation NVLink interconnects providing 3.6 TB/s of bandwidth per GPU for seamless GPU-to-GPU and CPU-to-GPU communication. The custom NVIDIA Vera CPU delivers roughly 2x the performance of the previous-generation Grace CPU, with 88 custom Arm cores, 176 threads, and 1.2 TB/s of LPDDR5X memory bandwidth.
NVIDIA’s 3rd-generation Transformer Engine optimizes narrow-precision computations critical for long-context AI workloads. Rack-scale confidential computing provides GPU-level trusted execution environments that isolate models, data, and prompts.
The platform integrates NVIDIA ConnectX-9 SuperNICs and BlueField-4 DPUs for networking, with optional NVIDIA Spectrum-X Ethernet Photonics delivering 5x the power efficiency and 10x the reliability of traditional optics.
What’s Next
NVIDIA confirmed Vera Rubin chips are in full production as of January 2026, with systems expected to ship in H2 2026. Supermicro has not disclosed specific pricing or order volumes.
Major cloud providers including Microsoft, Amazon, and Meta are reportedly investing billions to secure Rubin-based systems. Supermicro’s expanded manufacturing aims to meet this demand with faster fulfillment.
Enterprises currently running Blackwell or Hopper systems face a decision: upgrade to Rubin for 5x inference gains, or wait for deployment case studies. The liquid-cooling requirement represents a significant infrastructure investment beyond GPU costs.
Supermicro will offer petascale all-flash storage systems with NVIDIA BlueField-4 DPUs as complementary solutions. Further technical documentation is available at supermicro.com/en/accelerators/nvidia/vera-rubin.
Frequently Asked Questions
What is NVIDIA Vera Rubin NVL72?
The Vera Rubin NVL72 is a rack-scale AI supercomputer combining 72 NVIDIA Rubin GPUs and 36 Vera CPUs into a unified system with 3.6 exaflops of FP4 inference performance. It delivers 75 TB of fast memory and requires liquid cooling infrastructure.
When will Supermicro Rubin systems be available?
NVIDIA announced Rubin chips entered full production in January 2026. Supermicro systems based on the platform are expected to ship in the second half of 2026, though exact availability dates have not been confirmed.
Why is liquid cooling required for Vera Rubin systems?
Vera Rubin NVL72 racks draw far more power, and therefore shed far more heat, than conventional air cooling can remove. Supermicro's direct liquid cooling technology enables warm-water operation that reduces energy consumption, minimizes water usage, and maximizes GPU density per rack.
How does Vera Rubin compare to previous NVIDIA platforms?
Vera Rubin delivers 5x faster FP4 inference and 3.5x higher FP4 training performance versus previous-generation platforms. It features 2x CPU performance, 3x more memory capacity, and 2x GPU-to-CPU bandwidth compared to prior architectures.

