
NVIDIA Announces Rubin Platform at CES 2026 with 10x Inference Cost Reduction


NVIDIA introduced its Rubin platform at CES 2026 in Las Vegas, marking the next generation of AI computing infrastructure. The platform unites six new chips: the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. Together they are designed to deliver up to a 10x reduction in inference token costs compared to previous generations. DGX SuperPOD configurations built on Rubin-based systems will ship in the second half of 2026.

What’s New in the Rubin Platform

The Rubin platform debuts five major technology upgrades built specifically for agentic AI, mixture-of-experts models, and long-context reasoning. Sixth-generation NVIDIA NVLink delivers 3.6TB/s per GPU and 260TB/s per Vera Rubin NVL72 rack, enabling massive parallel workloads without model partitioning. The NVIDIA Vera CPU features 88 custom Olympus cores with full Armv9.2 compatibility and ultrafast NVLink-C2C connectivity.

The Rubin GPU provides 50 petaflops of NVFP4 compute with a third-generation Transformer Engine featuring hardware-accelerated compression. Each DGX Rubin NVL8 system delivers 5.5x the NVFP4 FLOPS of NVIDIA Blackwell systems. CEO Jensen Huang stated during his CES 2026 keynote that Rubin arrives “at exactly the right moment, as AI computing demand for both training and inference is going through the roof”.
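The headline figures above are internally consistent, as a quick sanity check shows. The per-GPU numbers come from the announcement; the per-system Blackwell baseline is derived from the 5.5x claim, not stated by NVIDIA:

```python
# Per-GPU figures from the announcement
nvlink_per_gpu_tbps = 3.6       # TB/s, sixth-generation NVLink
nvfp4_per_gpu_pflops = 50       # petaflops of NVFP4 compute

# A Vera Rubin NVL72 rack puts 72 GPUs on one NVLink domain
rack_bandwidth_tbps = 72 * nvlink_per_gpu_tbps   # 259.2, matching the quoted ~260 TB/s

# A DGX Rubin NVL8 system holds 8 Rubin GPUs
nvl8_pflops = 8 * nvfp4_per_gpu_pflops           # 400 petaflops NVFP4 per system

# The claimed 5.5x over Blackwell implies a baseline of roughly
# 400 / 5.5 ≈ 73 petaflops per 8-GPU Blackwell system (derived, not stated)
implied_blackwell_pflops = nvl8_pflops / 5.5

print(round(rack_bandwidth_tbps, 1), nvl8_pflops, round(implied_blackwell_pflops, 1))
```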

Why the 10x Cost Reduction Matters

The 10x reduction in inference token generation cost addresses the growing expense of deploying large language models and AI agents at scale. As AI models expand in size, context length, and reasoning depth, inference costs have become a critical bottleneck for enterprise adoption.

This economic shift makes real-time AI coding assistants, million-token video processing, and enterprise-scale agentic AI financially viable for organizations of all sizes. The improved cost efficiency enables companies to process significantly more AI inference workloads without proportional infrastructure investment increases.

DGX SuperPOD Configurations

NVIDIA offers two Rubin-based DGX SuperPOD deployment options:

DGX Vera Rubin NVL72

  • 8 rack systems with 576 Rubin GPUs total
  • 28.8 exaflops of FP4 performance
  • 600TB of fast memory
  • 36 Vera CPUs, 72 Rubin GPUs, and 18 BlueField-4 DPUs per rack
  • Unified memory space across entire rack

DGX Rubin NVL8

  • 64 systems with 512 Rubin GPUs total
  • Liquid-cooled form factor with x86 CPUs
  • 8 Rubin GPUs per system with sixth-gen NVLink
  • Designed as an efficient on-ramp for existing AI projects
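The totals for both configurations follow directly from the per-rack and per-system counts listed above (GPU and petaflop figures from the announcement):

```python
# DGX Vera Rubin NVL72: 8 racks of 72 Rubin GPUs each
nvl72_gpus = 8 * 72                       # 576 GPUs, as quoted
nvl72_exaflops = nvl72_gpus * 50 / 1000   # 50 PF NVFP4 per GPU -> 28.8 exaflops FP4

# DGX Rubin NVL8: 64 systems of 8 Rubin GPUs each
nvl8_gpus = 64 * 8                        # 512 GPUs, as quoted

print(nvl72_gpus, round(nvl72_exaflops, 1), nvl8_gpus)
```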

Both configurations integrate BlueField-4 DPUs for secure infrastructure, NVIDIA Mission Control for automated orchestration, and support 800Gb/s networking via Quantum-X800 InfiniBand or Spectrum-X Ethernet.

What Comes Next

NVIDIA DGX SuperPOD systems built on the Rubin platform will become available in the second half of 2026. CEO Jensen Huang confirmed at CES 2026 that the next-generation chips are “in full production,” signaling an aggressive rollout timeline.

Third-generation NVIDIA Confidential Computing will debut on Vera Rubin NVL72 as the first rack-scale platform maintaining data security across CPU, GPU, and NVLink domains. NVIDIA Mission Control software, currently available for Blackwell systems, will extend support to Rubin-based DGX systems for enterprise infrastructure automation.

The platform’s second-generation RAS Engine enables real-time health monitoring, fault tolerance, and 3x faster servicing through modular cable-free trays. Partners will begin rolling out Rubin-based products and services throughout the latter half of 2026.

Frequently Asked Questions

What is the NVIDIA Rubin platform?

The NVIDIA Rubin platform is a next-generation AI computing architecture comprising six chips: Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch, designed to reduce inference costs by 10x while accelerating agentic AI and long-context models.

When will DGX SuperPOD with Rubin be available?

NVIDIA DGX SuperPOD systems powered by Rubin architecture will ship in the second half of 2026, with CEO Jensen Huang confirming the chips are already in full production as of January 2026.

How does Rubin compare to Blackwell performance?

Each DGX Rubin NVL8 system delivers 5.5x the NVFP4 FLOPS of NVIDIA Blackwell systems, with inference running up to five times faster, roughly triple the overall throughput, and significantly improved energy efficiency per watt.

What is DGX Vera Rubin NVL72?

DGX Vera Rubin NVL72 is a rack-scale AI system combining 36 Vera CPUs, 72 Rubin GPUs, and 18 BlueField-4 DPUs with 260TB/s of NVLink throughput, enabling unified memory and compute space across the entire rack without model partitioning.

Mohammad Kashif
Senior Technology Analyst and Writer at AdwaitX, specializing in the convergence of mobile silicon, generative AI, and consumer hardware. Moving beyond spec sheets, his reviews rigorously test “real-world” metrics, analyzing sustained battery efficiency, camera sensor behavior, and long-term software support lifecycles. Kashif’s data-driven approach helps enthusiasts and professionals distinguish between genuine innovation and marketing hype, ensuring they invest in devices that offer lasting value.
