One unit. Maximum scaling.
A rack as a supercomputer.
With the NVL72 platform, NVIDIA delivers a complete rack-scale solution for training, inference and AI reasoning. The systems pair state-of-the-art Grace CPUs with Blackwell GPUs for maximum computing power at high efficiency. The NVL72 combines 72 GPUs and 36 Grace CPUs into a single computing domain; thanks to NVLink and liquid cooling, the entire rack operates as a single, high-performance GPU. Even models with trillions of parameters can therefore be trained and served in real time.

Suitable for every AI workload.
From LLM training to the AI Factory.
GB200 NVL72: For training and inference of large language models, optimised for established workloads with maximum energy efficiency.
GB300 NVL72: For AI factories and reasoning, with 40 TB fast memory and higher throughput per megawatt. Ideal for companies with the highest demands on scaling and speed.
Which NVL72 suits your project?
Comparison of the two platforms at a glance.
| Feature | GB300 NVL72¹ | GB200 NVL72 |
|---|---|---|
| Configuration | 36 Grace CPUs, 72 Blackwell Ultra GPUs | 36 Grace CPUs, 72 Blackwell GPUs |
| FP4 Tensor Core | 1,400 \| 1,100² PFLOPS | 1,440 PFLOPS |
| FP8/FP6 Tensor Core | 720 PFLOPS | 720 PFLOPS |
| INT8 Tensor Core | 23 POPS | 720 POPS |
| FP16/BF16 Tensor Core | 360 PFLOPS | 360 PFLOPS |
| FP32 Tensor Core | 6 PFLOPS | 5,760 TFLOPS |
| FP64 Tensor Core | 100 TFLOPS | 2,880 TFLOPS |
| GPU memory \| bandwidth | Up to 21 TB \| up to 576 TB/s | Up to 13.5 TB HBM3e \| 576 TB/s |
| NVLink bandwidth | 130 TB/s | 130 TB/s |
| CPU cores | 2,592 Arm® Neoverse V2 cores | 2,592 Arm® Neoverse V2 cores |
| CPU memory \| bandwidth | Up to 18 TB SOCAMM with LPDDR5X \| up to 14.3 TB/s | Up to 17 TB LPDDR5X \| up to 18.4 TB/s |
1. Preliminary technical data; subject to change without notice. All Tensor Core figures are with sparsity unless otherwise stated.
2. Without sparsity.
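The rack-level figures in the comparison table can be cross-checked against the per-GPU values quoted later on this page. The following Python sketch is purely illustrative (the dictionary names and structure are our own, not NVIDIA tooling):

```python
# Illustrative sanity check on rack-level figures from the comparison table
# above (sparsity values). One NVL72 domain contains 72 GPUs.
GPUS_PER_RACK = 72

gb200 = {"fp4_pflops": 1440, "hbm_tb": 13.5}  # Blackwell
gb300 = {"fp4_pflops": 1400, "hbm_tb": 21.0}  # Blackwell Ultra

def per_gpu(rack_total: float) -> float:
    """Share of a rack-level total carried by one of the 72 GPUs."""
    return rack_total / GPUS_PER_RACK

# GB200: 1,440 PFLOPS FP4 across 72 GPUs -> 20 PFLOPS per Blackwell GPU.
print(per_gpu(gb200["fp4_pflops"]))            # 20.0
# GB300: 21 TB HBM3e across 72 GPUs -> ~292 GB, consistent with the
# 288 GB per-GPU figure quoted further down, rounded up at rack level.
print(round(per_gpu(gb300["hbm_tb"]) * 1000))  # 292
```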

From rack to cluster:
NVIDIA SuperPOD.
NVL72 as a building block for AI supercomputers.
Several NVL72 racks can be combined into an NVIDIA SuperPOD via NVIDIA Quantum InfiniBand or Spectrum-X Ethernet - a complete AI infrastructure at data centre level. The result is scalable systems that deliver exaFLOP-class performance for research and industry.
Technological breakthroughs with NVIDIA Blackwell
AI Reasoning & Inference
Test-time scaling and AI reasoning sharply increase compute requirements while quality and throughput must be maintained. The Tensor Cores of the NVIDIA Blackwell GPUs - in the GB300 NVL72 with Blackwell Ultra - deliver up to twice the attention-layer acceleration and up to 1.5× more AI FLOPS compared with the previous generation.
Up to 288 GB HBM3e
Larger memory capacity enables larger batch sizes and higher throughput. The Blackwell GPUs in the GB200 and GB300 NVL72 use HBM3e - in the Ultra version with up to 288 GB per GPU for particularly long context lengths and maximum efficiency.
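The 288 GB per-GPU figure lines up with the rack total quoted in the comparison table; a two-line check (illustrative, using only constants from this page):

```python
# GB300 NVL72: per-GPU HBM3e capacity (Blackwell Ultra) times GPU count.
HBM_PER_GPU_GB = 288
GPUS = 72

total_tb = HBM_PER_GPU_GB * GPUS / 1000  # 20,736 GB ~= 20.7 TB
print(total_tb)  # 20.736 -> quoted as "up to 21 TB" in the table
```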
NVIDIA Blackwell architecture
The NVIDIA Blackwell architecture sets new standards in accelerated computing. It enables a new era of performance, efficiency and scaling - the foundation of both NVL72 platforms.
NVIDIA ConnectX-8 SuperNIC
With the NVIDIA ConnectX-8 SuperNIC, an I/O module with 800 Gb/s per GPU is available for the GB300 NVL72. Together with NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet, the NVL72 achieves top values in Remote Direct Memory Access (RDMA) and workload efficiency.
NVIDIA Grace CPU
The NVIDIA Grace CPU is the link between GPUs and memory in both NVL72 platforms. It combines outstanding performance and high memory bandwidth with twice the energy efficiency of conventional server processors.
Fifth generation NVIDIA NVLink
Seamless GPU-to-GPU communication is crucial for accelerated computing. With fifth-generation NVLink, the NVL72 achieves up to 1.8 TB/s of bandwidth per GPU and joins all 72 GPUs into a common computing domain.
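The per-GPU NVLink figure and the rack-level NVLink bandwidth in the comparison table are consistent, as a quick illustrative check shows:

```python
# Fifth-generation NVLink: per-GPU bandwidth times GPU count gives the
# rack-level aggregate quoted in the comparison table.
NVLINK_PER_GPU_TBS = 1.8
GPUS = 72

print(round(NVLINK_PER_GPU_TBS * GPUS, 1))  # 129.6 -> quoted as ~130 TB/s
```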
Get in touch now!
Rely on the best AI infrastructure on the market
With the NVIDIA GB200 NVL72 & GB300 NVL72, you get the technology that the world's leading companies and research institutions are already using successfully. Let's work together to realise your AI vision and prepare your company for the future.
Contact sysGen today and find out how NVIDIA GB200 NVL72 & GB300 NVL72 can revolutionise your AI strategy. Together we can put your AI projects in the fast lane!
Frequently asked questions about NVL72.
- Who is the GB200 NVL72 particularly suitable for?
For organisations that want to train and deploy LLMs efficiently, with stable performance and high energy efficiency.
- When is the GB300 NVL72 worthwhile?
For AI factories and reasoning scenarios with maximum throughput requirements.
- Can the NVL72 be extended beyond a single rack?
Yes - for example, by connecting several racks into an NVIDIA SuperPOD.
- Will GB200 be obsolete in future?
No, GB200 remains a powerful solution for many workloads. GB300 expands the portfolio with higher capacity and speed.
Discover suitable solutions
NVIDIA SuperPOD
From a single NVL72 rack to a complete AI supercomputer.
NVIDIA DGX & HGX systems
AI computing power in a compact format - as a building block or entry into the NVIDIA infrastructure.
NVIDIA AI Enterprise
The end-to-end software platform for developing and deploying production AI.