
A rack as a supercomputer.
With the NVL72 platform, NVIDIA delivers a complete rack-scale solution for training, inference and AI reasoning. The systems combine state-of-the-art Grace CPUs with Blackwell GPUs to deliver leading computing power at maximum efficiency. The NVL72 unites 72 GPUs and 36 Grace CPUs in a single computing domain: thanks to NVLink and liquid cooling, the entire rack operates as one large, high-performance GPU. Even models with trillions of parameters can therefore be trained and served in real time.
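As an illustration of what "a single computing domain" means for software (a hedged sketch, not official NVL72 tooling): the CUDA program below enumerates the GPUs visible to one process and checks whether direct peer-to-peer access - the path NVLink accelerates - is enabled between each pair. How many of a rack's 72 GPUs one process sees depends on the node and fabric configuration.

```cpp
// Hedged sketch: enumerate visible GPUs and probe peer-to-peer access.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    printf("Visible GPUs: %d\n", n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int ok = 0;
            cudaDeviceCanAccessPeer(&ok, i, j);  // 1 if direct access works
            if (ok) printf("GPU %d -> GPU %d: direct (NVLink/P2P) access\n", i, j);
        }
    }
    return 0;
}
```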
NVIDIA GPU, CPU, networking and AI software technologies
AI Reasoning & Inference
Test-time scaling and AI reasoning increase the compute required to maintain quality and throughput. The Tensor Cores of the NVIDIA Blackwell GPUs - Blackwell Ultra in the GB300 NVL72 - deliver up to twice the attention-layer acceleration and up to 1.5× more AI FLOPS than the previous generation.
NVIDIA ConnectX-8 SuperNIC
The NVIDIA ConnectX-8 SuperNIC gives the GB300 NVL72 an I/O module with 800 Gb/s of network bandwidth per GPU - an aggregate of 57.6 Tb/s of scale-out bandwidth across the rack's 72 GPUs. Together with NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet, the NVL72 achieves top marks in Remote Direct Memory Access (RDMA) performance and workload efficiency.
NVIDIA Grace CPU
The NVIDIA Grace CPU is the link between GPUs and memory in both NVL72 platforms. It combines outstanding performance and high memory bandwidth with twice the energy efficiency of conventional server processors.
Fifth-generation NVIDIA NVLink
Seamless GPU-to-GPU communication is crucial for accelerated computing. With fifth-generation NVLink, the NVL72 achieves up to 1.8 TB/s of bandwidth per GPU and connects all 72 GPUs into a common computing domain.
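A minimal sketch of what that per-GPU figure means in practice, assuming a node with at least two peer-capable GPUs (device IDs 0 and 1 are illustrative): the program times a peer-to-peer copy and prints the achieved rate. On NVLink-connected GPUs the result approaches the link bandwidth; over PCIe it is far lower.

```cpp
// Hedged sketch: estimate GPU-to-GPU bandwidth with a timed peer copy.
// Assumes >= 2 GPUs with peer access; device IDs 0 and 1 are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = size_t(1) << 30;   // 1 GiB payload
    void *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);       // allow GPU 0 -> GPU 1 access
    cudaMalloc(&src, bytes);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);
    cudaMalloc(&dst, bytes);

    cudaSetDevice(0);
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    cudaMemcpyPeer(dst, 1, src, 0, bytes);  // copy GPU 0 -> GPU 1
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("Peer copy: %.1f GB/s\n", (bytes / 1e9) / (ms / 1e3));
    return 0;
}
```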
Suitable for every AI workload.
From LLM training to the AI Factory.
GB200 NVL72: For training and inference of large language models, optimised for established workloads with maximum energy efficiency.
GB300 NVL72: For AI factories and reasoning, with 40 TB of fast memory and higher throughput per megawatt. Ideal for companies with the highest demands on scaling and speed.

Which NVL72 suits your project?
Comparison of the two platforms at a glance.
| Feature | GB300 NVL72¹ | GB200 NVL72 |
| --- | --- | --- |
| Configuration | 36 Grace CPUs : 72 Blackwell Ultra GPUs | 36 Grace CPUs : 72 Blackwell GPUs |
| FP4 Tensor Core | 1,400 \| 1,100² PFLOPS | 1,440 PFLOPS |
| FP8/FP6 Tensor Core | 720 PFLOPS | 720 PFLOPS |
| INT8 Tensor Core | 23 POPS | 720 POPS |
| FP16/BF16 Tensor Core | 360 PFLOPS | 360 PFLOPS |
| FP32 Tensor Core | 6 PFLOPS | 5,760 TFLOPS |
| FP64 Tensor Core | 100 TFLOPS | 2,880 TFLOPS |
| GPU memory | Up to 21 TB | Up to 13.5 TB HBM3e |
| GPU memory bandwidth | Up to 576 TB/s | 576 TB/s |
| NVLink bandwidth | 130 TB/s | 130 TB/s |
| CPU core count | 2,592 Arm® Neoverse V2 cores | 2,592 Arm® Neoverse V2 cores |
| CPU memory | Up to 18 TB SOCAMM with LPDDR5X | Up to 17 TB LPDDR5X |
| CPU memory bandwidth | Up to 14.3 TB/s | Up to 18.4 TB/s |
1. Preliminary technical data; subject to change without notice. All Tensor Core figures assume sparsity unless otherwise stated.
2. Without sparsity.
From rack to cluster:
NVIDIA SuperPOD.
NVL72 as a building block for AI supercomputers.
Multiple NVL72 racks can be combined into an NVIDIA SuperPOD - each rack forms one NVLink domain, and racks are linked via NVIDIA Quantum InfiniBand or Spectrum-X Ethernet - a complete AI infrastructure at data centre level. The result is scalable systems that deliver exaFLOP-class performance for research and industry.
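As a sketch of how such a cluster is driven from software (a simplified, single-process assumption; production SuperPOD jobs run one process per GPU across many nodes, typically launched via MPI or Slurm): NCCL, NVIDIA's communications library, performs the same all-reduce collective whether traffic flows over NVLink inside a rack or over InfiniBand/Ethernet between racks.

```cpp
// Hedged sketch: one NCCL all-reduce across all GPUs visible to one process.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <nccl.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    std::vector<int> devs(n);
    for (int i = 0; i < n; ++i) devs[i] = i;
    std::vector<ncclComm_t> comms(n);
    ncclCommInitAll(comms.data(), n, devs.data());  // one communicator per GPU

    const size_t count = 1 << 20;                   // 1M floats per GPU
    std::vector<float*> buf(n);
    std::vector<cudaStream_t> stream(n);
    for (int i = 0; i < n; ++i) {
        cudaSetDevice(i);
        cudaMalloc(&buf[i], count * sizeof(float));
        cudaStreamCreate(&stream[i]);
    }

    // The collective call is identical regardless of the underlying fabric.
    ncclGroupStart();
    for (int i = 0; i < n; ++i)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum,
                      comms[i], stream[i]);
    ncclGroupEnd();

    for (int i = 0; i < n; ++i) { cudaSetDevice(i); cudaStreamSynchronize(stream[i]); }
    for (int i = 0; i < n; ++i) ncclCommDestroy(comms[i]);
    printf("All-reduce complete on %d GPUs\n", n);
    return 0;
}
```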
Discover suitable solutions
NVIDIA SuperPOD
From a single NVL72 rack to a complete AI supercomputer.
NVIDIA DGX & HGX systems
AI computing power in a compact format - as a building block for, or an entry point into, the NVIDIA infrastructure.
NVIDIA AI Enterprise
The enterprise-grade software platform for developing and deploying production AI.