Designed for the performance and efficiency required for modern AI data centers

The ever-growing complexity and requirements of AI models have made accelerated computing and energy efficiency in the data center more important than ever. In this context, the NVIDIA Grace™ CPU marks a milestone. As a groundbreaking Arm® CPU, it sets new standards by delivering performance and efficiency without compromise. The Grace CPU is also remarkably flexible: it can be tightly coupled with a GPU to accelerate computing, or stand alone as a powerful and efficient CPU.

The versatility of the NVIDIA Grace CPU spans multiple configurations to meet the diverse needs of today's data centers. From high-performance servers to compute-intensive applications, it provides a solid foundation for next-generation data centers. With its ability to adapt to different deployment scenarios, the Grace CPU enables optimal utilization of resources, significantly increasing efficiency and improving performance.

Take a look at the Grace lineup


NVIDIA GB200 NVL72

The NVIDIA GB200 Grace Blackwell Superchip combines two NVIDIA Blackwell Tensor Core GPUs with a Grace CPU and can be scaled up to the GB200 NVL72, a massive 72-GPU system connected via NVIDIA® NVLink®, to deliver 30x faster real-time inference for large language models.



NVIDIA Grace CPU Superchip

The NVIDIA Grace CPU Superchip uses NVLink®-C2C technology to deliver 144 Arm® Neoverse V2 cores and up to 1 TB/s of memory bandwidth.



NVIDIA Grace Hopper Superchip

The NVIDIA Grace Hopper™ Superchip combines the Grace and Hopper architectures using NVIDIA® NVLink®-C2C to provide a coherent CPU and GPU memory model for accelerated AI and high-performance computing (HPC) applications.



Discover Grace reference designs for modern data center workloads

The complexity and size of AI models are increasing rapidly. These models enhance deep recommender systems that hold tens of terabytes of data, improve conversational AI with hundreds of billions of parameters, and enable new scientific discoveries. Scaling such massive models requires new architectures with fast access to a large pool of memory and tight coupling of CPU and GPU. The NVIDIA Grace™ CPU delivers the high performance, energy efficiency, and high-bandwidth connectivity needed for these workloads, and it can be deployed in multiple configurations to meet different data center requirements.

System designs for digital twins, artificial intelligence, and high-performance computing.

NVIDIA OVX™

For digital twins and NVIDIA Omniverse™.
NVIDIA Grace CPU Superchip
NVIDIA GPUs
NVIDIA BlueField®-3

NVIDIA HGX™

For HPC.
NVIDIA Grace CPU Superchip
NVIDIA BlueField-3
OEM-defined input/output (IO)

NVIDIA HGX

For AI training, inference, and HPC.
NVIDIA Grace Hopper Superchip CPU + GPU
NVIDIA BlueField-3
OEM-defined IO / fourth-generation NVLink

Find out more about the latest technical innovations

Accelerating CPU-to-GPU connections with NVLink-C2C

Solving the biggest AI and HPC problems requires high-capacity, high-bandwidth memory (HBM). NVIDIA NVLink-C2C provides 900 gigabytes per second (GB/s) of bidirectional bandwidth between the NVIDIA Grace CPU and NVIDIA GPUs. The link provides a unified, cache-coherent memory address space that combines system memory and GPU HBM for simplified programmability. This coherent, high-bandwidth connection between CPU and GPU is the key to accelerating tomorrow's most complex problems.
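To illustrate what the coherent address space means in practice, here is a minimal sketch (not NVIDIA sample code) of a CUDA program in which a GPU kernel directly dereferences memory allocated with plain malloc on the CPU, with no explicit cudaMalloc or cudaMemcpy. This pattern assumes a system with hardware CPU-GPU memory coherence, such as a Grace Hopper Superchip:

```cuda
#include <cstdio>
#include <cstdlib>

// Sketch: on a cache-coherent NVLink-C2C system (e.g. Grace Hopper),
// GPU kernels can read and write system-allocated (LPDDR5X) memory directly.
__global__ void scale(double* data, int n, double factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;

    // Plain malloc -- no cudaMalloc/cudaMemcpy staging is required
    // when CPU and GPU share one coherent address space.
    double* data = static_cast<double*>(malloc(n * sizeof(double)));
    for (int i = 0; i < n; ++i) data[i] = 1.0;

    // The GPU accesses CPU memory over NVLink-C2C; the hardware keeps
    // caches coherent, so no explicit synchronization of data is needed.
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0);
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);
    free(data);
    return 0;
}
```

On systems without coherent CPU-GPU memory, the same code would need cudaMallocManaged (or explicit device allocations and copies) instead of malloc; the coherent link is what removes that staging step.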

High-bandwidth CPU memory with LPDDR5X

NVIDIA Grace is the first server CPU to use LPDDR5X memory with server-class reliability, through mechanisms such as error-correcting code (ECC), to meet the needs of the data center, while delivering 2x the memory bandwidth and up to 10x better energy efficiency compared with currently available server memory. The NVIDIA Grace CPU integrates Arm® Neoverse V2 cores into an NVIDIA-designed Scalable Coherency Fabric to deliver high performance in a power-efficient design.

More performance and efficiency with Arm Neoverse V2 cores

Even though the parallel computing capabilities of GPUs continue to advance, workloads can still be limited by serial tasks running on the CPU. A fast and efficient CPU is a critical component of the system design to enable optimal workload acceleration. The NVIDIA Grace CPU integrates Arm Neoverse V2 cores to deliver high performance in a low-power system, making work easier for scientists and researchers.

More generative AI with HBM3 and HBM3e GPU memory

Generative AI is memory- and compute-intensive. The NVIDIA GB200 Superchip uses 380 GB of HBM memory, providing more than 4.5x the GPU memory bandwidth of the NVIDIA H100 Tensor Core GPU. Grace Blackwell's high-bandwidth GPU memory is combined with the CPU's memory via NVLink-C2C to give the GPU nearly 800 GB of fast-access memory. This delivers the memory capacity and bandwidth required for the world's most complex generative AI and accelerated computing workloads.