NVIDIA DGX-2, the fastest path to AI scale on a whole new level

The DGX-2 supercomputer may become a survival factor for enterprises


Today’s businesses need to scale out AI without scaling up cost or complexity

  • Powered by DGX software

  • Accelerated AI-at-scale deployment and effortless operations

  • Unrestricted model parallelism and faster time-to-solution


The DGX-2 builds on its predecessor, the DGX-1, in several respects, but adds the new, revolutionary NVSwitch, which provides 300 GB/s of chip-to-chip communication, 12 times the speed of PCIe. With NVLink 2, sixteen GPUs with a total bandwidth of 14 TB/s can be combined to work as a single GPU.

With two Xeon CPUs, 1.5 TB of RAM and 30 TB of NVMe storage, the DGX-2 consumes ~10 kW, weighs a little less than 159 kg and offers a computing power of ~2 PFLOPS per system when using the tensor cores.
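
As a rough cross-check of these figures, the following sketch derives the tensor throughput per GPU and the compute per watt using only the numbers quoted above. It assumes that ~2 PFLOPS and ~10 kW are peak values and that the throughput is spread evenly over the 16 GPUs; the variable names are illustrative only.

```python
# Figures quoted in the text above (approximate peak values).
SYSTEM_PFLOPS = 2.0      # ~2 PFLOPS when using the tensor cores
NUM_GPUS = 16
SYSTEM_POWER_KW = 10.0   # ~10 kW system power

# Assumption: throughput is divided evenly across all GPUs.
# 1 PFLOPS = 1000 TFLOPS.
tflops_per_gpu = SYSTEM_PFLOPS * 1000 / NUM_GPUS

# 1 PFLOPS = 1e6 GFLOPS, 1 kW = 1e3 W.
gflops_per_watt = (SYSTEM_PFLOPS * 1e6) / (SYSTEM_POWER_KW * 1e3)

print(f"{tflops_per_gpu:.0f} TFLOPS per GPU")
print(f"{gflops_per_watt:.0f} GFLOPS per watt")
```

Under these assumptions each GPU contributes about 125 TFLOPS, and the system delivers about 200 GFLOPS per watt at peak.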

Comparison of both servers (source: NVIDIA):



What makes the DGX-2 a unique new system, going far beyond previous systems?

Unbeatable Compute Power for Unprecedented Training


AI is getting increasingly complex and demands unprecedented levels of compute power. NVIDIA® DGX-2 packs the power of 16 of the world’s most advanced GPUs to accelerate new AI model types that were previously untrainable. Plus, it enables groundbreaking GPU scalability, so you can train 4X bigger models on a single node with 10X the performance of an 8-GPU system.


DGX-2 UNIFIED MEMORY




Unified Memory provides:

  • Single memory view shared by all GPUs

  • Automatic migration of data between GPUs

  • User control of data locality


DGX-2 NVME SSD STORAGE

Rapidly ingest the largest datasets into cache:

  • Faster than SATA SSD, optimized for transferring huge datasets

  • Dramatically larger user scratch space

  • The protocol of choice for next-gen storage technologies

  • 8x 3.84TB NVMe in RAID0 (Data)

  • 25.5 GB/sec Sequential Read bandwidth (vs. 2 GB/sec for 7TB of SAS SSDs on DGX-1)
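
To put the storage figures above into perspective, here is a small back-of-the-envelope sketch. It computes the raw RAID 0 capacity and compares how long a sequential read of a dataset would take at the two quoted bandwidths. The 1 TB dataset size is a hypothetical example value, and the calculation ignores file system and protocol overhead.

```python
# Figures from the list above.
NUM_DRIVES = 8
DRIVE_TB = 3.84            # capacity per NVMe drive in TB
DGX2_READ_GB_PER_S = 25.5  # sequential read, DGX-2 NVMe RAID 0
DGX1_READ_GB_PER_S = 2.0   # sequential read, DGX-1 SAS SSDs

def raid0_capacity_tb(num_drives, drive_tb):
    """RAID 0 stripes data without redundancy, so capacities add up."""
    return num_drives * drive_tb

def read_time_seconds(dataset_gb, bandwidth_gb_per_s):
    """Ideal sequential read time, ignoring any overhead."""
    return dataset_gb / bandwidth_gb_per_s

capacity_tb = raid0_capacity_tb(NUM_DRIVES, DRIVE_TB)

DATASET_GB = 1000.0  # hypothetical 1 TB dataset, not from the source
dgx2_s = read_time_seconds(DATASET_GB, DGX2_READ_GB_PER_S)
dgx1_s = read_time_seconds(DATASET_GB, DGX1_READ_GB_PER_S)
speedup = DGX2_READ_GB_PER_S / DGX1_READ_GB_PER_S

print(f"Raw capacity: {capacity_tb:.2f} TB")
print(f"DGX-2: {dgx2_s:.1f} s, DGX-1: {dgx1_s:.1f} s, speedup: {speedup:.2f}x")
```

The eight drives add up to 30.72 TB, which matches the ~30 TB quoted for the system, and the bandwidth ratio works out to roughly 12.75x.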


DGX-2 LATEST GENERATION CPU AND 1.5TB SYSTEM MEMORY

Faster, more resilient boot and storage management:

  • More system memory to handle larger DL and HPC applications

  • 2x Intel Xeon Platinum 8168 (Skylake), 2.7 GHz, 24 cores each

  • 24x 64GB DIMM System Memory


DGX-2 THE ULTIMATE IN NETWORKING FLEXIBILITY

Grow your DL cluster effortlessly, using the connectivity you prefer:

  • Support for RDMA over Converged Ethernet (RoCE)

  • 8x EDR InfiniBand / 100 GigE

  • 1600 Gb/sec total bidirectional bandwidth with low latency

  • Also supports Ethernet mode: dual 10/25 Gb/sec
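
The total bandwidth figure in the list above follows directly from the port count. The short sketch below reproduces it, assuming that each of the 8 ports carries 100 Gb/sec in each direction and that protocol overhead is ignored; the variable names are illustrative only.

```python
# Figures from the list above.
NUM_PORTS = 8
PORT_GBIT_PER_S = 100   # EDR InfiniBand / 100 GigE, per direction
DIRECTIONS = 2          # bidirectional: send and receive

# Aggregate bandwidth over all ports and both directions, in Gb/sec.
total_gbit_per_s = NUM_PORTS * PORT_GBIT_PER_S * DIRECTIONS

# 8 bits per byte: convert to GB/sec for comparison with storage figures.
total_gbyte_per_s = total_gbit_per_s / 8

print(f"{total_gbit_per_s} Gb/sec = {total_gbyte_per_s:.0f} GB/sec")
```

This gives 1600 Gb/sec, or about 200 GB/sec, across all ports in both directions.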


DGX-2 PART OVERVIEW




DGX Family COMMON SOFTWARE STACK ACROSS DGX FAMILY

One software stack for the whole family:

  • Single, unified stack for deep learning frameworks

  • Predictable execution across platforms

  • Pervasive reach


DGX Family NVIDIA GPU CLOUD



Optimized Stacks for Every Cloud