UNIFIED NETWORK AND COMPUTE ACCELERATION

Experience the unparalleled performance of converged acceleration. NVIDIA H100 CNX combines the performance of the NVIDIA H100 Tensor Core GPU with the advanced networking capabilities of the NVIDIA® ConnectX®-7 Smart Network Interface Card (SmartNIC) to accelerate GPU-powered, input/output (IO)-intensive workloads such as distributed AI training in the enterprise data center and 5G processing at the edge.

BETTER I/O PERFORMANCE

NVIDIA H100 and ConnectX-7 are connected via an integrated PCIe Gen5 switch, which provides a dedicated high-speed path for data transfers between the GPU and the network. This eliminates the bottleneck of data passing through the host and delivers low, predictable latency, which is critical for time-sensitive applications such as 5G signal processing.
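
To see whether a given system exposes this capability, a minimal C sketch like the following (an illustration, not NVIDIA sample code) can query the CUDA runtime for GPUDirect RDMA support, the mechanism the dedicated GPU-to-NIC path builds on. It assumes CUDA 11.3 or later, which introduced the cudaDevAttrGPUDirectRDMASupported attribute.

/* Minimal sketch: check GPUDirect RDMA support, which the H100 CNX's
 * dedicated GPU-to-NIC path relies on. Assumes CUDA 11.3+.
 * Build (example): gcc check_gdr.c -o check_gdr -lcudart */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        int gdr = 0;
        /* Non-zero means a NIC can DMA straight into GPU memory,
         * bypassing host system memory. */
        cudaDeviceGetAttribute(&gdr, cudaDevAttrGPUDirectRDMASupported, dev);
        printf("GPU %d: GPUDirect RDMA %s\n",
               dev, gdr ? "supported" : "not supported");
    }
    return 0;
}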

BALANCED, STREAMLINED DESIGN

The integration of a GPU and a SmartNIC into a single device results in a balanced architecture by design. In systems where multiple GPUs are desired, a converged accelerator card enforces the optimal one-to-one GPU-to-NIC ratio. The design also avoids contention on the server’s PCIe bus, so performance scales linearly with additional devices.

COST SAVINGS

Because the GPU and SmartNIC are connected directly, customers can use mainstream PCIe Gen4 or even Gen3 servers to achieve a level of performance otherwise possible only with high-end or purpose-built systems. Using a single card also saves power, space, and PCIe device slots, enabling further cost savings by allowing a greater number of accelerators per server.

TURNKEY APPLICATION

Core software acceleration libraries such as the NVIDIA Collective Communications Library (NCCL) and Unified Communication X (UCX®) automatically use the best-performing path for data transfers to GPUs. Existing multi-node accelerated applications can therefore take advantage of the H100 CNX without modification, delivering immediate performance gains.
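
As an illustration of why no code changes are needed, the following minimal sketch performs a standard NCCL all-reduce across two local GPUs, following the single-process pattern from the NCCL documentation. The same call works unchanged whether NCCL routes the transfer over NVLink, host PCIe, or GPUDirect RDMA through the on-board ConnectX-7; error handling is elided for brevity.

/* Sketch: standard NCCL all-reduce, single process driving two GPUs,
 * after the pattern in the NCCL documentation. Nothing here is H100
 * CNX specific; NCCL chooses the transport at runtime.
 * Build (example): nvcc allreduce.c -o allreduce -lnccl */
#include <stdio.h>
#include <cuda_runtime.h>
#include <nccl.h>

int main(void) {
    const int ndev = 2;              /* assumption: two local GPUs */
    const size_t n = 1 << 20;        /* 1M floats per device */
    int devs[2] = {0, 1};
    ncclComm_t comms[2];
    cudaStream_t streams[2];
    float *sendbuf[2], *recvbuf[2];

    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(devs[i]);
        cudaMalloc((void **)&sendbuf[i], n * sizeof(float));
        cudaMalloc((void **)&recvbuf[i], n * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }
    ncclCommInitAll(comms, ndev, devs);

    /* The same call is used whether data moves over NVLink, host PCIe,
     * or GPUDirect RDMA through the NIC. */
    ncclGroupStart();
    for (int i = 0; i < ndev; ++i)
        ncclAllReduce(sendbuf[i], recvbuf[i], n, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < ndev; ++i) {
        cudaSetDevice(devs[i]);
        cudaStreamSynchronize(streams[i]);
        cudaFree(sendbuf[i]);
        cudaFree(recvbuf[i]);
        ncclCommDestroy(comms[i]);
    }
    printf("all-reduce complete\n");
    return 0;
}

In a multi-node job the same ncclAllReduce call is issued per rank; NCCL's transport selection, not the application, decides when to use the converged accelerator's direct path.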

FASTER AND MORE EFFICIENT AI SYSTEMS

MULTI-NODE DISTRIBUTED AI TRAINING

When running distributed AI training workloads that involve data transfers between GPUs on different hosts, servers often hit performance, scalability, and density limits. Typical enterprise servers have no PCIe switch, so the CPU must relay this traffic, which makes it a bottleneck, especially for virtual machines. Data transfers are bound to the speed of the host PCIe backplane, and contention can arise from an imbalance between the number of GPUs and NICs. Although a one-to-one ratio would be ideal, the number of PCIe lanes and slots in the server can limit the total number of devices.

The H100 CNX alleviates these problems. With a dedicated path from the network to the GPU, GPUDirect® RDMA can operate at near line rate, and data transfers run at PCIe Gen5 speeds regardless of the host PCIe backplane. GPU acceleration in a host can be scaled up in a balanced way because the ideal one-to-one GPU-to-NIC ratio is preserved. A server can also be equipped with more acceleration power, since converged accelerators require fewer PCIe lanes and device slots than discrete cards.
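
The enabling mechanism is that the NIC can read and write GPU memory directly. As a rough sketch of what that looks like at the verbs API level, the following C fragment registers a CUDA buffer with an RDMA NIC. It assumes the nvidia-peermem kernel module is loaded (which lets ibv_reg_mr accept device pointers) and omits queue-pair setup and most error handling.

/* Sketch: registering GPU memory with an RDMA NIC, the step that lets
 * GPUDirect RDMA move data GPU<->network without touching host RAM.
 * Assumes the nvidia-peermem kernel module is loaded so ibv_reg_mr()
 * accepts a CUDA device pointer.
 * Build (example): gcc reg_gpu_mr.c -o reg_gpu_mr -libverbs -lcudart */
#include <stdio.h>
#include <cuda_runtime.h>
#include <infiniband/verbs.h>

int main(void) {
    /* Open the first RDMA device; on an H100 CNX this is the on-board
     * ConnectX-7. */
    int num = 0;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list || num == 0) { fprintf(stderr, "no RDMA devices\n"); return 1; }
    struct ibv_context *ctx = ibv_open_device(list[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Allocate GPU memory and register it with the NIC. */
    void *gpu_buf = NULL;
    size_t len = 1 << 20;
    cudaMalloc(&gpu_buf, len);
    struct ibv_mr *mr = ibv_reg_mr(pd, gpu_buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (mr)
        printf("GPU buffer registered: lkey=0x%x rkey=0x%x\n",
               mr->lkey, mr->rkey);

    if (mr) ibv_dereg_mr(mr);
    cudaFree(gpu_buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(list);
    return 0;
}

Once the memory region is registered, remote peers can RDMA-read or RDMA-write the GPU buffer using the returned rkey, with no staging copy through host memory.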

ACCELERATING EDGE AI-ON-5G

The NVIDIA AI-on-5G platform consists of the hyperconverged NVIDIA EGX™ enterprise platform, the NVIDIA Aerial™ SDK for software-defined 5G virtual radio access networks (vRANs), and enterprise AI frameworks, including SDKs such as NVIDIA Isaac™ and NVIDIA Metropolis. This platform enables edge devices such as video cameras, industrial sensors, and robots to use AI and communicate with servers over 5G.

NVIDIA converged accelerators provide the highest-performance platform for running 5G applications. Because data does not have to traverse the host PCIe system, processing latency is significantly reduced. The same converged accelerator that accelerates 5G signal processing can also be used for edge AI, with NVIDIA Multi-Instance GPU (MIG) technology making it possible to partition the GPU among several applications. The H100 CNX provides all of this capability in a single enterprise server, without the need to deploy more expensive, purpose-built systems.
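
For instance, a deployment might query the MIG partitions through NVML and pin one application per slice. The sketch below is hypothetical: it assumes MIG has already been enabled and partitioned (e.g., with nvidia-smi) and an NVML version from CUDA 11 or later, and it abbreviates error handling. It enumerates the MIG instances and prints the UUIDs that can be passed to CUDA_VISIBLE_DEVICES.

/* Sketch: enumerate MIG instances via NVML so each workload can be
 * pinned to its own GPU slice. Assumes MIG is already enabled and
 * partitioned, and NVML from CUDA 11+.
 * Build (example): gcc list_mig.c -o list_mig -lnvidia-ml */
#include <stdio.h>
#include <nvml.h>

int main(void) {
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);

    unsigned int cur = 0, pending = 0;
    nvmlDeviceGetMigMode(dev, &cur, &pending);
    printf("MIG mode: %s\n",
           cur == NVML_DEVICE_MIG_ENABLE ? "enabled" : "disabled");

    unsigned int max = 0;
    nvmlDeviceGetMaxMigDeviceCount(dev, &max);
    for (unsigned int i = 0; i < max; ++i) {
        nvmlDevice_t mig;
        if (nvmlDeviceGetMigDeviceHandleByIndex(dev, i, &mig) != NVML_SUCCESS)
            continue;                    /* slot not populated */
        char uuid[NVML_DEVICE_UUID_V2_BUFFER_SIZE];
        nvmlDeviceGetUUID(mig, uuid, sizeof(uuid));
        /* Each UUID can be exported via CUDA_VISIBLE_DEVICES to give
         * one application exclusive use of one MIG slice. */
        printf("MIG instance %u: %s\n", i, uuid);
    }
    nvmlShutdown();
    return 0;
}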

H100 CNX – SPECIFICATIONS

Technical Specifications

GPU Memory          80 GB HBM2e
Memory Bandwidth    > 2.0 TB/s
MIG Instances       7 instances @ 10 GB each
                    3 instances @ 20 GB each
                    2 instances @ 40 GB each
Interconnect        PCIe Gen5: 128 GB/s
NVLink Bridge       2-way
Networking          1x 400 Gb/s, 2x 200 Gb/s ports, Ethernet or InfiniBand
Form Factor         Dual-slot, full-height, full-length (FHFL)
Max Power           350 W