Multi-Instance GPU (MIG) expands the performance and value of each NVIDIA A100 Tensor Core GPU. MIG can partition the A100 GPU into as many as seven instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. Now administrators can support every workload, from the smallest to the largest, offering a right-sized GPU with guaranteed quality of service (QoS) for every job, optimizing utilization and extending the reach of accelerated computing resources to every user.
Expand GPU Access to More Users
With MIG, you can achieve up to 7X more GPU resources on a single A100 GPU. MIG gives researchers and developers more resources and flexibility than ever before.
Optimize GPU Utilization
MIG provides the flexibility to choose from many different instance sizes, which allows a right-sized GPU instance to be provisioned for each workload, ultimately delivering optimal utilization and maximizing data center investment.
Run Simultaneous Mixed Workloads
MIG enables inference, training, and high-performance computing (HPC) workloads to run at the same time on a single GPU with deterministic latency and throughput.
HOW THE TECHNOLOGY WORKS
Without MIG, different jobs running on the same GPU, such as different AI inference requests, compete for the same resources, like memory bandwidth. A job that consumes a large share of memory bandwidth starves the others, causing several jobs to miss their latency targets. With MIG, jobs run simultaneously on different instances, each with dedicated resources for compute, memory, and memory bandwidth, resulting in predictable performance with quality of service and maximum GPU utilization.
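The contention effect can be sketched with a toy model (purely illustrative; the bandwidth figures and slowdown formula are simplifying assumptions, not real A100 measurements): jobs sharing one bandwidth pool all slow down when the pool is oversubscribed, while jobs on dedicated slices are only affected by their own demand.

```python
# Toy model of memory-bandwidth contention (illustrative only; numbers
# and the linear slowdown rule are assumptions for the sketch).

def shared_latency(job_demands_gbs, total_bw_gbs=1555.0):
    """Latency multiplier per job when all jobs share one bandwidth pool."""
    demand = sum(job_demands_gbs)
    # When aggregate demand exceeds the pool, every job slows down together.
    slowdown = max(1.0, demand / total_bw_gbs)
    return [slowdown] * len(job_demands_gbs)

def mig_latency(job_demands_gbs, instance_bw_gbs):
    """Latency multiplier per job when each job has a dedicated slice."""
    return [max(1.0, d / bw) for d, bw in zip(job_demands_gbs, instance_bw_gbs)]

jobs = [400.0, 400.0, 900.0]           # per-job bandwidth demand (GB/s)
print(shared_latency(jobs))            # the heavy job slows all three jobs
print(mig_latency(jobs, [500.0] * 3))  # only the heavy job exceeds its slice
```

In the shared case the 900 GB/s job drags every job past its latency target; with dedicated slices, the two light jobs keep a multiplier of 1.0 regardless of the heavy neighbor, which is the predictability MIG's hardware partitioning provides.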
Achieve Ultimate Data Center Flexibility
An NVIDIA A100 GPU can be partitioned into different-sized MIG instances. For example, an administrator could create two instances with 20 gigabytes (GB) of memory each, three instances with 10 GB, seven instances with 5 GB, or a mix of these. This lets system administrators provide right-sized GPUs to users for different types of workloads.
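The memory arithmetic behind these configurations can be sketched as follows. The profile names mirror NVIDIA's published MIG profile naming (e.g. 1g.5gb on a 40 GB A100), but this table is a simplified illustration: the real driver also enforces placement rules beyond simple memory and slice totals.

```python
# Simplified view of MIG profiles on a 40 GB A100 (illustrative only;
# actual placement constraints are enforced by the NVIDIA driver).
PROFILES = {
    "1g.5gb":  {"memory_gb": 5,  "compute_slices": 1},
    "2g.10gb": {"memory_gb": 10, "compute_slices": 2},
    "3g.20gb": {"memory_gb": 20, "compute_slices": 3},
    "7g.40gb": {"memory_gb": 40, "compute_slices": 7},
}

def fits(config, total_mem_gb=40, total_slices=7):
    """Check whether a list of profile names fits on one GPU."""
    mem = sum(PROFILES[p]["memory_gb"] for p in config)
    slices = sum(PROFILES[p]["compute_slices"] for p in config)
    return mem <= total_mem_gb and slices <= total_slices

print(fits(["3g.20gb"] * 2))  # two 20 GB instances
print(fits(["2g.10gb"] * 3))  # three 10 GB instances
print(fits(["1g.5gb"] * 7))   # seven 5 GB instances
print(fits(["3g.20gb"] * 3))  # exceeds both memory and compute slices
```

The same check also admits mixed configurations, such as one 20 GB, one 10 GB, and one 5 GB instance on the same GPU.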
MIG instances can also be dynamically reconfigured, enabling administrators to shift GPU resources in response to changing user and business demands. For example, seven MIG instances can be used during the day for low-throughput inference and reconfigured to one large MIG instance at night for deep learning training.
Deliver Exceptional Quality of Service
Each MIG instance has a dedicated set of hardware resources for compute, memory, and cache, delivering guaranteed quality of service (QoS) and fault isolation for the workload. That means that failure in an application running on one instance doesn’t impact applications running on other instances. And different instances can run different types of workloads—interactive model development, deep learning training, AI inference, or HPC applications. Since the instances run in parallel, the workloads also run in parallel—but separate and isolated—on the same physical A100 GPU.
MIG is a great fit for workloads such as AI model development and low-latency inference. These workloads can take full advantage of A100’s features and fit into each instance’s allocated memory.
WATCH MIG IN ACTION
BUILT FOR IT AND DEVOPS
MIG enables fine-grained GPU provisioning by IT and DevOps teams. Each MIG instance behaves like a standalone GPU to applications, so there is no change to the CUDA® platform. MIG can be used in all the major enterprise computing environments.
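Because each instance appears as a standalone GPU, existing CUDA applications typically select one the same way they would select a whole device, via the `CUDA_VISIBLE_DEVICES` environment variable. A common pattern is sketched below; the UUID is a placeholder, and on a real system it would come from `nvidia-smi -L`.

```python
import os

# A MIG instance is addressed like a GPU via CUDA_VISIBLE_DEVICES.
# The UUID below is a placeholder, not a real device identifier.
mig_uuid = "MIG-00000000-0000-0000-0000-000000000000"
os.environ["CUDA_VISIBLE_DEVICES"] = mig_uuid

# Any CUDA-based framework initialized after this point sees only that
# instance, so application code is unchanged -- it uses "device 0" as usual.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

This is why no CUDA platform changes are needed: the selection mechanism is the same one applications already use for whole GPUs.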