AMD EPYC 7003 processor series

Introducing the third-generation AMD EPYC™, the world's most powerful server processor.

AMD recently introduced the new third-generation AMD EPYC™ server processors, which include the world's most powerful server processor, the EPYC™ 7763. The AMD EPYC™ 7003 series boosts your business productivity by enabling faster application performance.

EPYC™ in numbers

AMD EPYC™ is designed for data centers that rely on CPU performance and throughput. From hyper-converged infrastructure to databases to Big Data analytics and high-performance computing, workloads have more cores to work with. AMD EPYC™ 7003 Series processors scale from 8 to 64 cores (or 16 to 128 threads per socket). No other x86 server processor achieves this level of core density.

With AMD EPYC™ 7003, you get superior workload performance to deliver the resources your applications need.
AMD EPYC Benefits Image
The new AMD EPYC 7003 processor series promises more performance and delivers it, though at a price. The architecture is responsible for the performance increase: the Zen 3 cores are designed for double-digit growth over the predecessor, the EPYC 7002 (codenamed Rome) with its Zen 2 architecture.

Comparison Between Zen2 and Zen3 Image
Alongside the release of the new AMD EPYC 7003, AMD published reference data comparing it with Intel processors. On raw numbers alone, with up to 64 cores per socket, 128 PCI Express 4.0 lanes and eight memory channels for DDR4-3200, AMD currently offers significantly more than Intel does, at least until Intel releases its new "Ice Lake-SP" generation. Take a look for yourself.
AMD Vs Intel Performance Image

Memory interleaving

With the EPYC 7003 series, AMD introduces memory interleaving for configurations that use only six of the eight available memory channels. This allows more cost-effective memory configurations with a 4-, 6- or, as with the predecessors, 8-channel memory interface. DDR4-3200 is the highest supported speed with one DIMM per channel, and the maximum memory capacity is again 4 TB when all eight channels and all 16 slots are populated.
EPYC Cpu Memory Interleaving Image
AMD EPYC 7003 supports 8-, 6- and 4-channel memory (Image: AMD)
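
As a rough illustration of why the number of populated channels matters, the sketch below estimates the theoretical peak memory bandwidth per socket for 4-, 6- and 8-channel DDR4-3200 configurations. It assumes a 64-bit (8-byte) data bus per DDR4 channel, which is standard for DDR4 but is not stated above. The function name is illustrative, and sustained real-world throughput will be lower than these peak figures.

```python
# Theoretical peak DDR4 bandwidth per socket (a sketch, not a benchmark).
# Assumption: each DDR4 channel has a 64-bit (8-byte) data bus.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int = 3200) -> float:
    """Return the theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER
    return bytes_per_second / 1e9


if __name__ == "__main__":
    # The three channel counts the 7003 series can interleave across.
    for channels in (4, 6, 8):
        print(f"{channels} channels: {peak_bandwidth_gbs(channels):.1f} GB/s")
```

Under these assumptions, moving from four to eight populated channels doubles the theoretical peak, and a 6-channel configuration sits exactly halfway between the two.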

The most important element is your data

In a world of complexity, you need to make your workloads more secure, from processes to your data. AMD Infinity Guard is a suite of advanced security features built into silicon that defends against internal and external threats to your data and reduces potential attack surfaces when booting, running software and processing your critical data. With AMD Secure Encrypted Virtualization technologies, reinforced by Secure Nested Paging, AMD is contributing to breakthrough advances in data security, such as confidential computing.
Hardware Validated Boot Image

Microarchitecture overview

Processor compute cores, memory controller, I/O controller, RAS and security features are all integrated into the AMD EPYC™ 7003 Series System on Chip (SoC) processor. The AMD EPYC 7003 processor retains the proven multi-chip module (MCM) chiplet architecture of previous successful server-class AMD EPYC processors, while making further improvements and upgrading the processing units to new "Zen 3" cores.
EPYC 7003 Configuration with 8 CCD and IOD Image
Figure 1 AMD EPYC 7003 configuration with 8 Core Complex Dies (CCD) and Central I/O Die (IOD)

"Zen 3" Core

The EPYC 7003 series processor is based on new "Zen 3" computing cores. The "Zen 3" core is manufactured in a 7nm process and is designed to achieve an increase in instructions per cycle (IPC) over the previous-generation "Zen 2" cores. Each core supports Simultaneous Multi-Threading (SMT), which, when enabled, allows 2 threads per core to run simultaneously. Each core has optimized 32 KB L1 caches for instructions and data and a private 512 KB unified (instruction/data) L2 cache.

Core Complex (CCX) and Core Complex Die (CCD)

As shown in Figure 2, up to eight "Zen 3" cores share a last-level L3 cache of up to 32 MB. This grouping is called a Core Complex (CCX). When Simultaneous Multithreading (SMT) is enabled on each core, a CCX supports up to 16 concurrent hardware threads, with up to 4 MB of L2 cache in total (512 KB private to each core) and up to 32 MB of L3 cache that is shared across all cores within the CCX. In the EPYC 7003 series, each CCX occupies its own die, called a Core Complex Die (CCD).
CCX and CCD Image
Figure 2 Eight Compute Cores comprise a Core Complex (CCX) within a single die or CCD
EPYC 7003 Processor Internal Topology Image
Figure 3 EPYC 7003 Processor internal topology between CCDs, IO, and DDR Memory Interface within IOD Infinity Fabric
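
The per-CCX figures above follow from simple multiplication, which the sketch below makes explicit. The class and function names are hypothetical and chosen only for illustration; the default values are the maximums quoted in the text (eight cores, SMT enabled, 512 KB L2 per core, 32 MB L3), and the socket totals assume a fully populated part with eight CCDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreComplex:
    """One fully populated "Zen 3" CCX, using the maximums quoted in the text."""
    cores: int = 8
    threads_per_core: int = 2    # SMT enabled
    l2_kb_per_core: int = 512    # private L2 per core
    l3_mb: int = 32              # L3 shared by all cores in the CCX

    @property
    def hardware_threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def l2_total_mb(self) -> float:
        return self.cores * self.l2_kb_per_core / 1024


def socket_totals(ccx: CoreComplex, ccds: int = 8) -> dict:
    """Aggregate per-socket totals, assuming one CCX per CCD."""
    return {
        "cores": ccds * ccx.cores,
        "threads": ccds * ccx.hardware_threads,
        "l3_mb": ccds * ccx.l3_mb,
    }


if __name__ == "__main__":
    ccx = CoreComplex()
    print(ccx.hardware_threads, ccx.l2_total_mb)
    print(socket_totals(ccx))
```

With the defaults, one CCX yields 16 hardware threads and 4 MB of L2, and eight such CCDs add up to the 64 cores and 128 threads per socket mentioned earlier.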

I/O Die (Infinity Fabric™)

The CCDs are connected to memory, I/O and each other via the I/O die (IOD). Each CCD is connected to the IOD via a dedicated high-speed Global Memory Interconnect (GMI) link. The IOD also contains the memory channels, PCIe® Gen4 lanes and Infinity Fabric interconnects. All chips, or chiplets, are interconnected via AMD's Infinity Fabric technology. The fabric clock (FCLK) can now run at up to 1600 MHz, allowing it to be coupled with DDR4-3200 memory DIMMs, whose memory clock (MEMCLK) also runs at 1600 MHz. Running both clocks in sync reduces memory latency.
EPYC 7003 SoC Image
Figure 4 EPYC 7003 System on Chip (SoC): 8 CCDs and central IOD
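
The relationship between the DDR4 speed rating and the clocks mentioned above can be checked with simple arithmetic: DDR memory transfers data twice per clock cycle, so DDR4-3200 corresponds to a 1600 MHz memory clock. The sketch below uses hypothetical helper names, and the lower fabric clock in the example is an arbitrary value chosen only to show the uncoupled case.

```python
def memclk_mhz(ddr_rating: int) -> float:
    """Memory clock in MHz for a DDR speed rating given in MT/s.

    DDR ("double data rate") memory performs two transfers per clock cycle.
    """
    return ddr_rating / 2


def is_coupled(fclk_mhz: float, ddr_rating: int) -> bool:
    """True when the fabric clock matches the memory clock one to one."""
    return fclk_mhz == memclk_mhz(ddr_rating)


if __name__ == "__main__":
    print(memclk_mhz(3200))          # memory clock for DDR4-3200
    print(is_coupled(1600, 3200))    # FCLK and MEMCLK in sync
    print(is_coupled(1467, 3200))    # illustrative slower FCLK, not in sync
```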

Memory and I/O

In terms of the memory subsystem, the EPYC 7003 brings additional capability and a new 6-way interleave mode. Each processor in the EPYC 7003 series has 8 Unified Memory Controllers (UMC), one per memory channel. Each channel supports up to 2 DIMMs per channel (DPC), for a maximum of sixteen DIMMs per socket. A single processor can support 4 TB of DDR4 memory. While 8 memory channels are most common and generally provide the best performance, the IOD also provides the flexibility to support 4- and even 6-channel memory configurations. Each processor has eight x16 I/O links that provide up to 128 lanes of high-speed PCIe Gen4 I/O to the PCIe subsystem on single-socket platforms.
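
The per-socket limits quoted above can be derived from the per-channel and per-link figures, as the following sketch shows. The constant and function names are illustrative, and the per-DIMM capacity is only what the 4 TB maximum implies when 1 TB is taken as 1024 GB; it is not a figure taken from a specification.

```python
# Per-socket figures taken from the text above.
MEMORY_CHANNELS = 8       # one memory controller per channel
DIMMS_PER_CHANNEL = 2     # 2 DPC
MAX_MEMORY_TB = 4
IO_LINKS = 8
LANES_PER_LINK = 16       # each link is x16


def dimm_slots() -> int:
    return MEMORY_CHANNELS * DIMMS_PER_CHANNEL


def implied_dimm_size_gb() -> float:
    # Assumption: 1 TB = 1024 GB, capacity spread evenly over all slots.
    return MAX_MEMORY_TB * 1024 / dimm_slots()


def pcie_lanes() -> int:
    return IO_LINKS * LANES_PER_LINK


if __name__ == "__main__":
    print(f"DIMM slots per socket: {dimm_slots()}")
    print(f"Implied DIMM size for 4 TB: {implied_dimm_size_gb():.0f} GB")
    print(f"PCIe Gen4 lanes (single socket): {pcie_lanes()}")
```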

NUMA Topology

The AMD EPYC 7003 series processors use a Non-Uniform Memory Access (NUMA) microarchitecture. In addition, system BIOS settings allow a user to optimize this NUMA topology for their specific operating environment and workload. The NUMA Nodes Per Socket (NPS) BIOS setting can be used to set up a system with different NUMA configurations.
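
On a Linux host, the NUMA layout that results from the chosen BIOS setting can be inspected through sysfs, where each directory /sys/devices/system/node/nodeN contains a cpulist file such as "0-7,64-71". The sketch below is one minimal way to read it; the function names are illustrative, it assumes a Linux system, and it simply returns an empty result where that directory does not exist.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as "0-3,8,10-11" into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node id to the logical CPUs that belong to it."""
    nodes: dict[int, list[int]] = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes  # not Linux, or sysfs not available
    for node_dir in root.glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            node_id = int(node_dir.name[len("node"):])
            nodes[node_id] = parse_cpulist(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_nodes().items()):
        print(f"node {node_id}: {len(cpus)} logical CPUs")
```

Running this before and after changing the NPS setting shows how many NUMA nodes the operating system sees and which logical CPUs belong to each.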

For example, if we set NPS=4, we can divide the processor into quadrants, as shown in Figure 4 above. Each quadrant then has 2 CCDs, 2 UMCs, and 1 I/O hub. The shortest path between cores, memory and I/O is within the same quadrant. The longest path is between a core and a memory controller or I/O hub in the diagonally opposite quadrant. Placing workloads close to the memory and I/O they use is therefore an important aspect of performance tuning in a NUMA-based system. In addition, the NPS setting also controls the interleave pattern of the memory controllers: all channels within a NUMA node are interleaved with each other. A setting of NPS=4 partitions the processor into four NUMA domains, with each logical quadrant of the processor configured as its own NUMA domain. Memory is interleaved across the two memory channels in each quadrant. PCIe devices belong to one of the four NUMA domains of the processor, depending on which quadrant of the I/O die contains the PCIe root for that device.

A setting of NPS=1, on the other hand, means a single NUMA node per socket. This setting configures all memory channels on the processor into a single NUMA domain, so all cores on the processor, all attached memory and all PCIe devices connected to the SoC are in one NUMA domain. Memory accesses are interleaved across all eight memory channels in a single address space. As the NPS value increases, the number of interleaved channels per domain decreases accordingly. A setting of NPS=2 configures 2 NUMA domains per socket: half of the cores (4 CCDs) and half of the memory channels of each SoC are grouped into one NUMA domain, and the remaining cores and memory channels are grouped into a second domain. Memory is interleaved across the four memory channels in each NUMA domain. Additionally, in certain environments, performance can be further improved by pinning workloads to compute cores that all share a single last-level cache (LLC, that is, the L3 cache). The "LLC as NUMA" BIOS setting exposes this capability. Enabling it presents each CCD as a separate NUMA domain, since each CCD has its own L3 cache. A single 7003 processor with 8 CCDs would then have 8 NUMA nodes. In summary, a single 7003 series EPYC processor supports configurations ranging from a single NUMA node to up to 8 NUMA nodes per socket.
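
The NPS options described above can be summarised as a small lookup, sketched here for a single socket. The function name and dictionary keys are hypothetical, and the sketch assumes a fully populated part with 8 CCDs and 8 memory channels. It also assumes, following the text, that memory interleaving is governed by the NPS value even when the LLC-as-NUMA option is enabled.

```python
CCDS_PER_SOCKET = 8        # assumption: fully populated 8-CCD part
CHANNELS_PER_SOCKET = 8


def numa_layout(nps: int, llc_as_numa: bool = False) -> dict:
    """Describe the per-socket NUMA layout for a given NPS setting.

    nps          NUMA Nodes Per Socket BIOS setting (1, 2 or 4)
    llc_as_numa  when True, every CCD (one L3 cache) is its own NUMA node
    """
    if nps not in (1, 2, 4):
        raise ValueError("this sketch only covers NPS=1, NPS=2 and NPS=4")
    nodes = CCDS_PER_SOCKET if llc_as_numa else nps
    return {
        "numa_nodes": nodes,
        "ccds_per_node": CCDS_PER_SOCKET // nodes,
        # Interleaving follows the NPS setting (assumption stated above).
        "interleaved_channels": CHANNELS_PER_SOCKET // nps,
    }


if __name__ == "__main__":
    for nps in (1, 2, 4):
        print(f"NPS={nps}: {numa_layout(nps)}")
    print(f"NPS=1 with LLC as NUMA: {numa_layout(1, llc_as_numa=True)}")
```

For NPS=4 this yields four nodes with 2 CCDs and 2 interleaved channels each, matching the quadrant description above.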




Dual socket configurations

EPYC 7003 series processors can be used in single-socket or dual-socket system configurations. Processors with the 'P' suffix in their name are designed for single-socket configurations only. In two-socket configurations, both processors must be identical: two different processor OPNs or steppings cannot be used in the same 2-socket system. In 2-socket systems, the two EPYC 7003 series SoCs are connected via their Infinity Fabric links, also called External Global Memory Interconnect (xGMI) links. This creates a high-bandwidth, low-latency connection between the two processors. System builders can use either 3 or 4 Infinity Fabric connections, depending on the I/O and bandwidth targets of the system design. In a 2-socket system, there are a total of 16 memory channels, 8 per socket. The Infinity Fabric links use the same physical connections as the PCIe lanes in the system. So in a two-socket configuration, up to half of the 128 lanes on each socket become Infinity Fabric links. Since each socket then still has 64 PCIe lanes, the system has a total of 128 PCIe lanes. In some cases, a system designer may want more PCIe lanes. The designer can then allocate up to 160 lanes to PCIe (80 per socket), leaving 48 lanes per socket for Infinity Fabric links instead of 64. A dual-socket system can be configured in several ways, including as a 1, 2, 4, 8, or 16 NUMA domain system.
EPYC Processor Connection Image
Figure 5 Two EPYC 7003 Processors connect through 4 xGMI links (NPS1)
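
The trade-off between Infinity Fabric links and PCIe lanes described above is plain arithmetic, sketched below. The function name is hypothetical, and the figure of 16 lanes per xGMI link is derived from the numbers in the text (64 lanes for 4 links, 48 lanes for 3 links) rather than taken from a specification.

```python
LANES_PER_SOCKET = 128
LANES_PER_XGMI_LINK = 16   # derived from the text: 64 / 4 and 48 / 3


def dual_socket_pcie_lanes(xgmi_links: int) -> int:
    """Total PCIe lanes left in a 2-socket system for a given xGMI link count."""
    if xgmi_links not in (3, 4):
        raise ValueError("the text describes 3 or 4 Infinity Fabric links")
    fabric_lanes_per_socket = xgmi_links * LANES_PER_XGMI_LINK
    pcie_lanes_per_socket = LANES_PER_SOCKET - fabric_lanes_per_socket
    return 2 * pcie_lanes_per_socket


if __name__ == "__main__":
    for links in (4, 3):
        print(f"{links} xGMI links: {dual_socket_pcie_lanes(links)} PCIe lanes")
```

Choosing 3 links instead of 4 trades some socket-to-socket bandwidth for 32 additional PCIe lanes across the system.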

Leadership in workload performance

Whether you run your solutions in the cloud, on-prem or off-prem, in containers or VMs, on bare metal or HCI, the AMD EPYC™ 7003 Series delivers outstanding performance across the broad spectrum of standard applications.

As the leader in high-performance computing, AMD continues to raise the bar for computing in the data center by continuously executing on its roadmap and championing innovation.
Workstation Performance Image

The secret lies beneath the surface.

Based on the "Zen 3" core and AMD Infinity Architecture, the new AMD EPYC™ 7003 Series processors offer a comprehensive feature set with industry-leading I/O, 7nm x86 CPU technology and an integrated security processor on die. EPYC™ 7003 CPUs offer up to 32 MB of shared L3 cache available to each core, 4-, 6- and 8-channel memory interleaving for better savings and performance across multiple DIMM configurations, and synchronized clocks between fabric and memory, all leading to faster time to results.

Capture the full value of your IT investment

From traditional application delivery to the latest innovations, AMD EPYC™ 7003 processors give you the system resources and capacity your applications need.

As of January 2021, there are over 132 AMD EPYC™ powered VMware® certified platforms. The broad ecosystem and support for open tools and libraries are additional reasons why top cloud providers choose AMD.

AMD EPYC™ powered servers deliver the results you need, when you need them, and help you quickly leverage the impact on your business - no matter how, where or when your applications run.
EPYC benefits Image

Model specifications
