AMD "Zen 3" is here, and we have with us the Ryzen 9 5950X, the absolute king of the series. This 16-core/32-thread processor is designed to bring the HEDT (high-end desktop) segment down to the mainstream desktop platform; that is, for those who don't care much about a quad-channel memory interface or tons of PCIe lanes. The fact that HEDT chips don't dominate mainstream chips at gaming goes to show that quad-channel isn't all that relevant to the client desktop segment. With the 5950X, you could get yourself HEDT-kind multi-core muscle, higher memory clock headroom to make up the bandwidth deficit, and a more affordable platform since the 5950X is compatible with even mainstream chipsets, such as the AMD B550.
The "Zen 3" microarchitecture introduces a massive 19% IPC uplift over "Zen 2," which already changed the desktop processor market in a big way. Higher IPC means higher single-threaded performance, which in turn means improved gaming performance, with AMD claiming that the Ryzen 9 5000-series beats the fastest Intel Core i9 "Comet Lake" part at gaming. Productivity performance should naturally be higher since you have 60% more of these faster cores than the i9-10900K.
The AMD Ryzen 9 5950X in this review uses the "Zen 3" architecture, which is a combination of micro and macro changes to the "Zen" architecture. At a macro level, we see AMD practically doing away with the 4-core CCX design, resulting in each of the two 7 nm "Zen 3" chiplets having a monolithic group of eight cores sharing a large 32 MB L3 cache. At a micro level, AMD has invested heavily in improving the various components inside the core, resulting in that sweet 19% IPC gain. Together, the higher-IPC cores and improved multi-core topology should put the 5950X on firmer ground against not just Intel's LGA1200 "Comet Lake" parts, but also its Core X "Cascade Lake-X" parts that go up to 18 cores.
AMD is launching the Ryzen 9 5950X at $799, making it the priciest mainstream desktop processor, which is right up there with HEDT parts. Since this is a Socket AM4 part that doesn't temper down its clock speeds to afford such a large core count, AMD claims that the 5950X will give you the best of both worlds: leadership AAA gaming performance and productivity that can potentially save you hundreds of dollars compared to an HEDT. In this review, we test the Ryzen 9 5950X against a large selection of processors.
|Processor||Price||Cores / Threads||Base Clock||Max. Boost||L3 Cache||TDP||Architecture||Process||Socket|
|Ryzen 7 1800X||$250||8 / 16||3.6 GHz||4.0 GHz||16 MB||95 W||Zen||14 nm||AM4|
|Core i7-8700K||$380||6 / 12||3.7 GHz||4.7 GHz||12 MB||95 W||Coffee Lake||14 nm||LGA 1151|
|Core i7-9700K||$380||8 / 8||3.6 GHz||4.9 GHz||12 MB||95 W||Coffee Lake||14 nm||LGA 1151|
|Core i7-10700K||$380||8 / 16||3.8 GHz||5.1 GHz||16 MB||125 W||Comet Lake||14 nm||LGA 1200|
|Ryzen 7 3700X||$325||8 / 16||3.6 GHz||4.4 GHz||32 MB||65 W||Zen 2||7 nm||AM4|
|Ryzen 7 3800X||$340||8 / 16||3.9 GHz||4.5 GHz||32 MB||105 W||Zen 2||7 nm||AM4|
|Ryzen 7 3800XT||$380||8 / 16||3.9 GHz||4.7 GHz||32 MB||105 W||Zen 2||7 nm||AM4|
|Ryzen 7 5800X||$450||8 / 16||3.8 GHz||4.7 GHz||32 MB||105 W||Zen 3||7 nm||AM4|
|Core i9-10900||$500||10 / 20||2.8 GHz||5.2 GHz||20 MB||65 W||Comet Lake||14 nm||LGA 1200|
|Ryzen 9 3900X||$460||12 / 24||3.8 GHz||4.6 GHz||64 MB||105 W||Zen 2||7 nm||AM4|
|Ryzen 9 3900XT||$470||12 / 24||3.8 GHz||4.7 GHz||64 MB||105 W||Zen 2||7 nm||AM4|
|Ryzen 9 5900X||$550||12 / 24||3.7 GHz||4.8 GHz||64 MB||105 W||Zen 3||7 nm||AM4|
|Core i9-9900K||$390||8 / 16||3.6 GHz||5.0 GHz||16 MB||95 W||Coffee Lake||14 nm||LGA 1151|
|Core i9-9900KS||$800||8 / 16||4.0 GHz||5.0 GHz||16 MB||127 W||Coffee Lake||14 nm||LGA 1151|
|Core i9-10900K||$550||10 / 20||3.7 GHz||5.3 GHz||20 MB||125 W||Comet Lake||14 nm||LGA 1200|
|Ryzen 9 3950X||$720||16 / 32||3.5 GHz||4.7 GHz||64 MB||105 W||Zen 2||7 nm||AM4|
|Ryzen 9 5950X||$800||16 / 32||3.4 GHz||4.9 GHz||64 MB||105 W||Zen 3||7 nm||AM4|
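The core-count and pricing comparison can be made concrete with a little arithmetic. The sketch below uses only prices and core counts from the table above; the variable and function names are illustrative, and price per core is just one rough way to frame value, not a performance measure.

```python
# Prices (USD) and core counts copied from the comparison table above.
processors = {
    "Ryzen 9 5950X": {"price": 800, "cores": 16},
    "Ryzen 9 3950X": {"price": 720, "cores": 16},
    "Ryzen 9 5900X": {"price": 550, "cores": 12},
    "Core i9-10900K": {"price": 550, "cores": 10},
}


def price_per_core(spec):
    """Return the list price divided by the number of physical cores."""
    return spec["price"] / spec["cores"]


per_core = {name: price_per_core(spec) for name, spec in processors.items()}

# Core-count advantage of the 5950X over the i9-10900K, in percent.
extra_cores_pct = (
    (processors["Ryzen 9 5950X"]["cores"] - processors["Core i9-10900K"]["cores"])
    * 100
    / processors["Core i9-10900K"]["cores"]
)

for name, value in per_core.items():
    print(f"{name}: ${value:.2f} per core")
print(f"5950X has {extra_cores_pct:.0f}% more cores than the i9-10900K")
```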
Unboxing and Photography
The Ryzen 9 5950X comes in a fairly large paperboard box. The face of the box features a brushed metal appearance, as opposed to the carbon fiber appearance of the Ryzen 3000 series box. There are enough pointers to let you know you're buying a Ryzen 5000-series part. A small cutout on the side shows the actual processor inside the package.
The processor looks like any conventional AMD CPU with a large IHS dominating the top, and a 1,331-pin micro-PGA on the bottom. The "Zen 3" CCD chiplet is made in Taiwan and the I/O die in the US, and the two are put together at a facility in China.
The retail Ryzen 9 5950X box does not include a cooler. Luckily, it can be paired with a fairly big selection of AM4-compatible coolers that have been released since 2017. Just make sure the cooler can handle thermal loads of at least 105 W.
The Zen 3 Microarchitecture
Since its 2017 debut, AMD has delivered a new iteration of its groundbreaking "Zen" CPU microarchitecture each year, each with IPC improvements. As mentioned earlier, the new "Zen 3" microarchitecture claims to offer a massive 19 percent IPC uplift over "Zen 2," its predecessor. This is accomplished by improvements at both the micro and macro level. We already detailed the macro (beyond the core) changes above. In this section, we talk about what's new inside each core. AMD talks about updates to practically all key components of the core, including its front end, fetch/decode, the integer and floating-point components, load-store, and dedicated caches.
Modern processors execute multiple instructions in parallel to improve performance. Computer programs are full of "if ... then ... else" constructs, which slow down the processor because it has to evaluate the condition before it knows which branch to execute. To overcome this limitation, the branch predictor was invented: a piece of circuitry that guesses the more likely outcome of the condition check and speculatively executes that branch's instructions. Of course, there's a chance that the prediction is wrong, in which case a performance penalty is incurred from discarding the work that was speculatively executed. With "Zen 3," AMD uses an improved TAGE branch predictor, which is more accurate and recovers faster from mispredictions. AMD also changed the design to be "bubble free," which avoids inserting "wait for result" stall slots into the instruction stream whenever a branch is encountered.
AMD generally increased ops/cycle, and the front end now switches faster between the op and instruction caches. The 32 KB L1 instruction cache has been tweaked for better utilization through efficient tagging and pre-fetching, and the op cache has been streamlined. Improvements to the branch predictor and front end add up to nearly a quarter of the overall 19% generational IPC uplift.
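To make the idea of branch prediction concrete, here is a minimal sketch of a classic 2-bit saturating-counter predictor. This is a textbook scheme that is far simpler than the TAGE predictor AMD describes, and it is not a model of the "Zen 3" hardware; the class name, the trace format, and the branch address are all made up for illustration.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counter (states 0..3); state >= 2 predicts taken."""

    def __init__(self):
        self.counters = {}  # branch address -> counter state

    def predict(self, address):
        # Unknown branches start in state 1 ("weakly not taken").
        return self.counters.get(address, 1) >= 2

    def update(self, address, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = self.counters.get(address, 1)
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        self.counters[address] = state


def count_mispredictions(predictor, trace):
    """Replay (address, taken) pairs and count wrong guesses."""
    misses = 0
    for address, taken in trace:
        if predictor.predict(address) != taken:
            misses += 1
        predictor.update(address, taken)
    return misses


# Hypothetical loop branch: taken 9 times, then not taken, repeated 3 times.
loop_trace = [(0x400, i % 10 != 9) for i in range(30)]
loop_misses = count_mispredictions(TwoBitPredictor(), loop_trace)
print(f"{loop_misses} mispredictions out of {len(loop_trace)} branches")
```

Even this simple scheme guesses a regular loop branch correctly most of the time, missing mainly at loop exits. Each miss stands in for the penalty of discarding speculative work.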
The execution engine, or combination of the integer and floating-point execution units, is the main math muscle of the CPU core. The "Zen 3" microarchitecture features improvements to both over "Zen 2." Both the INT and FP issue queues, which feed work to the two engines, have been widened, and the execution window enlarged. This ensures that fewer units are idle in typical programs, which increases overall performance.
AMD worked to minimize latencies at every stage of the INT execution engine and enlarged its key structures: the integer scheduler grows to 96 entries (from 92 on "Zen 2"), the physical register file to 192 entries (from 180), and the issue width to 10 per cycle (up from 7). Data picker bandwidth has been significantly increased despite the same number of ALUs. The floating-point engine features the same 256-bit FPUs, but just as with the INT engine, it has latency and bandwidth improvements across the board, a faster 4-cycle FMAC, and a larger scheduler. The INT and FP improvements contribute around a fifth of the 19% overall IPC uplift.
With the "Zen 3" microarchitecture, AMD addressed many bottlenecks and "intelligence" issues with the load/store unit. The biggest has to be bandwidth. The store queue has been enlarged to 64 entries from 48 on "Zen 2," and the L2 DTLB holds 2K entries. The 32 KB L1 data cache has been made faster, with lower latencies. Memory dependence detection has been improved. Much like the front-end improvements, the load/store improvements contribute nearly a quarter of the 19% overall IPC uplift, meaning that just by optimizing the non-execution components of its core (the front end and load/store), AMD managed to pull off roughly 9% of overall IPC uplift.
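The contribution figures quoted in this section can be tallied with simple arithmetic. In the sketch below, "nearly a quarter" is taken as exactly 0.25 and "around a fifth" as exactly 0.20. Those roundings are assumptions, so the results are slightly optimistic approximations rather than AMD's own numbers.

```python
TOTAL_IPC_UPLIFT = 19.0  # percent, AMD's claimed "Zen 2" -> "Zen 3" gain

# Approximate share of the total uplift per area (rounded, see assumptions above).
shares = {
    "front end + branch predictor": 0.25,  # "nearly a quarter"
    "INT + FP execution": 0.20,            # "around a fifth"
    "load/store": 0.25,                    # "nearly a quarter"
}

# Convert each share into percentage points of IPC uplift.
contributions = {area: share * TOTAL_IPC_UPLIFT for area, share in shares.items()}

# Non-execution components: front end plus load/store.
non_execution = contributions["front end + branch predictor"] + contributions["load/store"]

# Whatever is left is not broken down by the figures quoted in this section.
unattributed = TOTAL_IPC_UPLIFT - sum(contributions.values())

for area, points in contributions.items():
    print(f"{area}: ~{points:.2f} percentage points")
print(f"non-execution total: ~{non_execution:.2f} percentage points")
print(f"not itemized here: ~{unattributed:.2f} percentage points")
```

With the shares rounded up to a full quarter each, the non-execution total comes out a little above the roughly 9% quoted in the text, which is consistent with "nearly a quarter" being slightly less than 25%.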
ISA and Security Changes
Each new microarchitecture heralds support for newer instruction sets and security hardening, and the same is the case with "Zen 3." However, a notable absentee is AVX-512. Granted, Intel has adopted a less than perfect method of proliferating AVX-512, with certain instructions being exclusive to enterprise-segment microarchitectures and only a handful of client-relevant instructions on its "Ice Lake" and "Tiger Lake" architectures, but there's no movement from AMD in this direction.
You still do get 256-bit instructions from within the AVX2 set. Also missing in action is something to rival Intel's DLBoost, which is essentially a software exposure of fixed-function hardware that accelerates matrix multiplication, and in effect the building and training of deep-learning neural nets. A lot of client applications, particularly image manipulation and video editing, are leveraging edge AI, and some investment from AMD on this would have been nice. That said, "Zen 3" brings two ISA additions: MPK (memory protection keys) and AVX2 support for AES/PCLMULQDQ. AMD has had the edge over Intel in how its CPU core security is perceived, and with "Zen 3," AMD is introducing CET, or control-flow enforcement technology, which should provide hardening against ROP-type attacks.
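On Linux, one way to see which of these instruction-set extensions a processor reports is to look at the `flags` line of `/proc/cpuinfo`. The sketch below parses text in that format. The flag names (`avx2`, `avx512f`, `vaes`, `vpclmulqdq`, `pku`) are the names the Linux kernel is assumed to use for these features, and the sample text is a hand-written excerpt for illustration, not a dump from a real 5950X.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Human-readable label -> flag name as assumed to appear in /proc/cpuinfo.
FEATURES = {
    "AVX2": "avx2",
    "AVX-512 Foundation": "avx512f",
    "AVX2 AES (VAES)": "vaes",
    "AVX2 carry-less multiply (VPCLMULQDQ)": "vpclmulqdq",
    "Memory protection keys": "pku",
}


def feature_report(flags):
    """Map each feature label to True/False depending on whether its flag is present."""
    return {label: flag in flags for label, flag in FEATURES.items()}


# Hypothetical, abbreviated cpuinfo excerpt used only to exercise the parser.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 vaes vpclmulqdq pku\n"

sample_report = feature_report(parse_cpu_flags(SAMPLE_CPUINFO))
for label, present in sample_report.items():
    print(f"{label}: {'yes' if present else 'no'}")
```

To check a real machine, pass the contents of `/proc/cpuinfo` (for example `open("/proc/cpuinfo").read()`) instead of the sample string.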
Vermeer Multi-Chip Module
The AMD Ryzen 9 5950X "Zen 3" processor is built on a Socket AM4 multi-chip module package the company refers to as "Vermeer." With the Ryzen 3000 "Matisse," which was the first desktop processor to implement the 7 nm silicon fabrication process, AMD figured out a way to optimize the utilization of its 7 nm foundry allocation: it builds only those components that tangibly benefit from the new node on 7 nm, namely the CPU cores, and moves all other components to a separate die built on the older 12 nm process, the cIOD (client I/O die). The CPU cores are built on tiny dies with eight cores each, which AMD refers to as the CCD (CPU core die). On the older "Zen 2" microarchitecture, the eight cores were split into two groups of four cores each, called CPU core complexes (CCX). Each of the two CCXes on the "Zen 2" CCD had its own 16 MB L3 cache shared between its four cores, and communication between cores of different CCXes required a round-trip to the cIOD.
With the new "Zen 3" microarchitecture, the biggest high-level change with the CCD is AMD's enlargement of the CCX to now include up to eight cores (essentially taking up the whole CCD). There's now one 8-core CCX per CCD. The biggest dividend of this change has to be improved inter-core latency as the eight cores now share the same L3 cache; the other big dividend has to be cache size. Each core on the CCD now has access to the full 32 MB L3 as a victim cache, so lightly threaded workloads should see a performance uplift. 8-core Ryzen 7 5000-series models, such as the 5800X, feature a single CCD with all its cores enabled. 6-core parts, such as the Ryzen 5 5600X, feature one CCD with any two of its cores disabled (shouldn't matter which ones). The 12-core Ryzen 9 5900X and 16-core Ryzen 9 5950X are parts that have two 8-core CCDs besides the cIOD. The 5900X is carved out by disabling any two cores per CCD, while the 5950X has all cores enabled on both CCDs. We confirmed with AMD that Ryzen 5000 "Vermeer" uses the same exact 12 nm cIOD as the Ryzen 3000 "Matisse," with only a couple of non-physical improvements, such as improved memory clocks and clock domains.
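The topology change can be illustrated with a small model of which cores share an L3 cache. This is a sketch under a simplifying assumption: cores are numbered linearly and grouped into CCXes in order, which may not match how an operating system actually enumerates them. The function names are illustrative.

```python
ZEN2_CORES_PER_CCX = 4  # two 4-core CCXes per 8-core CCD, 16 MB L3 each
ZEN3_CORES_PER_CCX = 8  # one 8-core CCX per CCD, 32 MB L3


def ccx_of(core, cores_per_ccx):
    """Return the index of the CCX a core belongs to (linear numbering assumed)."""
    return core // cores_per_ccx


def shares_l3(core_a, core_b, cores_per_ccx):
    """Two cores share an L3 cache only if they sit in the same CCX."""
    return ccx_of(core_a, cores_per_ccx) == ccx_of(core_b, cores_per_ccx)


def l3_peers(core, total_cores, cores_per_ccx):
    """List the other cores that share an L3 cache with the given core."""
    return [
        other
        for other in range(total_cores)
        if other != core and shares_l3(core, other, cores_per_ccx)
    ]


# A 16-core part: how many neighbours can core 0 reach through its own L3?
zen2_peers = l3_peers(0, 16, ZEN2_CORES_PER_CCX)
zen3_peers = l3_peers(0, 16, ZEN3_CORES_PER_CCX)
print(f"Zen 2 layout: core 0 shares L3 with cores {zen2_peers}")
print(f"Zen 3 layout: core 0 shares L3 with cores {zen3_peers}")
```

In this model, cores 3 and 4 land in different CCXes under the 4-core grouping but in the same one under the 8-core grouping. On a two-CCD part, cores on different CCDs still do not share an L3 in either layout.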