Friday, February 18th 2022

AMD Zen3+ Architecture and Ryzen 6000 "Rembrandt" Mobile Processors Detailed

AMD on Thursday unveiled its Ryzen 6000 series "Rembrandt" mobile processors. The company claims these chips offer generational increases in CPU performance, along with big leaps in energy-efficiency and integrated graphics performance. At the heart of these processors is the new 6 nm "Rembrandt" silicon that the company is building on the TSMC N6 silicon fabrication node that leverages EUV lithography.

The "Rembrandt" silicon broadly combines an 8-core/16-thread CPU based on the new Zen 3+ microarchitecture, a large new iGPU based on the RDNA2 graphics architecture, complete with real-time ray tracing support; a DDR5 + LPDDR5 memory controller, and a full PCI-Express Gen4 root-complex. The iGPU, memory interface, and PCIe interface are generational updates over the previous-gen "Cezanne," and it may seem like the CPU is largely unchanged, but AMD claims there are several optimizations that have gone into the CPU to earn the "+" tag.
The biggest engineering investment with "Rembrandt" is its ground-up power-management redesign that heavily leverages power-gating (as opposed to clock-gating). Every major component on the processor, including the individual CPU cores, the individual iGPU compute units, the memory controllers, and the display controller, can be put to sleep through power-gating (cutting their power), and woken up, on a millisecond timescale. This allows the processor to use fraction-of-a-second opportunities within normal use (such as a still screen when reading a document or a web-page) to put select components to sleep. These "power-naps" have a compounding effect on power efficiency, and AMD claims significant battery-life improvements.
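
To see why frequent short naps add up, here is a minimal sketch of a time-weighted power model. The function name and every wattage and idle-fraction figure are hypothetical placeholders chosen for illustration; none of them are AMD data.

```python
def average_power_w(active_w, gated_w, idle_fraction):
    """Time-weighted average power of a block that is power-gated while idle."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return active_w * (1.0 - idle_fraction) + gated_w * idle_fraction


# Hypothetical block: 2 W when powered, 0.02 W when gated, idle 90% of the time.
never_gated = average_power_w(active_w=2.0, gated_w=2.0, idle_fraction=0.9)
power_gated = average_power_w(active_w=2.0, gated_w=0.02, idle_fraction=0.9)
print(f"never gated: {never_gated:.3f} W, power-gated: {power_gated:.3f} W")
```

The model ignores the energy cost of entering and leaving the gated state, which is the cost that fast wake-up hardware is meant to keep small.
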
The power-optimization of "Rembrandt" is achieved by adopting a five-pronged approach. The first of course is the energy-efficiency gains obtained from the switch to the 6 nm process, which yields a roughly 18% transistor density gain, and improvements with iso-power. The second is optimizations to the CPU microarchitecture, with fine-grained power-gating of individual cores within the CPU. The third includes the SoC-level power optimization that introduces several new power-planes, deep-partitioning of components that allows pretty much all redundant/scalable components to be turned off when not needed. The fourth is firmware-level optimization. The system firmware now has greater interactivity with the OS to understand the nature of the performance demand. And lastly, at the platform-level, AMD works with notebook OEMs to choose the most efficient discrete components that make up the system, along with AMD Advantage device co-engineering.
The new Zen 3+ CPU core comes with over 50 new or updated features over Zen 3. Eight of these key enhancements are a re-engineering of all design-elements to reduce power leakage; a hardware-assisted wake from sleep, called PC6 Restore; de-coupled L3 cache initialization to allow for faster wake times; per-thread UEFI CPPC capability (as opposed to per-core in the previous-generation); several cache-level power optimizations, including the ability to disallow power-down of caches if there are too many cache misses (which saves power in the long run from having to wake up the cache); a granular peak-current control that ramps power as needed instead of an "all-or-nothing" power-up of components; an intelligent CPU core wake sequence that takes into account usage patterns before sleep; and the new Enhanced CC1 state that puts cores to sleep based on low utilization. The "Rembrandt" silicon has one Zen 3+ CCX (CPU core complex) with 8 CPU cores. Each core has 32 KB of L1I cache, 32 KB of L1D cache, and a dedicated 512 KB L2 cache; all eight cores share a 16 MB L3 cache.
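
The per-core and shared cache sizes quoted above can be added up as a quick sanity check. This is only arithmetic on the figures in the text; the constant and function names are made up for the example.

```python
CORES = 8
L1I_KB = 32        # per core
L1D_KB = 32        # per core
L2_KB = 512        # per core
L3_SHARED_MB = 16  # shared by the whole CCX


def cache_totals_kb(cores=CORES):
    """Return total L1, L2, L3, and combined cache capacity in KB."""
    l1 = cores * (L1I_KB + L1D_KB)
    l2 = cores * L2_KB
    l3 = L3_SHARED_MB * 1024
    return {"l1": l1, "l2": l2, "l3": l3, "total": l1 + l2 + l3}


totals = cache_totals_kb()
print(totals)
print(f"combined: {totals['total'] / 1024:.1f} MB")
```
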
The SoC-wide power optimizations include C-states for Infinity Fabric, the interconnect that binds all the components on the SoC. Infinity Fabric clock and bandwidth now scale with workload. The processor now has the ability to reduce SoC-wide power draw by 99%, keeping just the display on in self-refresh mode. The memory controllers, too, can be turned off, leaving the DDR5 memory running in self-refresh. The fast sleep and restore accelerators provide the largest chunk of AMD's power optimization, so individual components can be put to sleep and woken up in millisecond intervals. These include the CPU cores, the iGPU CUs, the Infinity Fabric, the memory controller, and display engine. Platform-level power optimizations incorporate power savings from the use of LPDDR5 memory; displays with less than 1 W power draw; panel self-refresh capability, wherein power is saved in display data transmission if there's nothing new on the screen; and panel delta updates (the ability to refresh only select regions of the panel, e.g. the one that's displaying a real-time clock or system notifications).

AMD introduced several firmware-level power optimizations. The system firmware now works along with drivers to achieve greater interoperability with the OS to help with management of power, performance, thermals, and acoustics relative to every workload scenario. Windows 11 does away with slider-based power-performance scaling, and so AMD's power management works in the background to automatically scale performance to your needs.

The company works with notebook manufacturers to introduce several display panel-level power-optimizations, including support for new Z-power states, which give the platform the ability to completely power down the display controller; getting OEMs to use new SVI3 voltage regulator for display panels; various device design optimizations from AMD Advantage co-engineering; support for new-generations of display panels with typical power-draw under 1 W; and the new AMD FreeSync PSR-SU (panel self-refresh and selective-update) technology.

FreeSync PSR-SU dynamically brings down refresh-rate to sync with what's being displayed on screen. If there's a 24 FPS video playback, panel refresh-rate is brought down to match the frame-rate. Selective update allows different regions of the display to update at a different rate. Display Stream Compression and Forward Error Correction (DSC and FEC) are leveraged to reduce the number of embedded DisplayPort (eDP) lanes, resulting in additional power savings.
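
The idea behind selective update can be illustrated with a toy dirty-rectangle calculation: compare two frames and find the smallest region that actually changed. This is a conceptual sketch only. The real feature lives in the display hardware and driver, and the `dirty_rect` helper and the tiny frames below are hypothetical.

```python
def dirty_rect(prev, curr):
    """Return (top, left, bottom, right), inclusive, bounding all changed
    pixels between two equally sized frames, or None if nothing changed."""
    rows = [y for y, (a, b) in enumerate(zip(prev, curr)) if a != b]
    if not rows:
        return None
    cols = [
        x
        for y in rows
        for x, (p, c) in enumerate(zip(prev[y], curr[y]))
        if p != c
    ]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4x6 frame in which only a small "clock" area changes.
before = [[0] * 6 for _ in range(4)]
after = [row[:] for row in before]
after[1][2] = 1
after[2][4] = 1
print(dirty_rect(before, after))
```

Only the pixels inside the returned rectangle would need to be sent to the panel; an unchanged frame returns None, which corresponds to the panel refreshing itself from its own buffer.
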
The PCI-Express interface sees an update to PCI-Express Gen4 spec. The processor now puts out 8 PCI-Express Gen 4 lanes toward a discrete GPU, 4 Gen4 lanes toward a CPU-attached M.2 NVMe SSD, and the remaining 4 lanes toward chipset-bus.
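
For a sense of scale, the lane split above can be turned into approximate link bandwidth. The sketch uses the PCI-Express 4.0 raw rate of 16 GT/s per lane and its 128b/130b encoding. Results are per direction and before protocol overhead, so real-world throughput will be lower. The dictionary keys are just labels for this example.

```python
GEN4_GT_PER_S = 16.0             # PCIe 4.0 raw transfer rate per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding

# Lane allocation as described in the article.
LANES = {"discrete GPU": 8, "M.2 NVMe SSD": 4, "chipset bus": 4}


def link_bandwidth_gb_s(lanes):
    """Approximate one-direction bandwidth in GB/s for a Gen4 link."""
    return lanes * GEN4_GT_PER_S * ENCODING_EFFICIENCY / 8


for name, lanes in LANES.items():
    print(f"{name}: x{lanes} = {link_bandwidth_gb_s(lanes):.2f} GB/s")
print("total lanes:", sum(LANES.values()))
```
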

The Radeon 600M series integrated graphics solution leverages the company's latest RDNA2 graphics architecture. It features 12 compute units amounting to 768 stream processors, 48 TMUs, 16 ROPs, and 12 Ray Accelerators. The only things setting this iGPU apart from the discrete Radeon RX 6400 are its ROP count (16 vs. 32), lack of Infinity Cache, and lack of dedicated memory. There are two iGPU models based on the number of CUs enabled. The Radeon 680M comes with all 12 CUs enabled, while the Radeon 660M has 6 of them enabled (amounting to 384 stream processors).
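
The unit counts follow directly from the number of enabled compute units, since each RDNA2 CU carries 64 stream processors, 4 TMUs, and one Ray Accelerator. A short sketch of that arithmetic follows. The TMU and Ray Accelerator counts it prints for the 660M are derived here from those per-CU ratios rather than quoted from AMD.

```python
SP_PER_CU = 64   # stream processors per RDNA2 compute unit
TMU_PER_CU = 4   # texture mapping units per compute unit
RA_PER_CU = 1    # Ray Accelerators per compute unit


def igpu_resources(compute_units):
    """Derive per-iGPU unit counts from the number of enabled CUs."""
    return {
        "stream_processors": compute_units * SP_PER_CU,
        "tmus": compute_units * TMU_PER_CU,
        "ray_accelerators": compute_units * RA_PER_CU,
    }


for model, cus in {"Radeon 680M": 12, "Radeon 660M": 6}.items():
    print(model, igpu_resources(cus))
```
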

AMD claims that when combined with the right settings, and FidelityFX Super Resolution (FSR), the Radeon 680M provides sufficient performance for 1080p gaming, while the 660M should provide enough for today's visually-intensive non-gaming accelerated workloads. The Video CoreNext (VCN) component is the same as the one found in RX 6800-series discrete GPUs, and provides hardware-accelerated decoding of 10-bit AV1, VP9, and HEVC.

When it comes to gaming performance, AMD claims a nearly 2X performance lead over the Xe LP-based Iris Xe iGPU powering the Core i7-1185G7, which has 96 EUs. AMD is also claiming performance in the league of the GeForce GTX 1650 Max-Q "Turing" discrete GPU, which means it is already beating most GeForce MX series discrete GPUs found in entry-level gaming notebooks. For enthusiast-segment gaming notebooks, AMD is pushing the Radeon RX 6800S as the discrete GPU of choice, squaring off against the combination of the previous-generation Ryzen 9 5900HS and RTX 3080 Laptop GPU.

AMD is debuting the Ryzen 6000 series with 10 processor models, which include two models for the Thin-and-Light segment (15 to 28 W), four models for the Thin Enthusiast segment (35 W class), and four for the Ultra Enthusiast (45 W and >45 W) category. All segments have 8-core/16-thread SKUs across the Ryzen 7 series. The 35 W and 45 W segments include Ryzen 9 SKUs with higher clock-speeds. The Ryzen 5 series are 6-core/12-thread parts across all segments.
AMD is claiming a 2.62X lead over the Core i9-12900HK "Alder Lake" processor in CPU performance-per-Watt, measured using Cinebench R20 nT. The Ryzen 9 6900HS scored 5733 points in this test, compared to 6894 points for the i9-12900HK (the Intel chip is 20% faster). AMD boasts about the fact that in comparable categories of 28 W, AMD offers 8 performance cores, compared to Intel's P-core count of 6, whereas Intel 15 W category chips only have 2 P-cores and rely heavily on the E-core clusters.
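
Taken together, the two scores and the performance-per-Watt claim imply how much more power the Intel chip drew in this test. The sketch below works that out, assuming the 2.62X figure and both scores come from the same run. The measurement conditions are not given here, and the function name is made up for the example.

```python
def implied_power_ratio(score_a, score_b, perf_per_watt_lead_a):
    """Return power_b / power_a implied by two benchmark scores and
    chip A's performance-per-Watt lead over chip B.

    From (score_a / power_a) / (score_b / power_b) = lead it follows that
    power_b / power_a = lead * score_b / score_a.
    """
    return perf_per_watt_lead_a * score_b / score_a


RYZEN_9_6900HS = 5733   # Cinebench R20 nT score quoted above
CORE_I9_12900HK = 6894  # Cinebench R20 nT score quoted above

raw_lead = CORE_I9_12900HK / RYZEN_9_6900HS - 1
power_ratio = implied_power_ratio(RYZEN_9_6900HS, CORE_I9_12900HK, 2.62)
print(f"Intel raw score lead: {raw_lead:.1%}")
print(f"Implied Intel/AMD power ratio: {power_ratio:.2f}x")
```

In other words, under that assumption the roughly 20% higher Intel score would have come at a little over three times the power.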

The 15 W AMD Ryzen 7 6800U with its 8-core/16-thread CPU posts big performance gains over the previous-generation 5800U, as well as Intel's previous-generation i7-1185G7 "Tiger Lake-U" processor, which is a 28 W-category chip. The 6800U can be configured for 28 W, where it posts even higher performance across a number of use-cases.

The Ryzen 9 6900HX isn't even the top part from this series, but is shown to post performance leads of between 8% and 47% over the 8-core/16-thread Core i9-11980HK, the previous-generation flagship based on the 8-core "Tiger Lake" silicon.

And lastly, AMD is claiming up to 24 hours of battery life for its 15 W-class and 28 W-class notebooks, the combined result of its power optimizations, the Adaptive Power Control framework, and the 6 nm silicon fabrication process.

Performance and Efficiency Claims by AMD
AMD summarizes the various I/O capabilities of "Rembrandt" in this slide. The chip supports 40 Gbps-capable USB4 without the need for discrete controllers; PCI-Express 4.0 for discrete graphics as well as CPU-attached NVMe; DDR5 and LPDDR5 memory; the latest generation AMD+MediaTek WiFi 6E + Bluetooth LE 5.2 wireless interfaces; a feature-rich TPM based on Microsoft Pluton; the latest generation display outputs, including HDMI 2.1 and DisplayPort 2.0; and acceleration for the latest video formats.

The complete AMD slide-deck follows.