We have with us the ASUS ROG STRIX Radeon RX 6700 XT O12G, the company's top custom-design RX 6700 XT graphics card. AMD today launched the RX 6700 XT along with the new Navi 22 silicon to take the fight to NVIDIA's sub-$500 performance segment, including popular SKUs such as the GeForce RTX 3060 Ti and the RTX 3070. It establishes 12 GB as the new memory-size standard and promises full DirectX 12 Ultimate capability, including real-time raytracing. The target buyer is a serious gamer looking to play both AAA and e-sports titles at 1440p with maxed-out settings.
The Radeon RX 6700 XT is based on the latest RDNA2 graphics architecture that brings a compelling feature set to the PC market, nearly leveling up to NVIDIA. AMD's approach to real-time raytracing is to use special purpose components called Ray Accelerators, which calculate ray-intersection, while compute shaders handle everything else, including de-noising. To achieve this, AMD significantly bolstered the programmable shaders of this architecture, improving their IPC and running them at extremely high engine clocks. A side-effect of this is the card's high performance outlook with games that don't have raytracing.
The Navi 22 silicon at the heart of the RX 6700 XT physically features 40 RDNA2 compute units, which works out to 2,560 stream processors, 40 Ray Accelerators, 160 TMUs, and 64 ROPs. The increased 12 GB memory size over the past generation comes with a catch—the memory bus width is narrowed to 192-bit. AMD attempted to overcome this by using the fastest standard GDDR6 memory chips that run at 16 Gbps and Infinity Cache, a fast on-die 96 MB cache memory that cushions data transfers between the GPU and memory.
The ROG STRIX RX 6700 XT OC comes with the latest-generation DirectCU III cooler that's used across RTX 30-series and RX 6000 series ROG cards from ASUS. It features a meaty aluminium fin-stack heatsink that's ventilated by a trio of Axial-Tech fans. You also get goodies such as dual-BIOS, a blinding amount of RGB LED illumination, additional fan headers to let you synchronize your case ventilation to the card, and an additional RGB out. According to ASUS, the MSRP of the RX 6700 XT STRIX OC is $830. I would have estimated it to end up at $800 after scalping, but it looks like I might be wrong.
Our Radeon RX 6700 XT launch-day coverage spans six articles, including this one. Do check them out!
AMD Radeon RX 6700 XT (reference) | MSI Radeon RX 6700 XT Gaming X | Sapphire Radeon RX 6700 XT NITRO+ | PowerColor Radeon RX 6700 XT Red Devil | XFX Radeon RX 6700 XT Speedster Merc 319
| Card | Price | Shader Units | ROPs | Core Clock | Boost Clock | Memory Clock | GPU | Transistors | Memory |
|---|---|---|---|---|---|---|---|---|---|
| RX Vega 64 | $400 | 4096 | 64 | 1247 MHz | 1546 MHz | 953 MHz | Vega 10 | 12500M | 8 GB, HBM2, 2048-bit |
| RX 5700 XT | $370 | 2560 | 64 | 1605 MHz | 1755 MHz | 1750 MHz | Navi 10 | 10300M | 8 GB, GDDR6, 256-bit |
| RTX 2070 | $340 | 2304 | 64 | 1410 MHz | 1620 MHz | 1750 MHz | TU106 | 10800M | 8 GB, GDDR6, 256-bit |
| RTX 3060 | $600 | 3584 | 48 | 1320 MHz | 1777 MHz | 1875 MHz | GA106 | 13250M | 12 GB, GDDR6, 192-bit |
| RTX 2070 Super | $450 | 2560 | 64 | 1605 MHz | 1770 MHz | 1750 MHz | TU104 | 13600M | 8 GB, GDDR6, 256-bit |
| Radeon VII | $680 | 3840 | 64 | 1400 MHz | 1800 MHz | 1000 MHz | Vega 20 | 13230M | 16 GB, HBM2, 4096-bit |
| RTX 2080 | $600 | 2944 | 64 | 1515 MHz | 1710 MHz | 1750 MHz | TU104 | 13600M | 8 GB, GDDR6, 256-bit |
| RTX 2080 Super | $690 | 3072 | 64 | 1650 MHz | 1815 MHz | 1940 MHz | TU104 | 13600M | 8 GB, GDDR6, 256-bit |
| RTX 3060 Ti | $700 | 4864 | 80 | 1410 MHz | 1665 MHz | 1750 MHz | GA104 | 17400M | 8 GB, GDDR6, 256-bit |
| RX 6700 XT | $700 | 2560 | 64 | 2424 MHz | 2581 MHz | 2000 MHz | Navi 22 | 17200M | 12 GB, GDDR6, 192-bit |
| ASUS RX 6700 XT | $830 | 2560 | 64 | 2424 MHz | 2629 MHz | 2000 MHz | Navi 22 | 17200M | 12 GB, GDDR6, 192-bit |
| RTX 2080 Ti | $1000 | 4352 | 88 | 1350 MHz | 1545 MHz | 1750 MHz | TU102 | 18600M | 11 GB, GDDR6, 352-bit |
| RTX 3070 | $800 | 5888 | 96 | 1500 MHz | 1725 MHz | 1750 MHz | GA104 | 17400M | 8 GB, GDDR6, 256-bit |
| RX 6800 | $1000 | 3840 | 96 | 1815 MHz | 2105 MHz | 2000 MHz | Navi 21 | 26800M | 16 GB, GDDR6, 256-bit |
| RX 6800 XT | $1300 | 4608 | 128 | 2015 MHz | 2250 MHz | 2000 MHz | Navi 21 | 26800M | 16 GB, GDDR6, 256-bit |
| RTX 3080 | $1300 | 8704 | 96 | 1440 MHz | 1710 MHz | 1188 MHz | GA102 | 28000M | 10 GB, GDDR6X, 320-bit |
| RX 6900 XT | $1500 | 5120 | 128 | 2015 MHz | 2250 MHz | 2000 MHz | Navi 21 | 26800M | 16 GB, GDDR6, 256-bit |
| RTX 3090 | $2000 | 10496 | 112 | 1395 MHz | 1695 MHz | 1219 MHz | GA102 | 28000M | 24 GB, GDDR6X, 384-bit |
For AMD, a lot is riding on the success of the new RDNA2 graphics architecture, as it powers not just the Radeon RX 6000 series graphics cards but also the GPUs inside next-generation game consoles designed for 4K Ultra HD gaming with raytracing, which is a really tall engineering goal. AMD was first to market with a 7 nm GPU more than 15 months ago, using the original RDNA architecture and Navi. The company hasn't changed its process node, but has implemented a host of new technologies after gaining experience with the node. The Radeon RX 6700 XT is powered by AMD's new Navi 22 silicon, built on the same TSMC 7 nm fabrication node as Big Navi. The chip measures 336 mm² and crams in 17.2 billion transistors, putting it in the same league as NVIDIA's 8 nm GA104 silicon that powers the RTX 3070. The die talks to the outside world with a 192-bit wide GDDR6 memory interface, a PCI-Express 4.0 x16 host interface, and display I/O that's good for multiple 4K or 8K displays thanks to DSC (Display Stream Compression).
New design methodologies and component-level optimization throughout the silicon, along with new power-management features, allowed AMD to achieve two breakthroughs that enabled double the compute unit count over the previous generation on the larger Navi 21 while staying within a reasonable power envelope: the company managed to halve the power draw per CU, and it added a 30% increase in engine clocks. Both can be redeemed for performance gain per CU.
The RDNA2 compute unit is where the bulk of the magic happens. Arranged in groups of two called Dual Compute Units, which share instruction and data caches, the RDNA2 compute unit still packs 64 stream processors (128 per Dual CU) and has been optimized for increased frequencies, new kinds of math precision, new hardware that enables the Sampler Feedback feature, and the all-important Ray Accelerator, a fixed-function hardware component that calculates up to one triangle or four box ray intersections per clock cycle. AMD claims the Ray Accelerator makes intersection performance up to ten times faster than if it were executed with compute shaders. AMD also redesigned the render backends of the GPU from the ground up to enable features such as Variable Rate Shading (both tier 1 and tier 2). At 64, the ROP count remains the same as for the previous-generation Navi 10.
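To make the "box ray intersection" part concrete, here is a minimal software sketch of a ray versus axis-aligned box test using the textbook slab method. This is not AMD's hardware implementation, which the article does not describe; the function name `ray_hits_box` and the tuple-based vectors are assumptions chosen purely for illustration.

```python
import math


def ray_hits_box(origin, direction, box_min, box_max):
    """Return True if a ray hits an axis-aligned box (slab method).

    All arguments are 3-tuples of floats. The ray starts at `origin`
    and travels along `direction`; hits behind the origin do not count.
    """
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must already
            # lie between them, otherwise it can never enter the box.
            if o < lo or o > hi:
                return False
            continue
        # Distances along the ray to the two planes of this slab.
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside every slab.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


unit_box = ((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
hit = ray_hits_box((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), *unit_box)
miss = ray_hits_box((0.0, 5.0, -5.0), (0.0, 0.0, 1.0), *unit_box)
print(hit, miss)  # True False
```

A raytracer runs a test like this at every node of its bounding-volume hierarchy, which is why moving it into fixed-function hardware frees up the compute shaders for everything else.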
Overall, the Navi 22 silicon has essentially the same component hierarchy as Navi 10. The Infinity Fabric interconnect is the link that binds all the components together. At the outermost level, you have the chip's 192-bit GDDR6 memory controllers, a PCI-Express 4.0 x16 host interface, and the multimedia and display engines which have been substantially updated from RDNA. A notch inside is the chip's 96-megabyte Infinity Cache, which we detail below. This cache is the town square for the GPU's high-speed 4 MB L2 caches and the graphics command processor, which dispatches the workload among two shader engines. Each of these shader engines packs 10 RDNA2 Dual Compute Units (or 20 CUs) along with the updated render backends and L1 cache. Combined, the silicon has 2,560 stream processors across 40 CUs, 40 Ray Accelerators (1 per CU), 160 TMUs, and 64 ROPs. In every sense except the memory, the Navi 22 is half a Navi 21.
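The unit counts above follow directly from the hierarchy, as this small sketch shows. The constant names are made up for illustration, and the four-TMUs-per-CU ratio is inferred from the quoted totals (160 TMUs across 40 CUs) rather than stated in the text.

```python
# Hierarchy figures quoted in the text.
SHADER_ENGINES = 2
DUAL_CUS_PER_SHADER_ENGINE = 10
CUS_PER_DUAL_CU = 2
STREAM_PROCESSORS_PER_CU = 64
RAY_ACCELERATORS_PER_CU = 1
# Assumption: inferred from 160 TMUs / 40 CUs, not stated directly.
TMUS_PER_CU = 4


def navi22_totals():
    """Derive chip-wide unit counts from the per-level figures."""
    cus = SHADER_ENGINES * DUAL_CUS_PER_SHADER_ENGINE * CUS_PER_DUAL_CU
    return {
        "compute_units": cus,
        "stream_processors": cus * STREAM_PROCESSORS_PER_CU,
        "ray_accelerators": cus * RAY_ACCELERATORS_PER_CU,
        "tmus": cus * TMUS_PER_CU,
    }


totals = navi22_totals()
print(totals)
# {'compute_units': 40, 'stream_processors': 2560, 'ray_accelerators': 40, 'tmus': 160}
```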
The Radeon RX 6700 XT maxes out the Navi 22 silicon by enabling all 40 RDNA2 compute units. The card comes with 12 GB of GDDR6 memory running at 16 Gbps (GDDR6-effective) across the chip's 192-bit wide memory interface, which works out to 384 GB/s of memory bandwidth. The Infinity Cache runs at the highest possible 1.5 TB/s data-rate, while AMD claims that the engine clock can spike well above 2.50 GHz, with a 2.42 GHz "game clock."
Infinity Cache, or How AMD is Blunting NVIDIA's G6X Advantage
Despite its lofty design goals and a generational doubling in memory size to 12 GB, the RX 6700 XT has a rather unimpressive memory setup compared to NVIDIA's RTX 3070, or even AMD's own previous-generation RX 5700 XT, at least on paper: just a 192-bit bus width and JEDEC-standard 16 Gbps GDDR6, which works out to 384 GB/s of raw bandwidth. Competing NVIDIA cards use 14 Gbps memory, but over a wider 256-bit memory interface. Memory compression secret sauce can at best increase effective bandwidth by a high single-digit percentage.
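The raw-bandwidth figures are simple arithmetic: per-pin data rate times bus width, divided by eight bits per byte. A quick sketch, with `raw_bandwidth_gbs` being a helper name invented for this example:

```python
def raw_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Raw memory bandwidth in GB/s from per-pin data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8


rx_6700_xt = raw_bandwidth_gbs(16, 192)      # 16 Gbps GDDR6 on a 192-bit bus
wider_256_bit = raw_bandwidth_gbs(14, 256)   # 14 Gbps GDDR6 on a 256-bit bus

print(rx_6700_xt)     # 384.0
print(wider_256_bit)  # 448.0
```

On raw numbers alone, the 256-bit cards with slower 14 Gbps memory still come out ahead, which is the gap Infinity Cache is meant to close.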
AMD took a frugal approach to this problem, not wanting to invest in expensive HBM+interposer based solutions, which would have thrown overall production costs way off balance. AMD looked at how their Zen processor team leveraged large last-level caches on EPYC processors to significantly improve performance and carried the idea over to the GPU. A large chunk of the Navi 22 silicon die area now holds what AMD calls the "Infinity Cache," which is really just a new L3 cache that is 96 MB in size and talks to the GPU's two shader engines over a 1024-bit interface. This cache has an impressive bandwidth of 1.5 TB/s and can be used as a victim cache by the 4 MB L2 caches of the two shader engines.
The physical medium of Infinity Cache is the same class of SRAM as the L3 cache on Zen processors. It offers four times the density of the 4 MB L2 caches and lower bandwidth in comparison, but roughly four times the bandwidth of GDDR6. It also significantly reduces energy consumption, cutting the energy the GPU needs to fetch a byte of data by a sixth compared to fetching it from GDDR6 memory. I'm sure the questions on your mind are what difference 96 MB makes and why no one has done this earlier.
To answer the first question: even at just 96 MB, spread across two slabs of 48 MB each, Infinity Cache takes up a large amount of the Navi 22 die area, and AMD's data has shown that many of the small workloads involved in raytracing and raster operations are bandwidth-intensive rather than memory-size-intensive. Having a fast 96 MB victim cache that runs at extremely low latencies compared to DRAM helps. As for why AMD didn't do this earlier, it's only now that circumstances have aligned so that the company can afford to spend die area on a fast 96 MB victim cache instead of just cramming in more CUs, reaching comparable levels of performance at lower power consumption. Infinity Cache is a storage device rather than a logic device, so spending die area on it instead of on more CUs does result in power savings.
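For a rough feel of how much a cache like this can help, here is a back-of-the-envelope sketch. It assumes a deliberately simple bottleneck model in which a fraction of memory traffic (the hit rate) is served by Infinity Cache at 1.5 TB/s and the rest by GDDR6 at 384 GB/s, with both working in parallel. The model, the function name, and the hit rates are all illustrative assumptions; AMD's actual hit rates are not given here and vary by game and resolution.

```python
INFINITY_CACHE_GBS = 1500.0  # 1.5 TB/s, as quoted in the text
GDDR6_GBS = 384.0            # raw GDDR6 bandwidth, as quoted in the text


def sustainable_bandwidth_gbs(hit_rate):
    """Total request bandwidth before either the cache or DRAM saturates.

    Assumed model: `hit_rate` of all traffic goes to the cache and the
    remainder goes to DRAM, both served in parallel. The first one to
    saturate sets the limit.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    cache_limit = INFINITY_CACHE_GBS / hit_rate if hit_rate > 0 else float("inf")
    dram_limit = GDDR6_GBS / (1.0 - hit_rate) if hit_rate < 1 else float("inf")
    return min(cache_limit, dram_limit)


# Hypothetical hit rates, chosen only to show the trend.
for rate in (0.0, 0.5, 0.75):
    print(rate, sustainable_bandwidth_gbs(rate))
# 0.0 384.0
# 0.5 768.0
# 0.75 1536.0
```

Under these assumptions, even a 50% hit rate doubles the traffic the memory subsystem can sustain, because only half of all requests ever reach the 192-bit bus.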
The ASUS Radeon RX 6700 XT STRIX OC shares the same impressive looks and cooler shroud with other STRIX cards from the GeForce 30 and Radeon RX 6000 Series. I really like the new look ASUS created; it's clean, yet powerful with its metal highlights on the front and back of the card.
Dimensions of the card are 32 x 14 cm, and it weighs 1664 g.
Installation requires three slots in your system.
Display connectivity includes three standard DisplayPort 1.4 and one HDMI 2.1.
The card has two 8-pin power inputs. This configuration is rated for up to 375 W of power draw.
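The 375 W figure is the sum of the standard PCI-Express limits: 150 W per 8-pin connector plus 75 W from the slot itself. A minimal sketch, with the helper name `rated_power_w` being an assumption of this example:

```python
# Standard PCI-Express power-delivery limits in watts.
CONNECTOR_LIMIT_W = {"6-pin": 75, "8-pin": 150}
PCIE_SLOT_LIMIT_W = 75


def rated_power_w(connectors):
    """Rated board power for the slot plus the given power connectors."""
    return PCIE_SLOT_LIMIT_W + sum(CONNECTOR_LIMIT_W[c] for c in connectors)


strix_rating = rated_power_w(["8-pin", "8-pin"])
print(strix_rating)  # 375
```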
Two fan headers near the back of the card can be used to connect case fans to the graphics card. These fans will now run in sync with the graphics card fans—stopped when idle and at increasing speed depending on the GPU temperature. Since the graphics card is the primary heat source in most computers, this makes a lot of sense and helps keep noise levels down.
The AMD Radeon RX 6000 series doesn't support multi-GPU. Instead, ASUS has placed their dual BIOS switch in this area. The "P"erformance BIOS is the default, and the "Q"uiet BIOS will run a more relaxed fan curve to reduce noise levels.