2nd Generation Vega Architecture
It may appear as though the underlying architecture of Radeon VII is exactly the same as that of the RX Vega 64 from 2017, codenamed "Vega," but that's not entirely true. Radeon VII is based on what AMD refers to as "Enhanced 2nd Generation Vega architecture."
What's new are optimizations that raise frequency by tapping into the engine-clock headroom gained from the switch to 7 nm, reduced latencies across the silicon (think caches, memory controllers, buffers, etc.), increased bandwidth to the 64 ROPs, and notable changes to the "Vega" NGCU (next generation compute unit) that include additional integer and floating-point accumulators, which probably increase IPC by a few percent.
AMD has also doubled down on the chip's power-management features, beginning with something it calls "Enhanced Thermal Monitoring." The company has sprinkled twice as many temperature sensors across the GPU die. Clock-speed control is now based on junction temperature, which draws on data from this larger network of sensors and lets the GPU control its frequency and voltages more accurately, which in turn helps it sustain boost frequencies for longer. This also means the GPU can throttle itself more accurately to maintain reliability of the silicon. AMD presented its own tests showing that throttling based on junction temperature rather than edge temperature results in a two percent performance uplift because boost frequencies are sustained better.
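To illustrate why a denser sensor network can leave more room for boost, here is a toy sketch. It is not AMD's algorithm: it assumes junction temperature is simply the hottest reading across the die, and that a controller relying on a single edge sensor must add a fixed guard band to account for hotspots it cannot see. The limit, the guard band, and all readings are hypothetical numbers chosen for illustration.

```python
def junction_temperature(sensor_readings_c):
    """Hottest reading across the sensor network (assumed definition)."""
    return max(sensor_readings_c)


def thermal_headroom(estimated_hotspot_c, limit_c):
    """Degrees left before the controller must throttle (never negative)."""
    return max(0.0, limit_c - estimated_hotspot_c)


# All values below are hypothetical, not AMD specifications.
LIMIT_C = 105.0            # temperature at which throttling starts
EDGE_GUARD_BAND_C = 25.0   # worst-case hotspot offset assumed by an edge-only controller
edge_reading_c = 75.0      # the single edge sensor
die_readings_c = [75.0, 83.5, 88.0, 92.0]  # readings from the larger sensor network

# Edge-based control has to guess the hotspot; junction-based control measures it.
edge_headroom = thermal_headroom(edge_reading_c + EDGE_GUARD_BAND_C, LIMIT_C)
junction_headroom = thermal_headroom(junction_temperature(die_readings_c), LIMIT_C)

print(f"edge-based headroom:     {edge_headroom:.1f} C")
print(f"junction-based headroom: {junction_headroom:.1f} C")
```

In this made-up scenario the measured hotspot is cooler than the worst case the edge-only controller has to assume, so the junction-based controller sees more headroom and could hold boost clocks longer.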
The most obvious improvement, though, is the memory interface, which is now 4096 bits wide, twice as wide as that of "Vega 10," which doubles the memory bandwidth at a given memory clock. AMD also chose to equip the Radeon VII with 16 GB of HBM2 memory, double the amount of the previous generation.
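The link between bus width and bandwidth can be sketched with the usual peak-bandwidth formula: bus width times per-pin data rate, divided by eight to convert bits to bytes. The 2.0 Gbps per-pin rate below is an assumption for illustration and is applied to both chips so that only the bus width differs; the function name is ours as well.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_width_bits * data_rate_gbps_per_pin / 8


# Assumed per-pin data rate, identical for both chips (not taken from the article).
ASSUMED_DATA_RATE_GBPS = 2.0

vega10_bw = peak_bandwidth_gb_s(2048, ASSUMED_DATA_RATE_GBPS)  # half-width bus
vega20_bw = peak_bandwidth_gb_s(4096, ASSUMED_DATA_RATE_GBPS)  # 4096-bit bus

print(f"2048-bit: {vega10_bw:.0f} GB/s")
print(f"4096-bit: {vega20_bw:.0f} GB/s")
print(f"ratio:    {vega20_bw / vega10_bw:.1f}x")
```

With the data rate held constant, the ratio is exactly two. Any difference in memory clock between the two cards would shift the real-world figure accordingly.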
Much like the "Vega 10" that powers the RX Vega 64, the "Vega 20" GPU at the heart of the Radeon VII is a multi-chip module (MCM), a combination of the 7 nm GPU die, four 10 nm-class HBM2 memory stacks supplied by either SK Hynix or Samsung, and a silicon interposer on which the GPU and HBM2 stacks sit. The interposer enables high-density microscopic wiring between the GPU and memory stacks, while TSVs (through-silicon vias) connect the GPU and memory stacks to the fiberglass package substrate underneath. The switch to 7 nm has reduced the die size of the GPU from 495 mm² on the 14 nm "Vega 10" to 331 mm². That isn't half the area, but one should realize that "Vega 20" isn't a simple optical shrink, as there are numerous physical changes to the die, which we described above.
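For a sense of scale, the die-size figures quoted above work out as follows. This is plain arithmetic on the two numbers in the text; the variable names are ours.

```python
# Die sizes as quoted in the text, in square millimetres.
VEGA10_DIE_MM2 = 495.0  # 14 nm "Vega 10"
VEGA20_DIE_MM2 = 331.0  # 7 nm "Vega 20"

area_ratio = VEGA20_DIE_MM2 / VEGA10_DIE_MM2
reduction_pct = (1.0 - area_ratio) * 100.0

print(f"Vega 20 is {area_ratio:.3f}x the area of Vega 10")
print(f"Die area reduced by {reduction_pct:.1f}%")
```

The new die comes out at roughly two thirds the area of the old one, a reduction of about a third rather than a half.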
Barring small changes to the NGCUs, such as lower-latency caches and additional accumulators for both the integer and floating-point sides, the number-crunching resources of "Vega 20" and the chip's hierarchy are essentially the same. The GPU physically features 64 NGCUs, although only 60 of them are enabled on the Radeon VII. This is probably done to improve yields by harvesting dies on which some units are defective. These 60 NGCUs amount to 3,840 stream processors and 240 TMUs. The ROP count is unchanged at 64, although AMD has increased the bandwidth of these ROPs. The memory bus width has doubled to 4096-bit, and the memory amount has doubled to 16 GB.
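The shader and texture-unit counts follow directly from the number of enabled NGCUs. The per-NGCU figures in this sketch (64 stream processors and 4 TMUs) are derived from the totals above, 3,840 / 60 and 240 / 60; the function name is ours.

```python
STREAM_PROCESSORS_PER_NGCU = 64  # derived: 3,840 stream processors / 60 NGCUs
TMUS_PER_NGCU = 4                # derived: 240 TMUs / 60 NGCUs


def shader_resources(active_ngcus: int) -> tuple[int, int]:
    """Return (stream processors, TMUs) for a given number of enabled NGCUs."""
    return (
        active_ngcus * STREAM_PROCESSORS_PER_NGCU,
        active_ngcus * TMUS_PER_NGCU,
    )


radeon_vii = shader_resources(60)    # as shipped, 60 NGCUs enabled
full_vega20 = shader_resources(64)   # all 64 physical NGCUs enabled

print(f"Radeon VII (60 NGCUs): {radeon_vii[0]} SPs, {radeon_vii[1]} TMUs")
print(f"Full die   (64 NGCUs): {full_vega20[0]} SPs, {full_vega20[1]} TMUs")
```

The four disabled NGCUs therefore account for 256 stream processors and 16 TMUs that are physically present on the die but unused on the Radeon VII.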