The density of different blocks on a chip varies a lot. So TSMC's 7nm is 91 MTr/mm² but only got 51 MTr/mm² on the RX 6900 XT. So TSMC's scaling of SRAM and logic is less than advertised?
Great insights, thank you. As you well understood, my post was deliberately ignoring Nvidia there.
There is a reddit post where detailed Turing die shots were analyzed. What he came up with seems to be correct enough. Tensor cores and FP16 capabilities may be more nuanced, but RT Cores are distinguishable and straightforward. RT Cores make up about 6% of a TPC and about 3% of total die size. The increase for Tensor cores and/or FP16 capability concurrent with FP32 has more/most of its uses outside RT, same for cache. Implementation for AMD and Intel should not be much different in terms of transistors and area cost, possibly less.
I wish there were good/readable enough die shots for RDNA2 and Ampere, but apparently there are none so far. We would also need comparisons against dies without RT, and in the case of RDNA2, where RT capability is part of some other block (TMU?), it is probably impossible to make out.
3090 is on Samsung 8N, 6900XT is on TSMC N7:
- 3090 die is 28.3B transistors on 628 mm² - 45 MTr/mm²
- 6900XT die is 26.8B transistors on 520 mm² - 51 MTr/mm²
This highlights the differences in manufacturing processes more than anything.
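For anyone who wants to check the density figures, here is a quick sketch in Python. It only uses the transistor counts and die areas quoted above; the function name `density_mtr_per_mm2` is just something I made up for illustration.

```python
# Transistor counts and die areas as quoted in the post.
dies = {
    "RTX 3090 (Samsung 8N)": (28.3e9, 628.0),
    "RX 6900 XT (TSMC N7)": (26.8e9, 520.0),
}


def density_mtr_per_mm2(transistors, area_mm2):
    """Return density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


for name, (transistors, area_mm2) in dies.items():
    density = density_mtr_per_mm2(transistors, area_mm2)
    print(f"{name}: {density:.1f} MTr/mm^2")
```

These are whole-die averages, so they say nothing about how dense the individual blocks (SRAM, logic, analog/IO) are.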
In terms of the transistor/area cost of the latest improvements, RDNA2 has a huge number of transistors in Infinity Cache (at least 6.4B plus some control logic, which is about 24% of total transistors), and Ampere no doubt has a lot of transistors in the doubled ALUs in its shaders.
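A rough sketch of where a figure like 6.4B can come from. The assumptions here are mine and not from any AMD documentation: 128 MiB of Infinity Cache, a plain 6-transistor SRAM cell per bit, and no tags, ECC, or control logic counted.

```python
# Assumptions (labeled, not from the post): cache size and 6T SRAM cells.
CACHE_MIB = 128            # Infinity Cache capacity on the 6900 XT die
TRANSISTORS_PER_BIT = 6    # assumed classic 6T SRAM cell, data array only
TOTAL_TRANSISTORS = 26.8e9 # whole-die count quoted in the post

cache_bits = CACHE_MIB * 1024 * 1024 * 8
cache_transistors = cache_bits * TRANSISTORS_PER_BIT
share = cache_transistors / TOTAL_TRANSISTORS

print(f"Data array: {cache_transistors / 1e9:.2f}B transistors")
print(f"Share of die: {share:.1%}")
```

That is a lower bound for the data array alone; the real block will be somewhat larger once control logic is included.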
More cache has been the go-to improvement for a few generations before RDNA and Turing. More likely than not adding more and more cache (at different levels) would happen with or without RT.
Assuming a transistor density similar to the 6900XT, a 3090 die on N7 would be roughly 5.6% larger than the 6900XT die, or about 30 mm² more (around 550 mm² in total).
That assumption is obviously suspect though. Without Infinity Cache the 6900XT die would be noticeably less dense. On the other hand, there is the A100 on TSMC's N7 with 54.2B transistors on 826 mm², which puts its density at 65.6 MTr/mm².
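The back-of-the-envelope scaling above can be sketched like this. It just divides the 3090's transistor count by the 6900XT's average density, which is exactly the suspect assumption already mentioned; `area_at_density` is a hypothetical helper name.

```python
def area_at_density(transistors, density_mtr_per_mm2):
    """Die area in mm^2 for a transistor count at a given average density."""
    return transistors / (density_mtr_per_mm2 * 1e6)


# Figures quoted in the post.
NAVI21_TRANSISTORS, NAVI21_AREA = 26.8e9, 520.0
GA102_TRANSISTORS = 28.3e9
A100_TRANSISTORS, A100_AREA = 54.2e9, 826.0

navi21_density = NAVI21_TRANSISTORS / NAVI21_AREA / 1e6
a100_density = A100_TRANSISTORS / A100_AREA / 1e6

# Hypothetical 3090 die at the 6900XT's average density.
ga102_on_n7 = area_at_density(GA102_TRANSISTORS, navi21_density)
extra_area = ga102_on_n7 - NAVI21_AREA

print(f"3090 at 6900XT density: {ga102_on_n7:.0f} mm^2")
print(f"Larger than 6900XT by: {extra_area:.0f} mm^2 ({extra_area / NAVI21_AREA:.1%})")
print(f"A100 density: {a100_density:.1f} MTr/mm^2")
```

The gap between the 6900XT and A100 densities on the same node shows how much the result depends on which reference die you pick.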