
NVIDIA RTX 4090 Doesn't Max-Out AD102, Ample Room Left for Future RTX 4090 Ti

btarunr

Editor & Senior Moderator
The AD102 silicon on which NVIDIA's new flagship graphics card, the GeForce RTX 4090, is based is a marvel of semiconductor engineering. Built on the 4 nm EUV (TSMC 4N) silicon fabrication process, the chip has a gargantuan transistor count of 76.3 billion, a nearly 170% increase over the previous GA102, on a 608 mm² die that is in fact smaller than the GA102's 628 mm². This is thanks to TSMC 4N offering nearly thrice the transistor density of the Samsung 8LPP node on which the GA102 is built.
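A quick back-of-the-envelope check of those density figures, using the published transistor counts and die areas (GA102's 28.3 billion transistors is NVIDIA's official count):

```python
# Sanity-check the density claims from published figures.
ad102_density = 76.3e9 / 608   # transistors per mm^2 on TSMC 4N
ga102_density = 28.3e9 / 628   # transistors per mm^2 on Samsung 8LPP

print(f"AD102: {ad102_density / 1e6:.1f} MTr/mm^2")            # ~125.5
print(f"GA102: {ga102_density / 1e6:.1f} MTr/mm^2")            # ~45.1
print(f"Density ratio: {ad102_density / ga102_density:.2f}x")  # ~2.78x, 'nearly thrice'
print(f"Transistor increase: {76.3 / 28.3 - 1:.0%}")           # ~170%
```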

The AD102 physically features 18,432 CUDA cores across 144 streaming multiprocessors (SMs), 576 fourth-generation Tensor cores, and 144 third-generation RT cores. The SMs come with special components that enable the Shader Execution Reordering optimization, which has a significant impact on both raster and ray-traced rendering performance. The silicon supports up to 24 GB of GDDR6X, or up to 48 GB of GDDR6 with ECC (the latter will be seen in the RTX Ada professional-visualization card), across a 384-bit wide memory bus. There are 576 TMUs, and a mammoth 192 ROPs on the silicon.
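All of those unit counts fall out of the SM configuration; a minimal sketch, assuming the commonly cited Ada per-SM layout (128 CUDA cores, 4 Tensor cores, 4 TMUs, and 1 RT core per SM):

```python
# Full AD102 unit counts, derived from its 144-SM configuration.
SM_COUNT = 144
cuda_cores   = SM_COUNT * 128  # 18,432 CUDA cores
tensor_cores = SM_COUNT * 4    # 576 fourth-gen Tensor cores
tmus         = SM_COUNT * 4    # 576 TMUs
rt_cores     = SM_COUNT * 1    # 144 third-gen RT cores

print(cuda_cores, tensor_cores, tmus, rt_cores)  # 18432 576 576 144
```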



The RTX 4090 is carved out of this silicon by enabling 16,384 of the 18,432 CUDA cores (128 of 144 SMs), 512 of 576 Tensor cores, 512 of 576 TMUs, and 128 of 144 RT cores; unless NVIDIA has touched the ROP count, it could remain at 192. The memory bus is maxed out, with 24 GB of 21 Gbps GDDR6X memory across the full 384-bit bus width. In creating the RTX 4090, NVIDIA has given itself roughly 11% headroom in the number-crunching machinery (2,048 of 18,432 CUDA cores disabled), from which to carve out future SKUs such as a possible RTX 4090 Ti. Until that SKU is needed in the product stack, NVIDIA will use this margin toward harvesting the AD102 silicon.
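Putting numbers on that headroom, and on the memory bandwidth that follows from the 21 Gbps and 384-bit figures above:

```python
# Enabled fraction of AD102 in the RTX 4090, and the headroom left over.
full_sms, enabled_sms = 144, 128
enabled = enabled_sms / full_sms
print(f"SMs enabled: {enabled:.1%}")      # 88.9%
print(f"Headroom:    {1 - enabled:.1%}")  # ~11.1% left for a possible 4090 Ti

# The memory subsystem, by contrast, is already maxed out:
bandwidth_gb_s = 21 * 384 / 8             # 21 Gbps per pin x 384-bit bus
print(f"Bandwidth: {bandwidth_gb_s:.0f} GB/s")  # 1008 GB/s
```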

 
'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
 
RTX 4090 doesn't max out its TBP, ample room left for a future 800 W power limit.
Good job, nVIDIA.
 
Is anyone surprised by this? At one point the Ti model was the max-core GPU. Then it shifted to the Titan model. Now the Titan is gone, and the Ti is once again the highest core count.
 
I would want the RTX 4000 series to deliver RTX 3000 series performance while consuming 1/2 to 1/3 the power.
 
1000W just isn't enough

[SpongeBob SquarePants GIF, season 5 episode 10]
 
'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
Agreed

RTX 4090 doesn't max out its TBP, ample room left for a future 800 W power limit.
Good job, nVIDIA.
Damn dude - EVGA made a liquid-cooled 3090 Ti, then they didn't have any room left for a future 4090 Ti :D - that's why they said bye-bye to NVIDIA
 
I feel that even a fully enabled chip is not going to result in a significant improvement over a slightly gimped one; increasing the number of cores runs into diminishing returns. And knowing NVIDIA, they will likely raise the power limit in their mid-cycle refresh, plus a further price increase.

'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
I am not sure it's a good node, i.e. Samsung 8 nm, which is essentially 10 nm. Compared to AMD, NVIDIA seems to be doing well despite a node disadvantage. But frankly, this may be attributed to a better architecture than RDNA2. Considering the huge jump in specs and clock speeds on TSMC 4 nm (5 nm), I feel the Samsung node actually was holding Ampere's performance back.
 
I feel the Samsung node actually was holding Ampere's performance back.
Of course, every half-wit knows this, but there is a strong following of GPU owners adamant that Samsung's nodes 'are not bad at all'. The latest argument in that vein was pushing the blame onto GDDR6X for the monumental power consumption, never mind the fact that clocks were lower than in 2016 on... TSMC... 16 nm ;)

And here we are, seeing the same GDDR6X on TSMC, with more memory, alongside a smaller die with many more transistors, at a relatively small increase in power budget :)
 
Try to look surprised...
 
Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think that Samsung 8 nm was not that bad, and that NVIDIA just can't produce an efficient card.
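The arithmetic behind that point, assuming performance parity as the comment states (the wattages are the official TBPs):

```python
# If the 4080 12 GB merely matches the 3090 in rasterization,
# the efficiency gain reduces to the ratio of the official TBPs.
tbp_3090, tbp_4080_12gb = 350, 285  # watts
gain = tbp_3090 / tbp_4080_12gb - 1
print(f"Perf/W gain at performance parity: {gain:.0%}")  # ~23%
```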
 
Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think that Samsung 8 nm was not that bad, and that NVIDIA just can't produce an efficient card.
I believe greed is the key word here, as in 'ngreedia', as we've seen nvidiot_central being called in the past. Let's see the reviews...
 
Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think that Samsung 8 nm was not that bad, and that NVIDIA just can't produce an efficient card.
Maybe the node was not bad at all. The proper question is: is the architecture any good? Maybe what was not so great was the architecture itself, and the node change is not going to change that either.
 
The 3090 didn't max out GA102 either, and we didn't get a 3090 Ti using a bigger die.
 
The 3090 didn't max out GA102 either, and we didn't get a 3090 Ti using a bigger die.
It is not about a bigger die, since that would mean a different chip. The 3090 has 2 SM units disabled: 82 SMs vs. 84 for the 3090 Ti (a ~2.4% difference). The die is the same; the Ti simply has more resources enabled.
 
I meant to post this comment here:

There seems to be some ambiguity around the ROP count. Is there an official number yet?
 
Releasing the 4090 Ti a year from now, mere months before a 5xxx-series launch, is just... I don't know. I'd never buy an xx90 Ti if it doesn't come out near the launch day of the current gen.
 
Releasing the 4090 Ti a year from now, mere months before a 5xxx-series launch, is just... I don't know. I'd never buy an xx90 Ti if it doesn't come out near the launch day of the current gen.
Don't get bogged down in model letters and numbers. Currently, Ti cards are the year-out refresh that graphics card manufacturers have been doing for years. In the past, different model letters and numbers have been used, such as Super and xx50 XT.

Refreshes require more mature manufacturing nodes and a build-up of harvested dies that yield more or less working silicon to be activated. It's a way to sell as many chips as possible given the reality of defects and poor yields near the beginning of a new product series (the toy simulation after this post illustrates the arithmetic).

As an aside, it's also easier to make one complete chip and then lock otherwise-functioning parts to create lower SKUs. This only works up to a point: when the 'dead' or 'locked' silicon exceeds the working portion of the chip, you manufacture a smaller 'native' chip instead.

Edit: Oh, and sometimes later SKU refreshes are just added in response to competitors' product releases. A company might even save such responses from the beginning on purpose, to see how the competition reacts.
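A toy Monte Carlo sketch of that harvesting arithmetic; the 1% per-SM defect chance is an assumed, purely illustrative number, not foundry data:

```python
# Toy model of die harvesting on a 144-SM chip: scatter random defects,
# count dies that are perfect vs. salvageable at the 4090's 128-SM config.
import random

SMS_FULL, SMS_4090 = 144, 128
P_SM_DEFECT = 0.01            # assumed chance any given SM is defective
TRIALS = 100_000

perfect = salvageable = 0
for _ in range(TRIALS):
    good = sum(random.random() > P_SM_DEFECT for _ in range(SMS_FULL))
    if good == SMS_FULL:
        perfect += 1          # could ship as a full-die SKU (e.g. a 4090 Ti)
    elif good >= SMS_4090:
        salvageable += 1      # still sells as an RTX 4090

print(f"Perfect dies:     {perfect / TRIALS:.1%}")
print(f"4090-harvestable: {salvageable / TRIALS:.1%}")
```

With that assumed rate, only about a quarter of dies come out perfect, but roughly three quarters are still sellable as a 4090, which is the whole point of harvesting.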
 
It's 5nm "marketed" as "4nm". The people writing articles should be filtering the BS, not echoing it.
 
They need another card to milk us. The worst part is we used to have Titans that at least brought professional things to the mainstream.
 
If performance is 10-20 percent higher than anything that AMD has, then it will be branded as a Titan GPU. The reason the 30 series didn't have one is partially due to how close AMD was in performance. They will not risk the headline, "Titan loses"
 
Datacenters will get all the fully functional dies, gamers get the broken scraps.
 
Yields on something that big must be horrible.
 
but there is a strong following of GPU owners adamant that Samsung's nodes 'are not bad at all'.
They weren't bad at all...

...on launch day.

This is how tech advancement works.
 