
NVIDIA GeForce RTX 5090 3DMark Performance Reveals Impressive Improvements

Absolute nonsense. People whine and moan about prices no matter what they are.
There will always be some exceptions, but people are mostly disappointed because the performance uplift is not great: 30-40% (4090 -> 5090) versus 60-70% (3090 -> 4090) last generation, whereas the price increase is 25% ($1600 -> $2000) versus ~7% last gen ($1500 -> $1600).
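For what it's worth, a quick back-of-the-envelope script (using the figures quoted in this thread, which are rough estimates rather than official benchmarks) shows why the value proposition feels worse this time:

```python
# Quick sanity check of the gen-on-gen numbers quoted above.
# MSRPs and uplift ranges are the ones from this thread, not official benchmarks.

def pct_increase(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100

# Price jumps (USD MSRP)
print(f"3090 -> 4090 price: +{pct_increase(1500, 1600):.0f}%")   # ~7%
print(f"4090 -> 5090 price: +{pct_increase(1600, 2000):.0f}%")   # 25%

# Performance-per-dollar change, taking the midpoints of the quoted uplift ranges
for label, uplift, price_old, price_new in [
    ("3090 -> 4090", 0.65, 1500, 1600),   # ~60-70% faster
    ("4090 -> 5090", 0.35, 1600, 2000),   # ~30-40% faster
]:
    perf_per_dollar = (1 + uplift) / (price_new / price_old)
    print(f"{label}: perf/$ changes by {pct_increase(1, perf_per_dollar):+.0f}%")
```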
 
You know what inflation is, yes? You know how bad it has become during this post-pandemic recovery period? I'm not trying to justify NVidia's pricing, only pointing out that the increase is much smaller than all the other prices that have gone up. For example, foodstuffs where I live have nearly doubled in the last 2.5 years. Vehicle prices have nearly doubled as well. NVidia's prices have not doubled, not even close.
 
We all know inflation happened in between, but if they had raised the price by $100 like they did on the 4090, it would have been perfectly fine; 25% is a lot for a GeForce GPU, especially for only 30-40% more performance.
Nvidia is milking pros with its AI chips and Quadro GPUs, but $2000 for a gaming GPU is a lot. I can definitely buy one, but it's not cheap, for sure. And it's the only true 4K card of the RTX 50 series; the 5080 doesn't even beat the 4090...
 
On this we agree; however, we don't set the prices. And the 5090 is the premium card of this gen, and nothing beats it or even comes close, unless the rumors of a 5090 Ti / RTX Titan are true.
Since Nvidia has no competition, I find it hard to believe we will get a 5090 Ti... unless RDNA5/UDNA gets very close to the 5090 and sells for around $1000-$1200, but I doubt it.

$400 more is too much, honestly. In 2022 we had just come out of the pandemic and there were still some chip shortages, yet the 4090 was only $100 more, even though TSMC 4N was much more expensive than Samsung 8nm, the 4090 had 52% more CUDA cores than the 3090 Ti and ~56% more than the 3090, and the L2 cache grew from 6MB to 72MB. So I think the 4090 was actually more expensive to manufacture back then than the 5090 is now, but RDNA3 was expected to be very competitive, so Nvidia had to go almost all-in. The big pluses of the 5090 are the 512-bit bus, GDDR7 memory and the 2-slot cooler, but I don't think it's more expensive to make than the 4090 was back then.
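The CUDA-core percentages above can be checked quickly (a minimal sketch using the publicly listed shader counts):

```python
# Core-count comparison behind the "52% / ~56% more CUDA cores" figures above.
cuda_cores = {"RTX 3090": 10496, "RTX 3090 Ti": 10752, "RTX 4090": 16384}

for base in ("RTX 3090 Ti", "RTX 3090"):
    gain = cuda_cores["RTX 4090"] / cuda_cores[base] - 1
    print(f"RTX 4090 vs {base}: +{gain:.0%} CUDA cores")
# RTX 4090 vs RTX 3090 Ti: +52% ; RTX 4090 vs RTX 3090: +56%
```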
 
But the chip is maxxed out, is it not? They can't clock it higher either due to power and heat... So where is this extra performance going to come from?
 
But the chip is maxxed out, is it not?
No. The 5090 die uses less than the full die's available resources. This is wafer-yield binning playing into the spec table. Few dies that come off a completed wafer are perfect, and when 75%+ have flaws you have to do something with the imperfect dies: they are sorted by functionality and quality of functionality, then grouped into product classes. The 5090 dies are near perfect, but not quite. A full GB202 die would have 24,576 CUDA cores, 192 RT cores, 768 Tensor cores, 768 TMUs, and 192 ROPs, instead of the 21,760 CUDA cores, 170 RT cores, 680 Tensor cores, 680 TMUs, and 176 ROPs the 5090 has. A 5090 Ti would have much more of the die enabled and viable.

So, to give you the short answer: it's not a maxxed-out die, and there is room for improvement so long as TSMC can refine the process enough to make more nearly perfect wafers.
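To make the binning arithmetic explicit, here is a minimal sketch (assuming the usual consumer-Blackwell per-SM ratios; ROPs are tied to raster partitions rather than SMs, so they're hard-coded from the spec tables):

```python
# Rough sketch of how unit counts scale with enabled SMs on GB202.
# Assumed per-SM ratios: 128 CUDA cores, 1 RT core, 4 Tensor cores, 4 TMUs.

def units_from_sms(sms):
    return {
        "CUDA cores": sms * 128,
        "RT cores": sms * 1,
        "Tensor cores": sms * 4,
        "TMUs": sms * 4,
    }

full_gb202 = units_from_sms(192)   # full die
rtx_5090   = units_from_sms(170)   # shipping 5090 config

for unit in full_gb202:
    print(f"{unit}: {rtx_5090[unit]} / {full_gb202[unit]}")
print("ROPs: 176 / 192")  # from the spec tables, not derived from SM count
```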
 
But the chip is maxxed out, is it not? They can't clock it higher either due to power and heat... So where is this extra performance going to come from?
They just need 2x 16-pin connectors, and a 3-slot or 4-slot cooler like 3090/4090 ! They could also go with an AIO (Water-Cooling) but I doubt they'll do that for now.

But the chip is maxxed out, is it not? They can't clock it higher either due to power and heat... So where is this extra performance going to come from?
No it has 170SM out of 192SM and lower clocks than 4090 due to the increased power consumption (575W TDP) and only 1x 16-pin connector limiting the max power draw to 600W (even though it could reach up to 675W with the PCIe slot...).
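The power ceiling mentioned here is simple arithmetic (a quick sketch of the figures in the post above):

```python
# Power-budget arithmetic: one 16-pin (12V-2x6) connector is rated for 600W
# and the PCIe x16 slot for 75W, versus the RTX 5090's 575W TDP.
connector_limit = 600   # W, single 16-pin
slot_limit      = 75    # W, PCIe slot
tdp             = 575   # W, RTX 5090

max_board_power = connector_limit + slot_limit
print(f"Theoretical board limit: {max_board_power} W")        # 675 W
print(f"Headroom over stock TDP: {max_board_power - tdp} W")  # 100 W
```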
 
But I thought the full die was only used in their Pro A.I. cards? Would nv ever sell those full dies for a quarter of the price to gamers? I can't see them doing that until the 60x0 series comes out.

They just need 2x 16-pin connectors, and a 3-slot or 4-slot cooler like 3090/4090 ! They could also go with an AIO (Water-Cooling) but I doubt they'll do that for now.


No it has 170SM out of 192SM and lower clocks than 4090 due to the increased power consumption (575W TDP) and only 1x 16-pin connector limiting the max power draw to 600W (even though it could reach up to 675W with the PCIe slot...).
I can't imagine nearly 800W of heat dissipating into a PC case just to play a game!
 
But I thought the full die was only used in their Pro A.I. cards? Would nv ever sell those full dies for a quarter of the price to gamers? I can't see them doing that until the 60x0 series comes out.


I can't imagine nearly 800W of heat dissipating into a PC case just to play a game!
AI chips are different: they also have FP64 CUDA cores, which the GeForce variants do not, and they have HBM memory too!
Each GB200 GPU has 192GB of HBM3E with a 4096-bit bus and a memory bandwidth of 8TB/s, whereas the 5090 has a GB202 chip with 32GB of GDDR7 on a 512-bit bus and a memory bandwidth of ~1.8TB/s.

On the other hand, the Quadro GPUs (made for workstations and professionals) will use the same GB202 chip as the 5090, but they will probably pack either 64GB or 96GB of GDDR7 and have 190 SMs enabled (2 disabled for better yields).
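For reference, the ~1.8TB/s figure falls straight out of bus width × data rate (a minimal sketch; the 28 Gbps GDDR7 pin speed is the commonly quoted rate and is taken as an assumption here):

```python
# Bandwidth from bus width and per-pin data rate: bits/s -> bytes/s.

def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Memory bandwidth in GB/s."""
    return bus_width_bits * data_rate_gbps / 8

# RTX 5090: 512-bit bus, GDDR7 assumed at 28 Gbps per pin
print(f"RTX 5090: {bandwidth_gb_s(512, 28):.0f} GB/s")   # 1792 GB/s ~= 1.8 TB/s
```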
 
But I thought the full die was only used in their Pro A.I. cards? Would nv ever sell those full dies for a quarter of the price to gamers? I can't see them doing that until the 60x0 series comes out.
Ada never had its full die used; the highest-end "quadro" had 2 SMs disabled.
Anyhow, if they have enough leftover chips that the enterprise market is not in a rush to buy, they could sell those to consumers at a smaller profit, but I don't think that will be the case.
The Quadro/Tesla GPUs are also AI chips, fwiw. Albeit less capable than a GB200, they're way cheaper and make for great inference devices, or even for training if your model is not that big.

I guess it gets a bit ambiguous, since the highest-end chips (those x100/x200 ones) often have a somewhat different architecture from the rest of the lineup (with FP64 and HBM, as you said, and lacking the RT cores and other graphics-relevant components), but one could also mean "full die" as in the full GB202 die.
 

Agreed, even though Quadros are usually used more by people doing a lot of 3D rendering, whereas Teslas are more about AI performance. GB200 is another world, yeah.
But I don't think we'll see a full GB202, or a nearly full one with 190 SMs (out of 192), on a consumer GPU this generation. They want to keep the best chips for workstations and AI and charge a premium for them (~$7000+ for Quadro, and GB200s are $30k-40k).
 
But I thought the full die was only used in their Pro A.I. cards?
That's a good question. I think the Pro cards have slightly different die designs but I could be wrong. I know the AI chips are different, but the "Quadro" types are still an unknown.
 
Quadro always use the Consumer variants and not the AI chips. But they usually have up to a full die (or 2 SMs disabled like on the RTX 6000 Ada), have a lot more VRAM, and can sometimes be on a better node too (the Quadro on Ampere were using TSMC 7nm whereas Consumer variants were made on Samsung 8nm).
 
That is not correct, but I'm not going to debate the subject.
The RTX 6000 Ada and RTX A6000 do not even have FP64 cores. If you're talking about the TITAN V, then sure, but it was an exception and had its own architecture (Volta).
 
Agreed, even though Quadros are usually used more by people doing a lot of 3D rendering, whereas Teslas are more about AI performance. GB200 is another world, yeah.
It's more that Quadros go into workstations (so with display outputs and cooling that's friendlier to a tower case), whereas Teslas go into servers (many are even passive).
Same chips, and often same configs, just a different form factor for different uses.
But I don't think we'll see a full GB202, or a nearly full one with 190 SMs (out of 192), on a consumer GPU this generation. They want to keep the best chips for workstations and AI and charge a premium for them (~$7000+ for Quadro, and GB200s are $30k-40k).
Given how we didn't see this happening with the AD102 either, I totally agree.
(the Quadro on Ampere were using TSMC 7nm whereas Consumer variants were made on Samsung 8nm).
No, the only Ampere chip on TSMC 7nm was the A100, which is not really comparable to the GA102 and below, all of which were on Samsung 8nm (be it quadro or tesla).
Anyhow, bar the x100 chips, yeah, the rest of the lineup shares chips across geforce/quadro/tesla.
 
It's the exact same underlying chip, but with different binning.
At least according to previous Nvidia practice, top bins go to the Pro market and the leftovers to the x90.
Some leaks point to the x90's top-bin dies becoming a 96GB Pro GPU for AI, in a clamshell design with 3GB memory modules.
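The 96GB figure checks out arithmetically (a quick sketch, assuming 32-bit-wide GDDR7 devices as on current boards):

```python
# How 96GB falls out of a 512-bit GB202 board with 3GB GDDR7 modules in clamshell.
# Each GDDR7 device sits on a 32-bit channel; clamshell puts two devices per channel.
bus_width_bits   = 512
bits_per_device  = 32
module_size_gb   = 3      # 24Gbit GDDR7 modules

devices_per_side = bus_width_bits // bits_per_device    # 16 devices
single_sided = devices_per_side * module_size_gb        # 48 GB
clamshell    = single_sided * 2                         # 96 GB
print(single_sided, clamshell)
```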
 
I'm not arguing this point to death. There ARE differences. One cannot shoehorn a GeForce driver to work with a Pro card with any INF mods, and likewise vice versa. While they use the same arch, they are not die compatible.
 
At least according to previous Nvidia practice, top bins go to the Pro market and the leftovers to the x90.
Some leaks point to the x90's top-bin dies becoming a 96GB Pro GPU for AI, in a clamshell design with 3GB memory modules.
Not really, bins go everywhere. The AD102 had bins that were worse than the 4090 and went to some pro products, such as the RTX 5000 and L20.
I'm not arguing this point to death. There ARE differences. One cannot shoehorn a GeForce driver to work with a Pro card with any INF mods, and likewise vice versa. While they use the same arch, they are not die compatible.
Both use the exact same driver on Linux; there's no distinction between a "geforce" and a "tesla" driver.
You can even desolder the chip from one of those GPUs and solder it onto the board of another; this has been done more than once.
 
Both use the exact same driver on Linux; there's no distinction between a "geforce" and a "tesla" driver.
That is because both NV's Linux drivers and the open-source drivers are unified and include runtime code for all of NV's offerings. That is hardly a conclusive factor.
You can even desolder the chip from one of those GPUs and solder it onto the board of another; this has been done more than once.
Not with recent gen dies.
 
So now that the benchmarks are out, will we have a few people upgrading? I'd like to hear some real-world experiences with the new Nvidia hardware, being an AMD user and all :) See how the other side does it and all that.
 
That is because both NV's Linux drivers and the open-source drivers are unified and include runtime code for all of NV's offerings. That is hardly a conclusive factor.
I guess you're confusing some things: Nvidia has both a proprietary and an open-source kernel module on Linux, and those are unified bar the x100 chips.
Given that it's also open source, you can easily see that there's no distinction among the different binnings of each chip; their HAL and other related functions make no distinction between those.
Nouveau, the upstream open-source driver, is pretty useless for most cases.

FWIW, even on Windows you can add extra stuff from the quadro/tesla drivers into the geforce ones, and vice versa. I've done so while helping the community with the vGPU mods for the Turing GPUs.
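If anyone wants to check this on their own Linux box, here is a minimal sketch using the NVML Python bindings (the nvidia-ml-py package, assumed to be installed along with an NVIDIA driver); every board, GeForce or Quadro/Tesla, reports through the same driver interface:

```python
# List all NVIDIA boards and the single driver version they share.
import pynvml

pynvml.nvmlInit()
try:
    version = pynvml.nvmlSystemGetDriverVersion()
    if isinstance(version, bytes):
        version = version.decode()
    print(f"Driver version (shared by all boards): {version}")

    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):
            name = name.decode()
        print(f"GPU {i}: {name}")   # GeForce and Quadro/Tesla parts side by side
finally:
    pynvml.nvmlShutdown()
```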
 