Monday, August 31st 2020

Performance Slide of RTX 3090 Ampere Leaks, 100% RTX Performance Gain Over Turing

NVIDIA's performance expectations for the upcoming GeForce RTX 3090 "Ampere" flagship graphics card point to a massive generation-over-generation RTX performance gain. Measured at 4K UHD with DLSS enabled on both cards, the RTX 3090 is shown offering a 100% performance gain over the RTX 2080 Ti in "Minecraft RTX," a greater-than-100% gain in "Control," and close to an 80% gain in "Wolfenstein: Youngblood." According to leaked Gainward spec sheets, NVIDIA's GeForce "Ampere" architecture introduces second-generation RTX. This could entail not just more ray-tracing hardware, but also higher IPC for the RT cores. The spec sheets also refer to third-generation Tensor cores, which could enhance DLSS performance.
Source: yuten0x (Twitter)

131 Comments on Performance Slide of RTX 3090 Ampere Leaks, 100% RTX Performance Gain Over Turing

#76
Mistral
OK, so yet another confirmation that RTX 20x0 series RTX functionality is simply an expensive paperweight.
Posted on Reply
#77
dragontamer5788
Shatun_Bear
No, the next gen console OS takes 2.5GB only, leaving 13.5GB purely for the GPU (on Series X the memory is split, but it's still 13.5GB for games), which again, is more than the paltry 10GB on this 3080.

And you can already push past 8GB today, it's going to get much worse once the new consoles are out and they drop PS4/XB1 for multiplatform titles.
Microsoft Flight Sim takes up 24+GBs of system DDR4 RAM, but less than 8GBs of GDDR6 VRAM.

It's weird, because consoles don't have a DDR4 vs GDDR6 split. But when a game gets ported to the PC, the developers have to decide what stays in DDR4 and what goes into VRAM. Most people will have 8GB of DDR4, maybe 16GB, in addition to the 8GB+ of VRAM on a typical PC.

Your assumption that everything will be stored in VRAM and VRAM only is... strange and wacky. I don't really know how to argue against it aside from saying "no. It doesn't work like that".
Posted on Reply
#78
zlobby
Ravenmaster
i bet the price won't be
When was it ever, to begin with?
Posted on Reply
#79
lexluthermiester
Zubasa
Note that all 3 games in the chart are running RTX, so this is just raytracing performance, not general rasterization.
Also this is running with DLSS on, and no idea if Ampere has any performance gains in DLSS over Turing either.
Have to agree here. The bigger picture has yet to be shown via a full suite of testing. Simply showing one limited set of tests, while still very eye-opening and interesting, cannot give us the information we need to understand the potential of the new lineup.

NDA is in effect, but it stands to reason that most reviewers (many are hinting at this) have samples with retail-ready (or very nearly ready) drivers. I would be willing to bet @W1zzard has at least one that will be reviewed on release.
B-Real
Who the heck cares about RT?
You mean RTRT? And almost everyone does, which is why AMD has jumped on the RTRT bandwagon; their new GPUs are said to have full RTRT support for both consoles and PC.
RedelZaVedno
Is it worth the price increase over Maxwell/Pascal? NO
Opinion that most disagree with. Don't like it? Don't buy it.
Metroid
So there we have it: as soon as I saw the image of Nvidia comparing the 2080 Ti to the 3090, it became clear the 3090 was meant to be the 3080 Ti, but for some reason they wanted a different name. In truth, 3090 = 3080 Ti.
That would seem logical. NVidia likes to play with their naming conventions. They always have.
Mistral
OK, so yet another confirmation that RTX 20x0 series RTX functionality is simply an expensive paperweight.
Fully disagree with this. My 2080 has been worth the money spent. The non-RTRT performance has been exceptional over the 1080 I had before. The RTRT features have been a sight to behold. Quake2RTX was just amazing. Everything else has been beautiful. Even if non-RTRT performance is only 50% better on a per-tier basis, as long as the prices are reasonable, the upgrade will be worth it.
Posted on Reply
#80
Minus Infinity
Not that impressive at all, it needed to be more like 200% to really make it useful. And this means the lesser models will have much smaller improvements.
Posted on Reply
#81
watzupken
Zubasa
Note that all 3 games in the chart are running RTX, so this is just raytracing performance, not general rasterization.
Also this is running with DLSS on, and no idea if Ampere has any performance gains in DLSS over Turing either.
I think this is to be expected. I am pretty sure Nvidia dedicated a lot of die space to RT and Tensor cores to bump up performance in these areas. If we are looking at performance without RT and DLSS, the numbers may not be that impressive, considering the move from 12 nm (essentially a refined TSMC 16 nm) to 7 nm is a full node improvement.
Posted on Reply
#82
lexluthermiester
Minus Infinity
Not that impressive at all, it needed to be more like 200% to really make it useful.
Clearly you don't have an RTX card or you would not utter such silly nonsense.
Minus Infinity
And this means the lesser models will have much smaller improvements.
As they haven't been tested fully yet, you can't know that.
Posted on Reply
#83
watzupken
lexluthermiester
Fully disagree with this. My 2080 has been worth the money spent. The non-RTRT performance has been exceptional over the 1080 I had before. The RTRT features have been a sight to behold. Quake2RTX was just amazing. Everything else has been beautiful. Even if non-RTRT performance is only 50% better on a per-tier basis, as long as the prices are reasonable, the upgrade will be worth it.
The RTX 2080 is a good GPU, though marred by a high asking price due to the lack of competition. As for RT, I agree it is a visual treat. However, while you can improve image quality by enabling RT, you generally lose sharpness due to the need to run at a lower resolution. DLSS is here for that reason, but version 1.0 was a mess; 2.0 is a lot better, but it is still not widely available, and I'm not sure how well it will work in games that are not as well optimized. Most of the time, we see how "great" DLSS 2.0 is in selected titles where Nvidia worked very closely with the game developers. To me, it's a proof of concept, but if game developers are left to optimize it themselves, the results may not be that great.
Minus Infinity
Not that impressive at all, it needed to be more like 200% to really make it useful. And this means the lesser models will have much smaller improvements.
I've never seen a 200% improvement in graphic performance moving from 1 gen to another so far. You may need to manage your expectations here.

As for "lesser" models, I believe you mean the likes of RTX xx70 and xx60 series. If so, it may be too premature to make that conclusion. In fact, I feel it is usually the mid end range that gets the bigger bump in performance because this is where it gets very competitive and sells the most. Consider the last gen where the RTX 2070 had a huge jump in performance over the GTX 1070 that it is replacing. The jump in performance is significant enough to bring the RTX 2070 close to the performance of a GTX 1080 Ti. The subsequent Super refresh basically allow it to outperform the GTX 1080 Ti in almost all games. So if Nvidia is to introduce the RTX 2070 with around the same CUDA cores as the RTX 2080, the improved clockspeed and IPC should give it a significant boost in performance, along with the improved RT and Tensor cores.
Posted on Reply
#84
lexluthermiester
watzupken
you generally lose sharpness due to the need to run at a lower resolution.
How so? I run my card on dual 1440p displays and run games generally at the native res. However 1080p is a perfectly acceptable res to run at.
watzupken
While DLSS is here for this reason, but version 1.0 was a mess. 2.0 is a lot better but still not widely available and not sure how well it will work for games that are not as well optimized. Most of the time, we see how "great" DLSS 2.0 is due to selected titles where Nvidia worked very closely with the game developers.
I don't use it(disabled), so I couldn't care less what state it's in. And it'll stay disabled even on an RTX30xx card.
Posted on Reply
#85
dragontamer5788
watzupken
While DLSS is here for this reason
I know variable rate shading is a completely different feature than DLSS... but I'm more hyped about VRS instead. It's a similar "upscaling" effect, except it actually looks really good.

I don't own an NVidia GPU, but the DLSS samples I've been seeing weren't as impressive as the VRS samples. NVidia and Intel (iGPUs) implement VRS, so it'd be a much more widespread and widely acceptable technique to reduce the resolution (ish) and minimize GPU compute, while still providing a sharper image where the player is likely to look.

AMD will likely provide VRS in the near future. It's implemented in the PS5 and Xbox Series X (XBSX? What's the correct shortcut for this console? I don't want to be saying "Series X" all the time, that's too long).

-----------

In any case, methods like VRS will make all computations (including raytracing) easier for GPUs to calculate. As long as the 2x2, or 4x4 regions are carefully selected, the player won't even notice. (Ex: fast moving objects can be shaded at 4x4 and probably be off the screen before the player notices that they were rendered at a lower graphical setting, especially if everything else in the world was at full 4k or 1x1 rate).
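The 2x2/4x4 tile idea above can be sketched in a few lines. This is a toy model with hypothetical helper names, not how GPUs actually implement VRS (a real implementation sets shading rates through the graphics API, e.g. D3D12 or Vulkan), but it shows where the savings come from:

```python
# Toy model of variable-rate shading: run the (expensive) shader once per
# N x N tile and broadcast the result, cutting shader invocations by N*N.
def shade(x, y):
    # Stand-in for an expensive per-pixel shader.
    return (31 * x + 17 * y) % 256

def vrs_shade(width, height, rate):
    """Shade a width x height region at a coarse N x N rate (rate=1 is full rate)."""
    image = [[0] * width for _ in range(height)]
    invocations = 0
    for ty in range(0, height, rate):
        for tx in range(0, width, rate):
            color = shade(tx, ty)  # one invocation covers the whole tile
            invocations += 1
            for y in range(ty, min(ty + rate, height)):
                for x in range(tx, min(tx + rate, width)):
                    image[y][x] = color
    return image, invocations

# 8x8 region: full rate costs 64 shader invocations, a 4x4 rate only 4.
_, full = vrs_shade(8, 8, 1)
_, coarse = vrs_shade(8, 8, 4)
```

The per-tile choice of `rate` is exactly the "carefully selected regions" point: fast-moving or peripheral areas get 4x4, focal areas stay at 1x1.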
watzupken
To me, it's a proof of concept, but if game developers are left to optimize it themselves, the results may not be that great.
Many game developers are pushing VRS, due to PS5 / XBox support.

It seems highly unlikely that tensor cores for DLSS would be widely implemented in anything aside from NVidia's machine-learning based GPUs. DLSS is a solution looking for a problem: NVidia knows they have tensor cores and they want the GPU to use them somehow. But they don't really make the image look much better, or save much compute. (Those deep-learning cores need a heck of a lot of multiplications to work, and those special FP16 multiplications are only accelerated on the special tensor cores).
Posted on Reply
#86
steen
lexluthermiester
NDA is in effect, but it stands to reason that most reviewers(many are hinting at this) have samples with retail ready(or very near ready) drivers. I would be willing to bet @W1zzard has at least one that will be reviewed on release.
Links to NDA drivers went live over the weekend. There was a non-signed driver that also worked for some, but there have been white/black listed drivers for some time.
Posted on Reply
#87
InVasMani
I'm far less interested in the RTX/DLSS performance; I want to see its pure raw rasterization performance, because that's how about 99% of games run, while those other two features might be selectively accessible in a handful of newer titles. I'm far more keen on the standard-usage performance.
Posted on Reply
#88
enya64
How soon before we get the 3090TI? Next March? Or as soon as we find out AMD can compete with the 3090 in game reviews? Will it be called the 3090TI or the 3090 Super?
Posted on Reply
#89
lexluthermiester
InVasMani
I'm far less interested in the RTX/DLSS performance; I want to see its pure raw rasterization performance, because that's how about 99% of games run, while those other two features might be selectively accessible in a handful of newer titles. I'm far more keen on the standard-usage performance.
Those performance metrics are coming. I believe the NDA lifts tomorrow or Tuesday, so there isn't long to wait.
Posted on Reply
#90
Valantar
xkm1948
Based on the TPU review of Control, RTX on will cut 2080 Ti performance in half at 1080p.

www.techpowerup.com/review/control-benchmark-test-performance-nvidia-rtx/4.html

Without DLSS, the 2080 Ti gets about 100 FPS avg at 1080p, and 50 FPS avg at RTX 1080p.

So now the RTX 3090 gets about 100 FPS at RTX 1080p (>2x from this chart).

That means the RTX 3090 needs at least 200 FPS at regular 1080p, i.e. without RTX. Of course, this assumes no RT efficiency improvement. Let's assume there is some major RT efficiency improvement, so that instead of a 0.5x performance penalty we have a 0.7x penalty; then the RTX 3090 would be running 133 to 150 FPS without RTX. So a 30% to 50% performance uplift in non-RTX games. Also, we have 5248 versus 4352 CUDA cores, so the CUDA core increase by itself should give at least 20% more performance in non-RTX games.

That is just some quick napkin math. I am leaning more towards a 30~35% performance increase, but there is a good chance that I am wrong.
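The quoted napkin math can be checked as a quick calculation. The benchmark figures come from the quoted TPU Control review; the 0.5x and 0.7x RT penalties are the poster's assumptions, not measured values:

```python
# Sanity-checking the quoted napkin math. The 0.5x/0.7x RT penalties and
# the ">2x" chart reading are the poster's assumptions.
raster_2080ti = 100.0   # avg FPS, 1080p, RT off (from the quoted TPU review)
rt_2080ti = 50.0        # avg FPS, 1080p, RT on -> a 0.5x penalty

rt_3090 = 2 * rt_2080ti  # ">2x" from the leaked chart, so ~100 FPS with RT

# If the RT penalty stays at 0.5x, the implied raster figure is:
raster_3090_same_penalty = rt_3090 / 0.5    # 200 FPS
# If RT efficiency improves so the penalty is only 0.7x:
raster_3090_better_penalty = rt_3090 / 0.7  # ~143 FPS, inside the 133-150 range

# CUDA core count ratio alone:
cuda_ratio = 5248 / 4352                    # ~1.21, i.e. ~20% more cores
```

Under the 0.7x assumption the implied raster uplift is ~43%, which is roughly where the quoted 30-50% range comes from.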
This is a fundamental misunderstanding of how RT performance relates to rasterization performance. An increase in RT performance doesn't necessarily indicate any increase in rasterization performance whatsoever. Don't get me wrong, there will undoubtedly be one, but it doesn't work the way you are sketching out here.
Shatun_Bear
You dont need to damage control for them.

I'm saying 10GB on a $800 graphics card is a joke. With that kind of money you should expect to use it for 3-4 years, and 10GB won't be enough when both new consoles have 16GB, RDNA2 will have 16GB cards for the same price or likely cheaper, and Nvidia will nickel-and-dime everyone with a 20GB 3080 model next month. The only people buying a 10GB $800 graphics card are pretty clueless ones.
No games actually use that much VRAM, and with more modern texture-loading techniques (such as MS bringing parts of DirectStorage to Windows), VRAM needs are likely to stand still if not drop in the future. Also, the 16GB of memory on the consoles is (at least on the XSX) split into 2.5GB OS, 3.5GB medium-bandwidth, and 10GB high-bandwidth "GPU optimized" memory. Given that the non-graphics parts of a game will need some memory, we are highly unlikely to see games meant for those consoles exceed 10GB VRAM at 4K (they might, of course, at higher settings).
Searing
Yeah, we all know RT and DLSS are nonsense meant to find a use for datacenter hardware and mask bad performance gains. There are many better ways to get nice visuals. Look at all the upcoming PS5 games like Ratchet & Clank (where RT is basically off and there's no DLSS).
What? Ratchet & Clank is chock-full of RT reflections, on glass, floors, Clank's body, etc.
Shatun_Bear
No, the next gen console OS takes 2.5GB only, leaving 13.5GB purely for the GPU (on Series X the memory is split, but it's still 13.5GB for games), which again, is more than the paltry 10GB on this 3080.

And you can already push past 8GB today, it's going to get much worse once the new consoles are out and they drop PS4/XB1 for multiplatform titles.
Entirely agree. Also, as I noted above, of those 13.5GB, some memory needs to be used for the game code, not just graphics assets. For the XSX they have cut some costs here by making 3.5GB of the game memory pool lower bandwidth, which is a pretty clear indication of how they envision the worst-case scenario split between game code and VRAM for this console generation. I sincerely doubt we'll see any games using 12GB VRAM with just 1.5GB system memory ...
Posted on Reply
#91
BoboOOZ
zlobby
When was it ever, to begin with?
A long time ago I bought a mighty Geforce 256 for 200 USD, and it was head and shoulders above the competition...
Posted on Reply
#92
bug
Mistral
OK, so yet another confirmation that RTX 20x0 series RTX functionality is simply an expensive paperweight.
It's a first taste of the tech. Since when does a first iteration get everything right? Was the first iPhone an expensive paperweight? The 8086? The 3dfx Voodoo?
Posted on Reply
#93
watzupken
dragontamer5788
Many game developers are pushing VRS, due to PS5 / XBox support.

It seems highly unlikely that tensor cores for DLSS would be widely implemented in anything aside from NVidia's machine-learning based GPUs. DLSS is a solution looking for a problem: NVidia knows they have tensor cores and they want the GPU to use them somehow. But they don't really make the image look much better, or save much compute. (Those deep-learning cores need a heck of a lot of multiplications to work, and those special FP16 multiplications are only accelerated on the special tensor cores).
I am no tech person, but looking at how DLSS works, it appears to require a fair bit of investment from game developers to make it work properly, which affects time to market. And because it is proprietary to Nvidia, there is little motivation for a game developer to spend much time on it unless Nvidia is willing to put in significant effort to help them optimize it. I don't think this is sustainable. I do agree that VRS seems to make more sense, since future products from AMD and Intel should support VRS, making it universal.
Posted on Reply
#94
medi01
lexluthermiester
You mean RTRT? And almost everyone does, which is why AMD has jumped on the RTRT bandwagon with their new GPUs are said to have full RTRT support for both consoles and PC.
The same story as with VR. And where is VR now?
EPIC not using RT in Unreal 5 is quite telling.
Posted on Reply
#95
kiriakost
In the past 25 years, nearly every Nvidia card with a 7% gain has become another card model of its own.
I bet that very few have memories from the era of the Nvidia TNT and TNT2, and so on and on, to today's GTX and soon RTX.
How soon are we going to see the VTX? :p
Posted on Reply
#96
bug
watzupken
I am no tech person, but looking at how DLSS works, it appears that it requires quite a fair bit of investment from the game developers to make it work properly and affects time to market.
That's not how it works. Nvidia spends time to train some neural networks, the developer only has to make some specific API calls.
For DLSS 1.0, Nvidia had to train on a per-title basis. Past a certain point that was no longer necessary, and now we have DLSS 2.0. I've even heard DLSS 3.0 may do away with the API-specific calls, but I'll believe that when I see it (I don't doubt that would be the ideal implementation, but I have no idea how close/far Nvidia is from it).
Posted on Reply
#97
Searing
You see how people's minds work? I couldn't resist mentioning Ratchet and Clank since it is the first next-gen-only game coming. All those visual improvements are in rasterization, but because there is some inconsequential amount of RT in the game, somehow it is already an RT showcase for silly people. nVidia has won the marketing battle for the silly people, that is for sure. I believe there will be an RT-off setting for 60fps; then we can compare after launch. Prepare to prefer the non-RT version...
medi01
Or compare it to PS5 not using "RT features":

I call fake cause it's not NV style to have charts with 0 as the baseline.

And, frankly, I don't get what "oddities" of jpeg compression are supposed to indicate.
This kind of "leak" does not need to modify an existing image, it's plain text and bars.
Yeah that is another good example :)
Posted on Reply
#98
lexluthermiester
medi01
The same story as with VR. And where is VR now?
You really gonna go with that comparison? Weak, very weak.
medi01
EPIC not using RT in Unreal 5 is quite telling.
No, it isn't. It just says that they aren't developing RTRT yet. Why? Because they use that same engine on more than one platform, which means the engine has to have a common code base for Windows, Android and iOS. Two of those platforms are not and will not be capable of RTRT anytime soon, therefore Unreal 5 does not support RTRT and likely will not.
Posted on Reply
#99
medi01
lexluthermiester
You really gonna go with that comparison? Weak, very weak.
I am going against "see, everyone's doing it, it's gotta be cool".
lexluthermiester
No, it isn't. It just says that they aren't developing RTRT yet. Why? Because they use that same engine on more that one platform, which means that engine has to have a common base code for Windows, Android and iOS. Two of those platforms are not and will not be capable of RTRT anytime soon, therefore Unreal5 does not support RTRT and it likely will not.
Why is quite telling too (and it applies to any game developer, not just game-engine developers).
What I was referring to is the fact that they didn't need any "RTRT" yet delivered light effects that impressive.
Posted on Reply
#100
bug
medi01
Why is quite telling too (and it applies to any game developer, not just game-engine developers).
What I was referring to is the fact that they didn't need any "RTRT" yet delivered light effects that impressive.
Stop that. Rasterization has many tricks and can look really, really good. But it can't do everything RT can (e.g. free ambient occlusion, off-screen reflections).
Posted on Reply