Thursday, April 11th 2019

NVIDIA Extends DirectX Raytracing (DXR) Support to Many GeForce GTX GPUs

NVIDIA today announced that it is extending DXR (DirectX Raytracing) support to several GeForce GTX graphics models beyond its GeForce RTX series. These include the GTX 1660 Ti, GTX 1660, GTX 1080 Ti, GTX 1080, GTX 1070 Ti, GTX 1070, and GTX 1060 6 GB. The GTX 1060 3 GB and lower "Pascal" models don't support DXR, nor do older generations of NVIDIA GPUs. NVIDIA has implemented real-time raytracing on GPUs without specialized components such as RT cores or tensor cores, by essentially implementing the rendering path through shaders, in this case, CUDA cores. DXR support will be added through a new GeForce graphics driver later today.

The GPU's CUDA cores now have to handle BVH (bounding volume hierarchy) traversal, intersection, reflection, and refraction calculations. The GTX 16-series chips have an edge over "Pascal" despite lacking RT cores, as the "Turing" CUDA cores support concurrent INT and FP execution, allowing more work to be done per clock. In a detailed presentation, NVIDIA listed the kinds of real-time ray-tracing effects available through the DXR API, namely reflections, shadows, advanced reflections and shadows, ambient occlusion, global illumination (unbaked), and combinations of these. The company put out detailed performance numbers for a selection of GTX 10-series and GTX 16-series GPUs, and compared them to RTX 20-series SKUs that have specialized hardware for DXR.
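To give a rough idea of the intersection work mentioned above: traversing a bounding volume hierarchy means testing a ray against axis-aligned bounding boxes over and over. The snippet below is a minimal sketch of the common "slab" test in plain Python, for illustration only. The function name `ray_hits_aabb` and its parameters are hypothetical and are not part of DXR or any NVIDIA API; real implementations run this kind of test in shader code or in fixed-function hardware.

```python
import math

def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: True if the ray origin + t*direction (t >= 0) hits the box.

    All arguments are 3-tuples of floats. Hypothetical helper for illustration.
    """
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        # Distances along the ray to the two planes of this axis.
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

A ray starting at (0, 0, -5) and pointing along +z hits a unit-sized box around the origin, while the same ray pointing along -z misses it. A renderer runs tests like this many times per ray, which is part of why doing it on general-purpose shader cores is costly.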
Update: Article updated with additional test data from NVIDIA.

According to NVIDIA's numbers, GPUs without RTX hardware are significantly slower than the RTX 20-series. No surprises here. At 1440p, the resolution NVIDIA chose for these tests, you would need at least a GTX 1080 or GTX 1080 Ti for playable frame-rates (above 30 fps). This is especially true in the case of Battlefield V, in which only the GTX 1080 Ti manages 30 fps. The gap between the GTX 1080 Ti and GTX 1080 is vast, with the latter serving up only 25 fps. The GTX 1070 and GTX 1060 6 GB spit out really fast PowerPoint presentations, at under 20 fps.
It's important to note here that NVIDIA tested at the highest DXR settings for Battlefield V, and lowering the DXR Reflections quality could improve frame-rates, although we remain skeptical about the slower SKUs such as the GTX 1070 and GTX 1060 6 GB. The story repeats with Shadow of the Tomb Raider, which uses DXR shadows, although the frame-rates are marginally higher than in Battlefield V. You still need a GTX 1080 Ti for 34 fps.
Atomic Heart uses Advanced Reflections (reflections of reflections, and non-planar reflective surfaces). Unfortunately, no GeForce GTX card manages performance over 15.4 fps. The story repeats with 3DMark Port Royal, which uses both Advanced Reflections and DXR Shadows: single-digit frame-rates for all GTX cards. Performance is better with the Justice tech-demo, although far from playable, as only the GTX 1080 and GTX 1080 Ti manage over 20 fps. Advanced Reflections and AO, in the case of the Star Wars RTX tech-demo, are another torture test for these GPUs, with single-digit frame-rates all over. Global Illumination with Metro Exodus is another slog for these chips.
Overall, NVIDIA has managed to script the perfect advertisement for the RTX 20-series. Real-time ray-tracing on compute shaders is horrendously slow, and it pays to have specialized hardware such as RT cores for it, while tensor cores accelerate DLSS to improve performance even further.
It remains to be seen if AMD takes a swing at DXR on GCN stream processors any time soon. The company has already had a technical effort underway for years under Radeon Rays, and is reportedly working on DXR.

Update:
NVIDIA posted its test data for 4K and 1080p in addition to 1440p, and for medium through low DXR settings. The entire test data is posted below.


111 Comments on NVIDIA Extends DirectX Raytracing (DXR) Support to Many GeForce GTX GPUs

#51
chrcoluk
I was expecting a much bigger gap to be honest, usually specialised hardware with optimised routines has much more than a 2-3x performance improvement. This RT was optimised for RTX and it only has a 200-300% advantage depending on game.

All this data has told me is that if games get made with compute versions of RT optimised for it, then potentially performance can be quite close. I don't understand what NVIDIA is doing here; it seems that out of desperation they enabled support for the large Pascal user base to try and entice developers. RT cards may be dead consumer tech within 1-2 generations.
#52
FordGT90Concept
"I go fast!1!11!1!"
I still think RT cores are basically lean Kepler (or similar architecture) cores that are reserved so they don't interfere with the graphics pipeline. If this is the case, GCN using async compute should be able to do very well at DXR without modification.

NVIDIA to date hasn't technically described what the RT cores are.

I guess we'll find out when AMD debuts DXR support.
#53
SoNic67
Crackong said: "It was the same thing again, hardware PhysX got 4 games back in 2008 and 7 in 2009, then hardware PhysX simply faded out and is now open sourced."
PhysX games were plentiful, and PhysX added to the games' realism. The fact that you didn't get to observe it doesn't change the facts.
Fallout 4 (FleX added in 2017), Witcher 3 family (2016), COD Ghosts (2013) really benefited from that - at least I personally liked the effects. I couldn't stop using grenades :)
Was it doomed by being closed source? Maybe. But that doesn't mean it didn't work. Unreal Engine 4 still uses it.

I would like an identical approach for RTX: Let me add another card and dedicate it to RTX. That will make me maybe take the bait to upgrade sooner to a single-card solution.
#54
londiste
chrcoluk said: "I was expecting a much bigger gap to be honest, usually specialised hardware with optimised routines has much more than a 2-3x performance improvement. This RT was optimised for RTX and it only has a 200-300% advantage depending on game. All this data has told me is that if games get made with compute versions of RT optimised for it, then potentially performance can be quite close."
The current hybrid solution means only a small part of frame rendering uses DXR. Even then, only specific operations are done on RT cores, data setup and management still happens on shaders.
Compare results from BF5, which uses a little, Metro/SoTR, which use a little bit more, and benchmarks like Port Royal or tech demos that use a lot of RT. The more RT is used, the bigger the performance gap gets.
The other part is that Nvidia chose to put front and center results with DXR Low/Medium and modest resolutions. These paint Pascal in a better light than DXR High/Ultra results.

For a visual representation of what I am trying to say, look at the Metro Exodus frame graphs from Nvidia's original announcement; the middle part represents the part that RT Cores deal with:
www.techpowerup.com/253759/nvidia-to-enable-dxr-ray-tracing-on-gtx-10-and-16-series-gpus-in-april-drivers-update
www.techpowerup.com/img/Qr86CtLnbFWCRcfc.jpg
FordGT90Concept said: "NVIDIA to date hasn't technically described what the RT cores are."
They have not described the units very precisely. However, it is not quite correct to say we do not know what the RT Cores do. They run a couple operations for raytracing implemented in hardware. Anandtech's article has a pretty good overview:
www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/5
#55
Xzibit
londiste said: "They have not described the units very precisely. However, it is not quite correct to say we do not know what the RT Cores do. They run a couple operations for raytracing implemented in hardware. Anandtech's article has a pretty good overview: www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/5"
It's similar to what Imagination had a few years prior; they just called it "RTU" (Ray Tracing Unit), along with a Scene Hierarchy Generator.
#56
FordGT90Concept
"I go fast!1!11!1!"
londiste said: "They have not described the units very precisely. However, it is not quite correct to say we do not know what the RT Cores do. They run a couple operations for raytracing implemented in hardware. Anandtech's article has a pretty good overview: www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/5"
Something like a Kepler core could be doing everything the "RT core" does.

Remember, RTX has an *extremely* limited capability to ray trace: it complements existing rendering techniques in games rather than replacing it.
#57
R-T-B
Crackong said: "If AMD come up with something, let's say 'FreeRay', which runs on and is optimized for typical graphics cards instead of RTX cards, would that benefit RTX card sales?"
Yes it would, because it's literally impossible to do this in a performant way on standard non tensor gpus.

As RTX are the only gpus with tensor cores right now, it would run like shit, driving upgrades to RTX.

That is his point in a nutshell.
Crackong said: "Nvidia does not own DXR. If AMD could come up with their own solution to optimize DXR without dedicated 'cores', RTX cards would become utterly pointless."
They would need to add tensor cores first.
#58
FordGT90Concept
"I go fast!1!11!1!"
Tensor cores are quite useless for DXR. NVIDIA is using tensor cores to up-sample resolution to compensate for framerate loss due to DXR. If the RT cores weren't rubbish and/or the GPU could properly async (like GCN can) so they can raytrace without impacting framerate, DLSS would be useless. A proper raytracing ASIC could be the solution...assuming DXR is a problem worth solving which I don't believe it is. There would have to be a monumental jump in compute capabilities (as in, a ton of cheap performance to waste) to warrant pursuing DXR as a useful technology in games.
#59
Crackong
R-T-B said: "Yes it would, because it's literally impossible to do this in a performant way on standard non tensor gpus. As RTX are the only gpus with tensor cores right now, it would run like shit, driving upgrades to RTX. That is his point in a nutshell. They would need to add tensor cores first."
The Tensor cores in RTX cards don't do DXR.

Nvidia described Tensor cores as "specialized execution units designed specifically for performing the tensor / matrix operations that are the core compute function used in Deep Learning".

They have nothing to do with Ray Tracing.
#60
overvolted
If raytracing hadn't been demonstrated running on a Radeon card (and rather well) by a third party, none of this would be happening.
Full "Damage Control" mode.
#61
R-T-B
Crackong said: "They have nothing to do with Ray Tracing."
They do the "denoising" that enables raytracing to be possible on present hardware (we can't possibly push enough raw rays).

Thus, they have everything to do with it.
#62
Crackong
R-T-B said: "They do the 'denoising' that enables raytracing to be possible on present hardware (we can't possibly push enough raw rays). Thus, they have everything to do with it."
lol you are mixing things up.
RTRT doesn't require a denoiser to work.
The denoiser is an after-effect added to the final image.
#63
SoNic67
Crackong said: "lol you are mixing things up. RTRT doesn't require a denoiser to work. The denoiser is an after-effect added to the final image."
Sure, it is not necessary... if you are able to ray trace all the rays. But the current generation of RTX cards is not.
They trace only a couple of rays per pixel and mix some textures in, hence they have to apply de-noise to that to make it look good.
I guess you didn't read the excellent article linked above? Here is a quote:
"Essentially, this style of ‘hybrid rendering’ is a lot less raytracing than one might imagine from the marketing material. Perhaps a blunt way to generalize might be: real time raytracing in Turing typically means only certain objects are being rendered with certain raytraced graphical effects, using a minimal amount of rays per pixel and/or only raytracing secondary rays, and using a lot of denoising filtering; anything more would affect performance too much."
#64
Crackong
SoNic67 said: "Sure, it is not necessary... if you are able to ray trace all the rays. But the current generation of RTX cards is not. They trace only a couple of rays per pixel and mix some textures in, hence they have to apply de-noise to that to make it look good."
Yes, that is exactly the point.

Nvidia offers an AI-based de-noiser powered by tensor cores.
It will de-noise any image it is given, no matter whether it is an in-game image or a photo.

If it is just the de-noiser which matters, then it is the de-noiser, NOT the Tensor cores.
If AMD could come up with an efficient de-noise method without any dedicated hardware, Tensor cores also become utterly pointless.
#65
medi01
FordGT90Concept said: "Something like a Kepler core could be doing everything the 'RT core' does. Remember, RTX has an *extremely* limited capability to ray trace: it complements existing rendering techniques in games rather than replacing it."
Well, but what about "fully RT Quake 3"?
Of course, models that are rendered there are simple, but still.
londiste said: "Anandtech's article has a pretty good overview: www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/5"
This boils down to "it does in RT cores what DXR API is about" namely, intersection matching.
Uh, who would have thought.
#66
R0H1T
Crackong said: "Yes, that is exactly the point. Nvidia offers an AI-based de-noiser powered by tensor cores. It will de-noise any image it is given, no matter whether it is an in-game image or a photo. If it is just the de-noiser which matters, then it is the de-noiser, NOT the Tensor cores. If AMD could come up with an efficient de-noise method without any dedicated hardware, Tensor cores also become utterly pointless."
So what happens with Pascal, I'm guessing lack of tensor cores isn't the (biggest) reason why RT tanks the performance on that thing :wtf:
#67
Crackong
R0H1T said: "So what happens with Pascal, I'm guessing lack of tensor cores isn't the (biggest) reason why RT tanks the performance on that thing :wtf:"
Only the leather jacket himself knows.
Without any comparison data from the red team, we have no idea whether the Pascal cards received optimization for RTRT, or no optimization at all.

After all, Nvidia naturally wants to sell more Turing cards; optimizing old Pascal cards for the selling feature of Turing is the exact opposite of that.
#68
FordGT90Concept
"I go fast!1!11!1!"
medi01 said: "Well, but what about 'fully RT Quake 3'? Of course, models that are rendered there are simple, but still."
Judging by AAA games, publishers aren't willing to sacrifice so much for raytracing. On that note, if raytracing were more accessible, indie developers would probably use it because a lot of them go for a minimalist graphics style anyway.
#69
rtwjunkie
PC Gaming Enthusiast
FordGT90Concept said: "Judging by AAA games, publishers aren't willing to sacrifice so much for raytracing. On that note, if raytracing were more accessible, indie developers would probably use it because a lot of them go for a minimalist graphics style anyway."
Funny you should mention that. The PC game Abducted is adding in raytracing for those that have the hardware in one of its soon to be released early access patches. The game has been EA for three years and is almost ready. The dev just announced a couple weeks ago that RT would be added before final release. This game, btw, is not using minimalist graphics.
#70
FordGT90Concept
"I go fast!1!11!1!"
Then I have to assume that NVIDIA probably gave the devs RTX cards on the condition that they add RTX support.

It's way too early for indie devs to be adding raytracing as a cost-saving measure.
#71
rtwjunkie
PC Gaming Enthusiast
FordGT90Concept said: "Then I have to assume that NVIDIA probably gave the devs RTX cards on the condition that they add RTX support. It's way too early for indie devs to be adding raytracing as a cost-saving measure."
I think you’re right. I don’t think it is a cost saving measure. I think they are just trying to offer it as a nice perk. They are very responsive and I think they really just want to make the best product they can.
#72
londiste
FordGT90Concept said: "Something like a Kepler core could be doing everything the 'RT core' does. Remember, RTX has an *extremely* limited capability to ray trace: it complements existing rendering techniques in games rather than replacing it."
It could, but it would be rather inefficient at it. Turing SM is faster than Pascal SM in practically every aspect. Pascal is faster than Maxwell, which is faster than Kepler. If Nvidia thought RT was best done on good old shaders, they would simply add more shader units and would not bother with RT Cores.
FordGT90Concept said: "Tensor cores are quite useless for DXR. NVIDIA is using tensor cores to up-sample resolution to compensate for framerate loss due to DXR."
I am not too sure this is exactly the case here. True, they have no real purpose for RT calculations themselves. However, Nvidia has claimed (or at least did so initially) that their denoising algorithm runs on Tensor cores. This is definitely not the only denoising algorithm and maybe/likely not the best.
FordGT90Concept said: "Judging by AAA games, publishers aren't willing to sacrifice so much for raytracing. On that note, if raytracing were more accessible, indie developers would probably use it because a lot of them go for a minimalist graphics style anyway."
This is exactly what Nvidia has been going after. They say there are potential cost savings from reduced time in the workflow of creating a game. Fewer workarounds, less artist/designer work. Some developers have supported that claim so they might actually have a point. Making raytracing more accessible is exactly what DXR support for GTX cards is about.
#73
FordGT90Concept
"I go fast!1!11!1!"
londiste said: "It could, but it would be rather inefficient at it. Turing SM is faster than Pascal SM in practically every aspect. Pascal is faster than Maxwell, which is faster than Kepler. If Nvidia thought RT was best done on good old shaders, they would simply add more shader units and would not bother with RT Cores."
All I know is that the organization of RT cores in Turing doesn't make sense.
londiste said: "This is exactly what Nvidia has been going after. They say there are potential cost savings from reduced time in the workflow of creating a game. Fewer workarounds, less artist/designer work. Some developers have supported that claim so they might actually have a point. Making raytracing more accessible is exactly what DXR support for GTX cards is about."
That is only true if the game was coded from the ground up to exclusively use raytracing. If there was any time put into traditional lighting/rendering techniques then raytracing is added cost.
#74
medi01
FordGT90Concept said: "All I know is that the organization of RT cores in Turing doesn't make sense."
Could you elaborate?
#75
R-T-B
Crackong said: "If AMD could come up with an efficient de-noise method without any dedicated hardware"
Yeah, if you have any technical knowledge at all you don't have to guess about this. It simply can't happen with present hardware. Unless we are talking about things like Quake 3...