
NVIDIA Details DLSS 4 Design: A Complete AI-Driven Rendering Technology

AleksandarK

News Editor
NVIDIA has published a research paper on DLSS version 4, its AI rendering technology for real-time graphics performance. The system integrates advancements in frame generation, ray reconstruction, and latency reduction. The flagship Multi-Frame Generation feature generates three additional frames for every native frame, and DLSS 4 delivers these frames to the user quickly enough to make them seem like natively rendered ones. At the core of DLSS 4 is a shift from convolutional neural networks to transformer models. These new AI architectures excel at capturing spatial-temporal dependencies, improving ray-traced effect quality by 30-50% according to NVIDIA's benchmarks. The technology processes each AI-generated frame in just 1 ms on RTX 5090 GPUs, significantly faster than the 3.25 ms required by DLSS 3. For competitive gaming, the new Reflex Frame Warp feature reduces input latency by up to 75%, achieving 14 ms in THE FINALS and under 3 ms in VALORANT, according to NVIDIA's own benchmarks.
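For a rough sense of what those numbers imply for frame rates, here is a minimal back-of-the-envelope sketch, assuming the ~1 ms generation cost is paid serially on top of the native render and ignoring pipelining and post-processing overhead (a simplification for illustration, not NVIDIA's actual scheduling model):

```python
# Toy model of Multi-Frame Generation throughput, using the figures
# quoted above: ~1 ms per generated frame on an RTX 5090, and three
# generated frames for every native frame.

def mfg_effective_fps(native_render_ms: float,
                      gen_cost_ms: float = 1.0,
                      generated_per_native: int = 3) -> float:
    """Displayed frames per second with multi-frame generation."""
    frames_displayed = 1 + generated_per_native          # native + generated
    total_time_ms = native_render_ms + generated_per_native * gen_cost_ms
    return 1000.0 * frames_displayed / total_time_ms

# A scene rendering natively at 16.7 ms (~60 FPS):
print(round(mfg_effective_fps(16.7), 1))  # ~203 displayed FPS with DLSS 4 MFG
print(round(mfg_effective_fps(16.7, gen_cost_ms=3.25,
                              generated_per_native=1), 1))  # DLSS 3-style: ~100 FPS
```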

DLSS 4's implementation leverages Blackwell-specific architecture capabilities, including FP8 tensor cores and fused CUDA kernels. The optimized pipeline incorporates vertical layer fusion and memory optimizations that keep computational overhead manageable despite transformer models roughly twice the size of the previous CNN implementations. This efficiency enables real-time performance even with the substantially more complex AI processing. The unified AI pipeline reduces manual tuning requirements for ray-traced effects, allowing studios to implement advanced path tracing across diverse hardware configurations. The design also addresses gaming challenges such as interpolating fast-moving UI elements and particle effects, and reducing artifacts in high-motion scenes. NVIDIA's hardware flip metering and integration with Blackwell's display engine ensure precise pacing of newly generated frames for smooth, high-refresh-rate gaming with accurate imagery.
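The flip-metering idea itself is simple to picture: instead of presenting generated frames as soon as they are ready, the display engine spaces them evenly between native frames. A minimal sketch of that pacing math, assuming perfectly even spacing and no render-time jitter (the function and its parameters are illustrative, not NVIDIA's API):

```python
# Evenly spaced presentation timestamps for generated frames between
# two native frames; the real mechanism lives in the Blackwell display
# engine and driver, not in application code.

def flip_times(native_flip_ms: float, next_native_flip_ms: float,
               generated_per_native: int = 3) -> list[float]:
    """Target flip timestamps for the generated frames."""
    interval = (next_native_flip_ms - native_flip_ms) / (generated_per_native + 1)
    return [native_flip_ms + interval * (i + 1)
            for i in range(generated_per_native)]

# Native frames presented at t=0 ms and t=20 ms:
print(flip_times(0.0, 20.0))  # [5.0, 10.0, 15.0]
```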

To ensure DLSS works as intended and that the neural networks produce quality results, NVIDIA has used a secret weapon: a dedicated supercomputer that has been continuously improving DLSS for the past six years. The supercomputer's primary task is analyzing failures in DLSS output, such as ghosting, flickering, or blurriness, across hundreds of games. When issues are identified, the system augments its training data sets with new examples of optimal graphics and of the challenging scenarios DLSS needs to address. That way, DLSS learns what games should look like and generates realistic frames, as a game engine would, with as few artifacts as possible.
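The loop described here is essentially hard-example mining: score the network's output against a reference, then feed the failures back into the training set. A hedged sketch of that idea, in which every function, metric, and threshold is a stand-in of mine rather than anything NVIDIA has published:

```python
import random

def artifact_score(output, reference):
    """Stand-in perceptual metric for ghosting/flicker/blur: here just
    mean absolute pixel error over flat lists of values."""
    return sum(abs(o - r) for o, r in zip(output, reference)) / len(reference)

def mine_hard_examples(render, scenes, training_set, threshold=0.1):
    """Append every scene whose rendered output drifts too far from its
    reference, so the next training run sees more of the failure cases."""
    for scene, reference in scenes:
        if artifact_score(render(scene), reference) > threshold:
            training_set.append((scene, reference))
    return training_set

# Toy usage: a "renderer" that adds noise, and two flat test scenes.
noisy_render = lambda scene: [p + random.uniform(-0.3, 0.3) for p in scene]
scenes = [([0.1] * 8, [0.1] * 8), ([0.9] * 8, [0.9] * 8)]
print(len(mine_hard_examples(noisy_render, scenes, [])), "scene(s) flagged")
```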

View at TechPowerUp Main Site | Source
 
DLSS has come a long way, but when you compare the reference to the Transformer model, you realize there's still a lot of work to do!
 
The poll attached to this article is way too black or white. Do we think AI rendering pipelines are the way forward? No, I think AI-assisted rendering pipelines are the way forward. If I wanted an AI to hallucinate the entire game, I wouldn't need to buy the game or a powerful GPU. I'd just buy an expensive AI accelerator and have it generate the game and graphics.
 
Well, I've got bad news for you...
 

GPUs aren't the only AI accelerators (you seem to be implying that they are). You can put together a very fast AI build with AMD CPUs at a fraction of the price.
 
Nah, I wasn't really implying anything, other than that this seems to be exactly the direction we're headed with PC graphics. Mostly just making a lame joke :(
 
While I appreciate the drive to push pixels, when it comes at the loss of fidelity it seems like a step back. For years, perfecting AA and AF was of the highest importance, and lighting through programmable shaders, tessellation to drive realistic surfaces, larger frame buffers to hold higher-resolution textures, full 32-bit color pipelines, and forms of color compression to decrease bandwidth requirements were at the forefront of innovation. Now we have reached the point where pushing the latest thing costs us $$$ to realize performance that is simulated, like a fake fruit drink that almost tastes like the real thing.
 
Nvidia says raytracing is the future but then ships cards (the entire 5000 series so far) where raster games gain more in frame rate and 1% lows than raytraced games. What's the future?

DLSS3 was 'better than native' and now they're showing how DLSS4-T is closer to reference images in select cases.

It's a funny message about the future.
 
At least they didn't use one of the worst TAA implementations as a reference, so you can see that the "native" is still superior.
 

but but but nvidia's motto is better than native resolutions?

Imho, AI pipeline rendering is the definition of polishing a turd, which doesn't change the fact that it's still a turd.
 
You got it.
UE5, TAA, and RT introduced a host of problems—atrocious performance, blurry visuals, ghosting, and more. Nvidia then stepped in, conveniently selling the "solution" to the very issues they had a hand in creating in the first place, wielding either a carrot or a stick depending on how you view their approach.
 
Hate on Nvidia all you want, AMD can see the light, and is following in their footsteps.
 
Business owners paying protection money to gangs because they can see the light.
 
While I appreciate the drive to push pixels, when it comes at the loss of fidelity it seems like a step back. For years, perfecting AA and AF was of the highest importance, and lighting through programmable shaders, tessellation to drive realistic surfaces, larger frame buffers to hold higher-resolution textures, full 32-bit color pipelines, and forms of color compression to decrease bandwidth requirements were at the forefront of innovation. Now we have reached the point where pushing the latest thing costs us $$$ to realize performance that is simulated, like a fake fruit drink that almost tastes like the real thing.

Agree 100%, it was much better back when they pushed every year to get the best of the best (performance, technologies, efficiency, etc.)!

Unfortunately, all those companies only care about maximizing profits now. They're not pushing the limits anymore and are just relying on AI to do all the work (that they don't want to do because they're lazy). Look at the industry as a whole: there have been huge layoffs everywhere since 2021, and it's still going. And the people who stay are doing the work of 2-3 jobs at once but barely make more money than before; it's like "modern slavery", and AI is only going to replace more and more people anyway.
 
I expected DLSS 4 to be amazing, and it is. I expected FSR 4 to be worse than DLSS 3.x, but it's actually pretty darn good. Additionally, we can override versions from the driver-side, or use tools like OptiScaler to bridge from one vendor to another. Good times ahead.
 
Nvidia says raytracing is the future but then ships cards (the entire 5000 series so far) where raster games gain more in frame rate and 1% lows than raytraced games. What's the future?

DLSS3 was 'better than native' and now they're showing how DLSS4-T is closer to reference images in select cases.

It's a funny message about the future.

The RTX 50s are just a refresh of the RTX 40s with more AI stuff (MFG, Neural Textures, etc.); IPC has been almost the same since Ampere.
Lovelace was a lot better than Ampere thanks to a much better process node, a lot more CUDA cores (at least for the 4090), much higher clock speeds, and a lot more L2 cache. But in terms of raw performance, clock for clock and core for core, Blackwell/Lovelace/Ampere seem to be pretty much on par (see the quick sanity check below).

I was expecting the RTX 50s to have much better RT/PT performance, but no, I guess they're just keeping that for the RTX 6090, plus even more GPU-generated frames... :mad:
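For what it's worth, the "IPC is flat" claim roughly checks out against spec-sheet math. A quick sanity check using the published core counts and boost clocks; the ~30% observed 4K raster uplift is a ballpark assumption for illustration, not a measurement:

```python
# Normalizing throughput by cores x clock to test the "flat IPC" claim.
cards = {
    "RTX 4090": {"cores": 16384, "boost_ghz": 2.52},
    "RTX 5090": {"cores": 21760, "boost_ghz": 2.41},
}

def raw_throughput(card):
    return card["cores"] * card["boost_ghz"]

scaling = raw_throughput(cards["RTX 5090"]) / raw_throughput(cards["RTX 4090"])
print(f"cores x clock scaling: {scaling:.2f}x")  # ~1.27x

observed_uplift = 1.30  # assumed ballpark 4K raster uplift from reviews
print(f"implied per-core, per-clock change: {observed_uplift / scaling:.2f}x")  # ~1.02x
```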

I expected DLSS 4 to be amazing, and it is. I expected FSR 4 to be worse than DLSS 3.x, but it's actually pretty darn good. Additionally, we can override versions from the driver-side, or use tools like OptiScaler to bridge from one vendor to another. Good times ahead.

As much as I hate to say this, AMD should have gone with AI upscaling from the beginning, because now only RDNA 4 can use FSR 4 :'(
 
It's a noticeable improvement over the CNN model overall, but still worse than native, though much less immediately pronounced than before. These are also still screenshot comparisons rather than animated, zoomed-in ones, which would better reflect normal expectations.
 
man these images really do make this upscaling look like crap... better than native my ass
 
Why even call it a GeForce anymore? They're not graphics force cards, they're A.I. RTX cards.
 
Hate on Nvidia all you want, AMD can see the light, and is following in their footsteps.

You do realize AMD and Intel were busy developing a bunch of this stuff way before Nvidia ever released any of it, right?

I expected DLSS 4 to be amazing, and it is. I expected FSR 4 to be worse than DLSS 3.x, but it's actually pretty darn good. Additionally, we can override versions from the driver-side, or use tools like OptiScaler to bridge from one vendor to another. Good times ahead.

imagine if the cards were actually made properly so they could do RT without upscaling......now that would be something
 
You do realize AMD and Intel were busy developing a bunch of this stuff way before Nvidia ever released any of it, right?

Still, Nvidia did catch up and is even beating them... Intel invested a lot in RT and AI back in the day, and now they're literally behind everyone lol.

imagine if the cards were actually made properly so they could do RT without upscaling......now that would be something

Unfortunately, that time seems to be over :cry: Running games at native 4K with PT would look amazing! Imagine 4K path tracing with MSAA or even SSAA :love:
 
'You will own nothing and be happy'
 
man these images really do make this upscaling look like crap... better than native my ass
The real question has always been: how much of a quality hit are you willing to take for a given performance boost?
 
imagine if the cards were actually made properly so they could do RT without upscaling......now that would be something

Define properly? It's not like they can conjure unlimited computing power out of thin air. The RTX 5090 is already unreasonably powerful; it's the first graphics card to break the single-precision (FP32) 100 TFLOPS mark, rated at around ~105 TFLOPS at its nominal clock speeds (and it runs faster in practice). A half-rack IBM Blue Gene/Q supercomputer with 512 nodes installed (8,192 cores) from the early 2010s still falls utterly short of its performance, and you can run the 5090 in your PC, off a common desktop power supply, too.
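That ~105 TFLOPS figure falls straight out of spec-sheet arithmetic: FP32 throughput is CUDA cores times 2 FLOPs per clock (one fused multiply-add) times clock speed. A quick check, assuming the published 21,760-core count and a ~2.41 GHz boost clock:

```python
# FP32 TFLOPS = CUDA cores * 2 FLOPs/clock (one FMA) * clock in GHz / 1000
cores = 21760           # RTX 5090 CUDA core count (published spec)
boost_clock_ghz = 2.41  # nominal boost clock; real cards often run faster

tflops = cores * 2 * boost_clock_ghz / 1000
print(f"{tflops:.1f} TFLOPS")  # ~104.9 TFLOPS
```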

This stuff is unreal. Real-time ray-traced graphics at super high resolutions and super high frame rates is every bit the "holy grail" that Jensen Huang calls it, and then some. It's insanely, extremely advanced technology that will still take years to fully achieve.
 