Monday, May 5th 2025

AMD Patents Provide Early UDNA Insights - "Blackwell-esque" Ray Tracing Performance Could be Achievable
Last September, AMD leadership publicly revealed UDNA—an "unforking" of previously separate enterprise and consumer GPU branches. Not long after this announcement, TechPowerUp's resident Serbian correspondent—AleksandarK—sat down with Team Red's Andrej Zdravkovic. The Chief Software Officer (and SVP) stated that a fair chunk of UDNA-related development work would be done by local engineers. Zdravkovic discussed this technology's eventual deployment in futuristic "AI PCs," but gamers have been salivating at the prospect of a proper successor to RDNA 4. A next-gen graphics architecture seeker—MrMPFR—has combed through official documents for any sign of UDNA preview material. The noted /Hardware subreddit member managed to distill their initial (very long) set of findings into an "easily digestible overview." They stated that this was just a small case of: "reporting and a little analysis on AMD's publicly available US patents filings," and other public-facing resources/archives.
Gleaned information included: "finalized architectural characteristics in future RDNA generations, AMD DXR IHV stacks (driver agnostic), and AMD sponsored titles. But please take everything with a grain of salt given my lack of professional expertise and experience with Real-time ray tracing (RTRT)". MrMPFR believes that Team Red started picking up former NVIDIA and Intel engineering talent, back in 2022/2023. In addition, a lot of new hires were apparently sourced from academic institutions. In theory, these newer team members have not had the time to make major inroads—in terms of getting finalized products out into the wild. MrMPFR reckons that noticeable contributions will accelerate AMD's making of "RDNA 6+/UDNA 2+," and beyond. Early 2025 leaks have pointed to the company collaborating with Sony; their "PlayStation 6" console is tipped to be powered by some fork of Team Red's "UDNA" graphics technology.

MrMPFR unearthed a wealth of intended ray tracing performance improvements—a key goal being the superior usage of BVH (Bounding Volume Hierarchy) management. The keen industry observer shared the benefits of this feature: "(BVH) allows for lower CPU overhead and VRAM footprint, higher graphical fidelity, and interactive game worlds with ray traced animated geometry (assets and characters) and destructible environments on a mass scale...the patent filings cover smarter BVH management to reduce the BVH construction overhead and storage size and even increasing performance with many of the filings, likely an attempt to match or possibly even exceed the capabilities of (NVIDIA's) RTX Mega Geometry (tech)."
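For readers unfamiliar with the structure under discussion, a BVH groups scene primitives under nested bounding boxes so that rays can skip whole subtrees. The sketch below is a deliberately naive top-down median-split builder; it is not AMD's patented technique, just a baseline that shows why construction cost and node count (storage size) are the things the filings target.

```python
# Toy bounding volume hierarchy (BVH): a binary tree of axis-aligned boxes.
# Naive median-split build; real builders (and the patented approaches)
# choose split axes/points far more cleverly, e.g. with surface-area heuristics.
from dataclasses import dataclass

@dataclass
class AABB:
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner

def union(a: AABB, b: AABB) -> AABB:
    return AABB(tuple(min(p, q) for p, q in zip(a.lo, b.lo)),
                tuple(max(p, q) for p, q in zip(a.hi, b.hi)))

@dataclass
class Node:
    box: AABB
    left: "Node | None" = None
    right: "Node | None" = None
    prim: "int | None" = None  # leaf: index of the enclosed primitive

def build(prims: list, ids: list) -> Node:
    if len(ids) == 1:
        return Node(prims[ids[0]], prim=ids[0])
    # Split at the median x-center; every interior node adds storage,
    # which is why smarter builders try to keep the tree compact.
    ids = sorted(ids, key=lambda i: prims[i].lo[0] + prims[i].hi[0])
    mid = len(ids) // 2
    l, r = build(prims, ids[:mid]), build(prims, ids[mid:])
    return Node(union(l.box, r.box), left=l, right=r)

def count_nodes(n: Node) -> int:
    if n.prim is not None:
        return 1
    return 1 + count_nodes(n.left) + count_nodes(n.right)

boxes = [AABB((i, 0, 0), (i + 1, 1, 1)) for i in range(8)]
root = build(boxes, list(range(8)))
print(count_nodes(root))  # a binary BVH over 8 leaves has 2n - 1 = 15 nodes
```

Rebuilding such a tree every frame for animated or destructible geometry is what drives the "construction overhead" the filings aim to reduce.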
Their TL;DR part #2 section compared forthcoming developments with a familiar rival current-gen solution: "the patents indicate a strong possibility of almost feature level parity with NVIDIA Blackwell in AMD's future GPU architectures likely as soon as 'RDNA 5/UDNA' based on the filing dates. We might even see RT perf parity with Blackwell at iso-raster perf, that's an identical FPS drop percentagewise between architectures...If more architectural changes make their way into next-gen RDNA than those afforded by the current publicly available patent filings then it is very likely to exceed NVIDIA Blackwell on all fronts, except likely only matching ReSTIR path tracing (PT) and RTX Mega Geometry functionality. If this is true then that would be AMD's 'Maxwell moment' but for RT."
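The quoted "iso-raster" metric — RT parity meaning an identical percentage FPS drop when ray tracing is enabled, given identical raster performance — can be made concrete with hypothetical numbers:

```python
# Illustrating "RT perf parity at iso-raster perf": two GPUs with the same
# raster frame rate count as RT-equal if enabling ray tracing costs the
# same percentage of FPS on both. All figures here are hypothetical.

def rt_cost_percent(raster_fps: float, rt_fps: float) -> float:
    """Percentage of frame rate lost when ray tracing is enabled."""
    return 100.0 * (raster_fps - rt_fps) / raster_fps

gpu_a = rt_cost_percent(120.0, 78.0)  # hypothetical GPU A: 120 -> 78 FPS
gpu_b = rt_cost_percent(120.0, 90.0)  # hypothetical GPU B: 120 -> 90 FPS
print(gpu_a, gpu_b)  # 35.0 25.0 -> GPU B pays the smaller RT cost
```

Under this metric, parity would mean both GPUs showing the same percentage, regardless of their absolute frame rates.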
Sources:
/Hardware Subreddit, Wccftech, TweakTown, XDA Developers
60 Comments on AMD Patents Provide Early UDNA Insights - "Blackwell-esque" Ray Tracing Performance Could be Achievable
www.pcgamer.com/hardware/graphics-cards/from-the-developers-standpoint-they-love-this-strategyamds-plan-to-merge-its-rdna-and-cdna-gpu-architectures-to-a-unified-system-called-udna/
"AMD does not seem to support OMM or SER on its RDNA 2/3/4 GPUs, though Microsoft said that the red company is working with it on the widespread adoption of these technologies."
www.tomshardware.com/pc-components/gpus/microsoft-says-directx-raytracing-1-2-will-deliver-up-to-2-3x-performance-uplift
When AMD releases a new generation, it is lambasted as DOA because it's a generation behind.
The cognitive dissonance here is astounding. Wasn't the performance AMD offers now hailed as revolutionary by Nvidia just last gen?
Does equaling last gen Nvidia RT perf suddenly mean this RT perf is DOA when it was fine for Nvidia users for years?
Besides Nvidia did not even improve their RT perf with Blackwell. We still essentially have Lovelace RT perf that RDNA4 has gotten very close to now.
We'll see if and how much Nvidia improves RT with Rubin. They may have very well run into diminishing returns like some people predicted would happen, years ago.
Time to start pushing PT and 10x framegen, I guess...
Neither of those are relevant.
LMFAO. :roll::roll::roll::roll::roll: It's not hardware holding AMD back there, it's the software. A new arch won't fix that; they need to build an ecosystem on par with CUDA, and that is gonna take YEARS of work.
For instance, if you're a movie studio, you can get the new RTX Pro 6000, test your rendering software on it, and then just push it to your GB200 servers, because the CUDA code will work exactly the same on both systems. But if you get an AMD Pro card, whatever you do with that can't transfer over to an AMD server, because the Pro card uses RDNA, while the server uses a fundamentally different CDNA architecture and AMD can't write code that works for both things equally.
Your argument might have held some water at the beginning, when Nvidia was on its 2nd-gen Ampere RT and AMD was on its first-gen RDNA 2 RT.
Now getting RDNA 4 vs Blackwell does not mean a huge compromise in RT or other features anymore. Sure it's not better in everything, but it's also not a generation behind. Why are you assuming UDNA will only match Lovelace RT when RDNA 4 already got really close to it?
I think it's pretty safe to assume UDNA will exceed Blackwell RT. The wildcard here is Rubin's own RT.
If they improve by enough they will maintain a lead (but not entire generation). If not we may actually see parity.
UDNA will also most likely move from 4nm to 2nm, like Zen. We're talking about generations that will be released in ~Q4 2026/Q1 2027 here.
2nm production is ramping up this year. By the end of next year it should be mature enough for bigger chips; this year it's limited to low-power smartphone SoCs.
The second link is talking about support for new RT systems, which AMD was going to include in their next gen anyway, whether it was RDNA 5 or UDNA.
The point I was making is that in the consumer market (i.e. gaming GPUs), AMD's strategy of being a generation (or 3) behind Nvidia but $50 cheaper has always failed. So if UDNA is the same (matches last-gen Blackwell, fails to match current gen Rubin), AMD will continue to be irrelevant in the discrete GPU market.
The ONLY other thing I can think of where AMD beat Nvidia was with DP 2.1 support on RX 7000. If you have any other examples, please feel free to share them.

UDNA is NOT going to be on 2nm dawg. New nodes are EXPENSIVE and Apple is the VIP customer. 3nm started in 2022 but AMD and Nvidia weren't willing to pay the premium for Blackwell and RDNA 4. Nvidia is using 3nm for Rubin, and AMD is using 3nm for the PS6, which means UDNA and Zen 6 are going to be on 3nm as well. 2nm isn't even in production yet! www.tsmc.com/english/dedicatedFoundry/technology/logic/l_2nm
Also, there's no such thing as 4nm. There's just 5nm and 3nm. I think you're referring to 4N or N4, which are just customized 5nm nodes.
People are impressed with the current implementation (image quality) of FSR4, so it's not a stretch to think they will compete next gen, especially if they have developer and technological support moving forward.
Keep in mind that there are significant differences between mainstream RTX designs and datacenter ones. For example A100 was made on 7nm TSMC and has a different core configuration than mainstream Ampere made on Samsung 8nm. Volta V100 didn't have a mainstream equivalent other than the $3000 Titan V. Hopper like H100 was available only as a datacenter design. For Blackwell there's datacenter Blackwell 1.0 like B200 supporting CUDA Compute Capability 10.0 and mainstream Blackwell 2.0 like 5090 and RTX Pro 6000 supporting CUDA CC 12.0.
There's also differing performance characteristics like FP64, where datacenter designs have significantly higher performance, thus needing CUDA SASS optimizations, and that's taken care of automatically by the CUDA runtime.
The crucial difference between AMD ROCm and NVIDIA CUDA is that the latter officially supports every NVIDIA generation and design, including mainstream ones. AMD officially only supports a few Radeon 7900 models, and that's it for the mainstream. It doesn't even work natively under Windows; one needs to use either WSL or Linux proper. Other cards can and do work, but you get absolutely no guarantees and no official support other than the community.
I'm not even getting into the huge number of optimized CUDA libraries that NVIDIA provides. AMD has a lot of catching up to do in this area as well.
Maybe with UDNA the software stack will be dramatically improved, but I've been hearing this from AMD for years now. They don't even support RDNA4 yet in ROCm.
RX 9070 series comes out and everyone is praising AMD for two things. FSR 4 and significant improvements in RT performance.
Now AMD seems to be rushing to cover the rest of the gap with Nvidia in RT performance and that's good.
Hope UDNA also fixes the performance with programs outside of gaming because there Nvidia is at a totally different level. AMD completely dropped the ball there, where GCN was competitive, the gaming only RDNA was just bad compared to Nvidia.
AMD has hardware scheduling whereas Nvidia has software scheduling. This makes AMD use fewer CPU resources, which matters more on slower/older CPUs.
AMD also supports ReBAR on really old cards, whereas Nvidia did not even bother adding it to their 20 series cards retroactively.
Same with frame generation where Nvidia arbitrarily restricted it to 30 series and multi-framegen that is also arbitrarily restricted to 40 series.
AMD's FSR 4 is very close to DLSS, somewhere between DLSS 3 and 4 in terms of quality. Their framegen already matched Nvidia's with FSR 3.1.
They don't have multi-frame gen and Ray Reconstruction analogues yet, but I'm betting these are coming before UDNA. AMD's software control panel has literally been better than Nvidia's fractured and old system for years.
Only now is Nvidia consolidating everything into one single app, and it's still not complete. AMD has had this for years, and it's miles better than Nvidia's.
Currently the main area where AMD is behind is, ironically, DP 2.1 support. For some unknown reason they did not move from UHBR 13.5 support on RDNA 3 to UHBR20 support on RDNA 4; Blackwell uses UHBR20. And CUDA, naturally, but the same could be said for everyone else who can't compete with it.
Aside from these, to a gamer I think RDNA 4 is close enough. AMD is behind in rendering applications tho, and this is what they need to work on. UDNA is still at least ~1.5 years out, but ok, let's assume it's 3nm. This means node parity with Nvidia and an improvement over 5nm.
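The UHBR gap above is easy to quantify with a back-of-envelope bandwidth check, using nominal 4-lane DP 2.x link rates and 128b/132b encoding efficiency (blanking overhead is ignored here, so real requirements run slightly higher):

```python
# Rough DisplayPort bandwidth check: why UHBR20 matters over UHBR 13.5.
# Nominal per-lane rates, 4 lanes, 128b/132b line coding; no DSC, no blanking.

ENCODING = 128 / 132  # DP 2.x 128b/132b coding efficiency

def payload_gbps(per_lane_gbps: float, lanes: int = 4) -> float:
    """Usable link bandwidth in Gbit/s after line coding."""
    return per_lane_gbps * lanes * ENCODING

def video_gbps(w: int, h: int, hz: int, bits_per_pixel: int) -> float:
    """Raw pixel data rate in Gbit/s (blanking overhead ignored)."""
    return w * h * hz * bits_per_pixel / 1e9

uhbr135 = payload_gbps(13.5)                  # ~52.4 Gbit/s usable
uhbr20 = payload_gbps(20.0)                   # ~77.6 Gbit/s usable
need = video_gbps(3840, 2160, 240, 30)        # 4K 240 Hz, 10-bit RGB ~59.7 Gbit/s

print(need <= uhbr135, need <= uhbr20)  # False True
```

So an uncompressed 4K 240 Hz 10-bit signal overruns UHBR 13.5 but fits within UHBR20, which is the practical difference between RDNA 3/4 and Blackwell outputs here (DSC can bridge the gap, at the cost of compression).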
Zen 6 will be on 2nm. That's pretty much confirmed as Intel's Nova Lake will also use 2nm. Not to mention Zen chiplets are small.
Nvidia has plenty of hardware scheduling. They require it for frame gen support, and added even more hardware scheduling for dual AI and CUDA core workloads with Blackwell.

ReBAR is still in the trial phase, essentially. Only Intel's really incorporated it into the GPUs and requires it; Nvidia manages it game-by-game on the driver level even on the GPUs that support it. And it still barely makes a performance difference in most games. It's going to eventually be relevant, but by then, Turing and older cards will be obsolete and too old to run those games anyway.

Are we doing this, again? This has already been explained to death many times. Your generations are off btw, you mean 40 and 50 series. Frame gen required the massively improved Optical Flow Accelerator in the 40 series; the OFA in the 30 and 20 series did not run fast enough to provide real-time interpolation of video games (but could still interpolate videos). The best they could have gotten is 1 interpolated frame every 4-5 "real" frames, which doesn't provide a meaningful performance improvement and would have just massively increased latency. On the 50 series, Optical Flow runs on the tensor cores, but requires the massively increased low-precision compute of the NEW FP4 hardware support. The 40 series and older just don't have the TOPS to do Optical Flow on the tensor cores or multi-frame gen without FP4 hardware (yet).

That's because FSR 4 runs on tensor cores now instead of on the shaders. And again, comparing same generations, AMD is behind yet again, with just FP8 support in RDNA 4 vs FP4 support in Blackwell, which means no multi-frame gen. This is actually something that UDNA will hopefully address, because AMD has FP4 in their latest CDNA lineup, which shows how much the disconnect between consumer and enterprise architecture has hurt AMD.

And for UDNA, yes they'll have node parity and an improvement, but they had that with RDNA 3 and 4.
If AMD decides to go all-in on 2nm, don’t expect low prices. However, significant performance gains are likely, especially since they haven’t adopted GDDR7 yet. When they do, future revisions will likely be much faster. :pimp: