
The future of RDNA on Desktop.

Now that AMD's RX 9070 series GPUs have been announced and will go on sale this Thursday, I have been reading rumors that this is it for RDNA on desktop. The next revision of RDNA, RDNA 5, will be for mobile and gaming consoles. The next architecture for desktop GPUs will be UDNA. If RDNA 4 is successful beyond all expectations, could AMD leadership be persuaded to do one more iteration of RDNA, call it RDNA 4+, configured with 20 GB of GDDR7? The current boards use GDDR6. Call the new card the 9070 XTX, and do a similar upgrade on the current 9070 vanilla with 20 GB of GDDR7.

As you can tell, UDNA 1 doesn't give me the warm and fuzzies, and it looks to me like AMD has a winner in the 9070 series.

Any thoughts?
 
RDNA 4 is the last RDNA architecture. The next mainstream desktop one is UDNA. They decided so because it costs AMD too much to develop two architectures simultaneously, not to mention that the sporadic ROCm support on (cheaper) gaming cards drives away sales from people who need compute but don't have the need and/or cash for an enterprise card, and have to choose Nvidia for CUDA. As for whether it's good for gaming or not, we'll see. Personally, I'm due for an upgrade anyway, so I'll just get a 9070 XT and call it quits for a good 2-3 gens.
 
Now that AMD's RX 9070 series GPUs have been announced and will go on sale this Thursday, I have been reading rumors that this is it for RDNA on desktop. ... Any thoughts?

We might see higher-VRAM variants, but I don't see AMD spending any time on significant revisions for future RDNA cards past the soon-to-be-released series.

That would just divert resources from UDNA, and that is AMD's number one priority right now. Keep in mind that UDNA will be including all of their gaming, compute, and AI improvements from now on. The name might not sound exciting to gamers, but it should be, given that some of the benefits to compute and AI have already been integrated into RDNA 4.
 
AMD could have easily given it 6144 units and GDDR7 with just a 16% larger die area of 420 mm2.
 
AMD could have easily given it 6144 units and GDDR7 with just a 16% larger die area of 420 mm2.
Nah we're not allowed to have high end cards anymore.
 
AMD could have easily given it 6144 units and GDDR7 with just a 16% larger die area of 420 mm2.
They could have, but a larger chip comes with higher manufacturing costs, which translates into a higher price. People are looking for Nvidia at that price point, so it would have been a futile investment.
 
Now that AMD's RX 9070 series GPUs have been announced and will go on sale this Thursday, I have been reading rumors that this is it for RDNA on desktop. ... Any thoughts?

No, it's not that simple. If Navi 48's memory controller wasn't designed to handle G6X or G7 from the get-go, upgrading the memory won't be a trivial matter. They could at most extend memory capacity by adopting 32 Gbit ICs; 16 GB at 256-bit likely means eight 16 Gbit ICs installed. But higher-density memory chips or clamshell layouts tend to have certain drawbacks, like inferior timings, or worse heat dissipation and power consumption in the case of a clamshell configuration.
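As a rough sketch of the capacity math, assuming the standard layout of one 32-bit-wide GDDR IC per 32-bit channel (the densities and the clamshell option below are illustrative, not a confirmed board design):

```python
# Rough VRAM capacity from bus width and per-IC density.
# Assumes one 32-bit-wide GDDR IC per 32-bit channel,
# with clamshell mode putting two ICs on each channel.

def vram_capacity_gb(bus_width_bits: int, ic_density_gbit: int, clamshell: bool = False) -> float:
    ics = bus_width_bits // 32        # one IC per 32-bit channel
    if clamshell:
        ics *= 2                      # clamshell doubles the IC count, not the bus width
    return ics * ic_density_gbit / 8  # Gbit -> GB

print(vram_capacity_gb(256, 16))                  # 16.0 -> eight 16 Gbit ICs (the 9070 XT layout)
print(vram_capacity_gb(256, 32))                  # 32.0 -> eight 32 Gbit ICs
print(vram_capacity_gb(256, 16, clamshell=True))  # 32.0 -> sixteen 16 Gbit ICs in clamshell
```

Note that an even 20 GB, as asked for in the opening post, doesn't fall out of a 256-bit bus with uniform 16 or 32 Gbit ICs, which is part of why capacity bumps on this bus tend to land at 32 GB rather than 20 GB.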
 
This gen RDNA doesn't need GDDR7 to compete. Straight up.
 
Now that AMD's RX 9070 series GPUs have been announced and will go on sale this Thursday, I have been reading rumors that this is it for RDNA on desktop. ... Any thoughts?

I wouldn't worry about the name. Just think of the 9070 XT as a low-end UDNA chip you're getting early.

As I've said before, it's likely RDNA4 is competing with the next-gen low-end Rubin (128-bit) and *maybe* (if they release a high-clocked 16/32GB model) a cut-down slightly higher-end chip (7680sp on 3nm).

The point is 1080p RT minimums and 1440p RT (sometimes upscaled) averages, and that is likely how the XT/XTX will be divided. Again, I reference Wukong at 1080p and 1440p, or Spider-Man 2 at 1080p or 1440p.

It's just bringing the new market segmentation to you a little early because right now some games will be okay-ish at 1440p, but long-term it'll be a 1080p card. 1440p will be the 18GB market on 3nm.

This might piss some people off long-term, but remember that nVIDIA still thinks that's worth $750-1000. Again, this is why this chip should be cheap and they shouldn't sell it as 1440p/4k *really*...but whatever.

We're in a transitional point between pure raster and standardization of RT so I guess they will.

Beyond that, I wouldn't worry.
AMD could have easily given it 6144 units and GDDR7 with just a 16% larger die area of 420 mm2.
You mean ~12288sp (essentially a 7900 XTX but with RT/FSR improvements and a high clock)? That's next-gen... I think they purposely avoided it this gen given they'd have to sell it for >$1000.
When nVIDIA cut the price of the 4080 (segment), I think they decided it wasn't worth pursuing on 4nm. Next-gen AMD/nVIDIA are going to fight that battle HARD (~4090+ performance), as lots of people want that.

If you mean 6144sp, like 5070, that too (approx configuration; 3x 1920sp or 2048sp chiplets) is probably next-gen. Same with nVIDIA. It will likely replace 9070 xt almost directly in performance on 128-bit/GDDR7.
 
I mean a 9090 for €900 with 50% more units (96×64), 192 ROPs, and 24 Gbps memory. It doesn't have to be 384-bit: 30% faster at almost no extra cost or die-size increase, to crush the 5080. But I get it. As it is now, the chip has a 2:1 aspect ratio, which is also very nice.
 
I mean a 9090 for €900 with 50% more units (96×64), 192 ROPs, and 24 Gbps memory. It doesn't have to be 384-bit: 30% faster at almost no extra cost or die-size increase, to crush the 5080. But I get it. As it is now, the chip has a 2:1 aspect ratio, which is also very nice.
With AMD it would be six clusters of 1024 (2048 by the way I count things); yes, 192 ROPs/384-bit with GDDR6, but it will be 96 or 128 ROPs/256-bit with GDDR7.

I agree they should have made it, monolithic or otherwise, giving us a more reasonably priced 4090, but they didn't.

A thing people need to understand is that at its max clock potential, N48 could use 425 W. With 32 GB, ~475 W.
Yes, a chip with 50% more units might need only ~25% more power for the chip, plus 25-75 W for the RAM (if 24/48 GB), but that's still an expensive, niche product using ~600 W.
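Purely restating the rough figures above as arithmetic (the 425 W baseline, the +25% chip scaling, and the 25-75 W RAM adder are this post's estimates, not measured numbers):

```python
# Back-of-the-envelope board power for the hypothetical scaled-up N48 described above.
n48_max_w   = 425        # claimed max-clock potential of N48 with 16 GB
chip_scale  = 1.25       # assumed: +50% more units for only ~25% more chip power
ram_adder_w = (25, 75)   # assumed extra draw for a 24/48 GB memory configuration

low  = n48_max_w * chip_scale + ram_adder_w[0]
high = n48_max_w * chip_scale + ram_adder_w[1]
print(f"~{low:.0f}-{high:.0f} W")   # ~556-606 W, i.e. the ~600 W class mentioned above
```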
 
... I'll just get a 9070 XT and call it quits for a good 2-3 gens.

For 2-3 gens?!! Given how badly optimized most modern games are, I doubt it will run games at more than 1080p Medium in 4-5 years :twitch:
 
For 2-3 gens?!! Given how badly optimized most modern games are, I doubt it will run games at more than 1080p Medium in 4-5 years :twitch:

This guy gets it. :laugh:

I think it'll run 1080p fine for a while, until 18GB becomes a standard (just like 16GB is becoming over 12GB now), but I think 1440p is going to require some concessions (especially if you use features like FG, etc).
It's all relative. Some people (like me) will say 1080p, others will say they can make it work fine for 1440p (until more games are built towards 192-bit/18GB configs and/or built toward the PS6).
I'm not going to argue, because as I've said countless times...you can make ANYTHING work and different things are acceptable to different people.
I try to be cautious when explaining things (to the best of my ability/understanding at any given moment), which is why I trend towards worst-case and use minimum 60 fps frame rates at high settings.
 
This gen RDNA doesn't need GDDR7 to compete. Straight up.
The large chunk of cache on-die mitigates the need for faster VRAM. My 6700 XT can act like it has around 850 GB/s of memory bandwidth because of its 96 MB of on-die cache. Actual memory bandwidth is 384.0 GB/s.
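One simple way to see where a figure like ~850 GB/s can come from is a toy miss-rate model: if a given fraction of accesses hit the on-die cache, only the misses touch GDDR6, so the DRAM effectively services that many more requests. The ~55% hit rate below is an assumption chosen to reproduce the quoted number, not a published spec:

```python
# Toy "effective bandwidth" model: only cache misses consume DRAM bandwidth.
def effective_bandwidth_gbs(dram_gbs: float, hit_rate: float) -> float:
    return dram_gbs / (1.0 - hit_rate)

# 6700 XT: 384 GB/s of GDDR6, with an assumed ~55% Infinity Cache hit rate.
print(round(effective_bandwidth_gbs(384.0, 0.55)))   # ~853, in the ballpark of the ~850 GB/s above
```

Real hit rates fall as resolution rises, so the effective figure is very workload-dependent.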

 
The large chunk of cache on-die mitigates the need for faster VRAM. My 6700 XT can act like it has around 850 GB/s of memory bandwidth because of its 96 MB of on-die cache. Actual memory bandwidth is 384.0 GB/s.


The 9070 XT is getting a cache size downgrade to 64 MB, but it's faster than last gen.

Very close to the cache size of the 5070 Ti, only the 5070 Ti has faster memory. If the 9070 XT ends up being around 5070 Ti level performance, that's a pretty big win for AMD, as it'd mean they are doing more with less. Using GDDR6 brings the BOM of the Radeon cards down and ensures better availability (assuming GDDR7 could be a production bottleneck).

Hard to say whether AMD is truly happy with GDDR6 or just decided to focus all engineering resources on next gen.
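For a concrete sense of the raw-bandwidth gap being discussed: bandwidth is just the per-pin data rate times the bus width, and both cards use a 256-bit bus, 20 Gbps GDDR6 on the 9070 XT versus 28 Gbps GDDR7 on the 5070 Ti (cache behaviour, as discussed above, then narrows the practical gap):

```python
# Raw memory bandwidth: per-pin data rate (Gbps) x bus width (bits) / 8 bits per byte.
def bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

print(bandwidth_gbs(20, 256))   # 640.0 GB/s -- 9070 XT, 20 Gbps GDDR6
print(bandwidth_gbs(28, 256))   # 896.0 GB/s -- 5070 Ti, 28 Gbps GDDR7
```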
 
Looking forward to the next AMD GPU, which will use UDNA. With the current RDNA 4's much-improved ray tracing capability, things will get better.
 
... Hard to say whether AMD is truly happy with GDDR6 or just decided to focus all engineering resources on next gen.
I'd say the latter, not the former.
 
The large chunk of cache on-die mitigates the need for faster VRAM.
I think this will return. If you do the math, AMD really needs to match nVIDIA's cache structure next-gen. If this means they double L3 (again), transition to similar L2 like nVIDIA, or something else...I don't know.
As I've said, with RDNA3 this dropped to roughly half of nVIDIA. With RDNA 4, perhaps 2/3.

All that matters is that the compute is fed, and I think it will be. As I said, I think 20 Gbps will be good for ~3150 MHz on the core, and a typical memory overclock to 2700 MHz (21.6 Gbps) for ~3400 MHz. That is likely the limit of 375 W.
So, it makes sense. I like it.
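For anyone following the clock/data-rate shorthand in this thread: GDDR6 moves 8 bits per pin per cycle of the memory clock that monitoring tools report, which is how a 2700 MHz memory overclock becomes 21.6 Gbps. This is just the unit conversion, not a claim about what clocks are actually attainable:

```python
# GDDR6 effective data rate from the reported memory clock (x8 for GDDR6 signalling).
def gddr6_gbps(mem_clock_mhz: float) -> float:
    return mem_clock_mhz * 8 / 1000

print(gddr6_gbps(2500))   # 20.0 Gbps -- stock 9070 XT memory
print(gddr6_gbps(2700))   # 21.6 Gbps -- the overclock referenced above
```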
The 9070 XT is getting a cache size downgrade to 64 MB, but it's faster than last gen. ...

You know what I think is funny? Faster L3. You know why the L3 is faster? Because it runs at core speed. You know what's faster in N48? The core. LOL.
Also, you may need to understand the difference between L2/L3. L2 is literally twice as fast.

I do commend them for what they pulled off with GDDR6. I think they made a very economical and common-sense chip, and if it can keep 1080p RT 60 fps minimums, that's cool. The 5070 surely can't.
I also hope whatever they can pull off at the high end makes people at 1440p happy enough for a while... although it might be a little bit of a stretch IMHO for my tastes. It'll probably be 'ok', but a novelty.
At least they don't sell it the way nVIDIA does the 5080 16GB, which is quickly going to become a very humorous joke when people realize how much difference 2 GB can and does make to minimum frame rates at 1440p.
The 5080 is the most expensive 1080p card I have ever seen. Seriously. No joke. Now imagine if a card was 9216sp @ 3780 MHz with 18 GB of RAM, not 10752sp @ 2640 MHz and 16 GB. Exactly. 60 fps. Heh.
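To put rough numbers on that comparison: peak FP32 throughput for an Nvidia-style shader count is shaders × 2 FLOPs per clock × clock. The 9216sp @ 3780 MHz part is this post's hypothetical, and 10752sp @ ~2640 MHz is the 5080-style configuration it is being contrasted with:

```python
# Peak FP32 throughput in TFLOPS: shaders x 2 FLOPs per clock (FMA) x clock in GHz.
def fp32_tflops(shaders: int, clock_ghz: float) -> float:
    return shaders * 2 * clock_ghz / 1000

print(round(fp32_tflops(10752, 2.64), 1))  # ~56.8 TF -- 5080-style config at ~2640 MHz
print(round(fp32_tflops(9216, 3.78), 1))   # ~69.7 TF -- the hypothetical 9216sp @ 3780 MHz part
```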
 
At least they don't sell it the way nVIDIA does the 5080 16GB, which is quickly going to become a very humorous joke when people realize how much difference 2 GB can and does make to minimum frame rates at 1440p.
The 5080 is the most expensive 1080p card I have ever seen. Seriously. No joke. Now imagine if a card was 9216sp @ 3780 MHz with 18 GB of RAM, not 10752sp @ 2640 MHz and 16 GB. Exactly. 60 fps. Heh.

Regardless of vendor, 16 GB is a perfectly adequate amount of video memory for any GPU designed with 1440p and even 4K in mind, even when using demanding settings including ray tracing and the often-optional ultra-high-quality textures. Neither the 5080 nor the 9070 XT will suffer from frame buffer capacity-related issues before their computing power is exhausted on almost any game today and for some time to come, a few niches notwithstanding. Those who really wanna go there have the 7900 XTX, 4090 and 5090 available for them.
 
Regardless of vendor, 16 GB is a perfectly adequate amount of video memory for any GPU designed with 1440p and even 4K in mind, even when using demanding settings including ray tracing and the often-optional ultra-high-quality textures. Neither the 5080 nor the 9070 XT will suffer from frame buffer capacity-related issues before their computing power is exhausted on almost any game today and for some time to come, a few niches notwithstanding. Those who really wanna go there have the 7900 XTX, 4090 and 5090 available for them.
It really isn't. I know you think it is. It isn't. Look at this. Look at the scaling. I know you will say you don't play benchmarks. But you need to understand that games are built upon standards like benchmarks.

What is the aim for this segment? 120. This is shown in the games I linked, by being roughly half. What can this card not reach even at absurd clocks? 120. Why? RAM. Add 2 GB of RAM and figure it out.

The Founders Edition runs at a constant 2640 MHz... so that'll help you figure it out.

Like I say, a 9216sp part (@ ~3780 MHz) with 18 GB would score 120 in that bench and, by extension, keep 60 fps in games.

Want to make a bet on this? I will bet you one heart eyes emoji I am 100% correct.
 
I think 20 Gbps will be good for ~3150 MHz on the core, and a typical memory overclock to 2700 MHz (21.6 Gbps) for ~3400 MHz.
You've puzzled me. How does it even work? Game engines differ, and various resolutions need different, non-linearly changing amounts of VRAM bandwidth and on-die computing power, plus God knows what else; all this makes 20 Gbps massive overkill in some scenarios and not even half enough in others. I do agree it's not completely tragic to only have G6@20 with N48 in mind, but there will definitely be scenarios where it's anemic. The 9070 non-XT feels a lot more balanced (yet a lot worse per $).

I assume using G7 is way too expensive even if we're talking chips that can't even reasonably achieve 28 Gbps, and this is why we don't see a 9070 XT with, say, G7@26 coming.

Unfortunately, no 160+ CU beast to witness. I'm sick of the top-tier GPU only being green and of AMD never even trying.
 
It really isn't. I know you think it is. It isn't. Look at this. Look at the scaling. I know you will say you don't play benchmarks. But you need to understand that games are built upon standards like benchmarks.

What is the aim for this segment? 120. This is shown in the games I linked, by being roughly half. What can this card not reach even at absurd clocks? 120. Why? RAM. Add 2 GB of RAM and figure it out.

The reason the RTX 4090 outperforms the 5080 is because its core is (sometimes significantly) more powerful, not because the 5080 is memory capacity starved. To run into the limitations of 16 GB, you currently have to go all-out, with the most extreme scenarios (and it would still fit into memory by a hair) - not to mention W1zz tested this on the 5090, where this would be about 49.5% of its capacity. On a 16 GB card it would preallocate less, and use a bit less as a result.

[Attachment: vram.png]
 
You've puzzled me. How does it even work? Game engines differ, and various resolutions need different, non-linearly changing amounts of VRAM bandwidth and on-die computing power, plus God knows what else; all this makes 20 Gbps massive overkill in some scenarios and not even half enough in others. I do agree it's not completely tragic to only have G6@20 with N48 in mind, but there will definitely be scenarios where it's anemic. The 9070 non-XT feels a lot more balanced (yet a lot worse per $).

I assume using G7 is way too expensive even if we're talking chips that can't even reasonably achieve 28 Gbps, and this is why we don't see a 9070 XT with, say, G7@26 coming.

Unfortunately, no 160+ CU beast to witness. I'm sick of the top-tier GPU only being green and of AMD never even trying.

It's about saturation points, limitations in one respect or another. I don't get what you don't understand.

GDDR7 truly isn't that expensive, but the reality is it's just not needed yet. It will be next gen. Right now it is largely marketed as an advantage more than it actually is one.

As shown with N48, there are ways to optimize designs (towards current gaming trends) without it, as I explained. This comes with matching compute ability, cache, and memory bandwidth accordingly.

Buffer size is important to an extent because of current trends, in relation to corresponding compute capability and standardized resolutions/settings (especially as new ones emerge, such as upscaling/RT/FG).
It changes, and is fluid, but there are general guidelines that correspond to products. I have outlined this before. It is why a 12 GB nVIDIA card does not hit >45 TF, and why the 7800 XT was limited this way, etc.
It is also why a product like the 9070 XT sits just over 45 TF, with the 9070 vanilla likely limited in this regard too. Above that, a similar market exists until a card is generally limited by its buffer, as in the case of the 5080.

This is why you could literally see a 3500 MHz N48 competing with the 5080, even though the 5080 is capable of much higher compute throughput.
The 5080 can run at a faster clock, but generally playable settings (i.e. 60 fps minimums) are unattainable due to the inadequate buffer. Why do you think nVIDIA clocked it at 2640 MHz, and not closer to 3154 MHz (like AMD)?

The amusing part is that for the 5080 to make sense, it needs 24 GB of memory (really 18-20 GB, but 24 GB is the only option) at 3.23 GHz, which is the *exact* top of the 5nm dense voltage curve (seen on A15/M1).
If nVIDIA productizes this I will be amazed, because that is what they will sell next-gen and claim it is faster than a 5080...100% without doubt. They *could* do this, but likely won't unless they HAVE TO.
This is how you make safeguards within your product stack/chip designs, but don't push things forward too quickly if you don't have to (and hence save money on each inch of progress).

You will also notice 3 out of 10 5080 16GB SKUs reviewed on this website are capable of that for any prolonged period of time. This is by design.

The reason the RTX 4090 outperforms the 5080 is because its core is (sometimes significantly) more powerful, not because the 5080 is memory capacity starved. To run into the limitations of 16 GB, you currently have to go all-out, with the most extreme scenarios (and it would still fit into memory by a hair) - not to mention W1zz tested this on the 5090, where this would be about 49.5% of its capacity. On a 16 GB card it would preallocate less, and use a bit less as a result.

We'll see how it bears out. 4090 outperforms it for these reasons, yes, but minimums (and often benchmarks) are largely correlated to limitations.

I'm just sayin' a 9216sp part, with a 3780 MHz core and 36 Gbps RAM, is a much more rational pairing of resources. Again, we shall see!

As mentioned about a trillion times, I do not agree with how W1zzard perceives memory usage, nor with how he tests its limitations. There are many variables in this beyond pure benchmarks, such as swapping.

What happens when a game is targeting 16 GB, as you say, and you turn on FG? nVIDIA doesn't want you to know, and hence hides the native frame rate when you do this. This is what I'm talking about; many variables.
 