
AMD Radeon "Navi 3x" Could See 50% Increase in Shaders, Double the Cache Memory

btarunr

AMD's next-generation Radeon "Navi 3x" line of GPUs could see a 50% increase in shaders and a doubling of Infinity Cache size, according to some educated guesswork and intelligence from Greymon55, a reliable source of GPU leaks. The Navi 31, Navi 32, and Navi 33 chips are expected to debut the new RDNA3 graphics architecture, and succeed the 6 nm optical shrinks of existing Navi 2x chips that AMD is rumored to be working on.

The top Navi 31 part allegedly features 60 workgroup processors (WGPs), or 120 compute units. Assuming an RDNA3 CU still holds 64 stream processors, you're looking at 7,680 stream processors, a 50% increase over Navi 21. The Navi 32 silicon features 40 WGPs, and exactly the same number of shaders as the current Navi 21, at 5,120. The smallest of the three, the Navi 33, packs 16 WGPs, or 2,048 shaders. There is a generational doubling in cache memory, with 256 MB on the Navi 31, 192 MB on the Navi 32, and 64 MB on the Navi 33. Interestingly, the memory sizes and bus widths are unchanged, but AMD could leverage faster GDDR6 memory types. 2022 will see the likes of Samsung ship GDDR6 chips with data-rates as high as 24 Gbps.
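
The shader arithmetic above can be checked with a short sketch. It assumes, as the article does, that RDNA3 keeps the RDNA2 layout of 2 CUs per WGP and 64 stream processors per CU; neither figure is confirmed, and the constant and function names are made up for illustration.

```python
# Shader counts implied by the rumored WGP figures.
# Assumptions (unconfirmed for RDNA3): 2 CUs per WGP, 64 stream processors per CU.
CUS_PER_WGP = 2
SP_PER_CU = 64

RUMORED_WGPS = {"Navi 31": 60, "Navi 32": 40, "Navi 33": 16}
NAVI21_SHADERS = 5120  # current Navi 21, as stated in the article


def shaders(wgps):
    """Stream processors for a given WGP count under the assumptions above."""
    return wgps * CUS_PER_WGP * SP_PER_CU


for chip, wgps in RUMORED_WGPS.items():
    count = shaders(wgps)
    change = (count / NAVI21_SHADERS - 1) * 100
    print(f"{chip}: {wgps} WGPs -> {count} shaders ({change:+.0f}% vs Navi 21)")
```

If the CU layout changes in RDNA3, only the two constants at the top need adjusting.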



 
And triple the price.
 
When these hit the market, this may be my next GPU upgrade, unless Intel comes up with something first.
Love the speculation about supply & prices into the future, so entertaining! :laugh:
 
When? :D

7680 shaders, or double that, up to 15360 shaders?


AMD Next-Gen RDNA 3 & RDNA 4 GPU Rumors: Over 50% Performance Increase, Increased Radeon RX 7000 Pricing & Launch In 2H 2022 (wccftech.com)
 
It only matters if people can buy it, otherwise they may as well just stop calling it Navi and start calling it CMP.
 

We still have to see which case it will be: 256 MB per die (512 MB total) or 128 MB per die (256 MB total). The fact that it's now MCM seems to send rumors all over the place.

I hope it's 256 MB per die, or at least 512 MB total in case the cache sits in a third I/O or connector die.


There are also no rumors about architectural changes, just a compute unit count. AMD uses its regular units to do raytracing, whereas Nvidia has dedicated units. We will see how RDNA3 performs in raytracing workloads. The increased performance (and maybe the cache) will certainly help, but will it be enough to compete with Ampere in those workloads? We will see.

As for the performance increase, it could come from:

- 50% more cores
- Increased frequency due to better nodes (3 GHz?)
- More Infinity Cache (and faster too, if 3 GHz is reachable)
- Higher memory bandwidth
- Increased IPC

Is 2.5x possible? I do not know, but we will need all of those boxes checked to reach the 2.5x people are speculating about.
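
A rough way to see how much each item on that list would have to deliver is to treat the gains as multiplicative. This is only a sketch: it assumes perfect scaling for every factor, which real GPUs rarely achieve, and the 2.25 GHz baseline clock is an assumption made for illustration, not a figure from this thread.

```python
# How much must the remaining factors contribute to reach a target speedup?
# Assumption: gains multiply and every factor scales perfectly.


def remaining_factor(target, *known_factors):
    """Speedup still needed after applying the known multiplicative factors."""
    product = 1.0
    for factor in known_factors:
        product *= factor
    return target / product


TARGET = 2.5        # overall uplift people are speculating about
CORE_SCALING = 1.5  # 50% more cores
# Hypothetical clock uplift: 3.0 GHz against an assumed 2.25 GHz baseline.
CLOCK_SCALING = 3.0 / 2.25

left_after_cores = remaining_factor(TARGET, CORE_SCALING)
left_after_clock = remaining_factor(TARGET, CORE_SCALING, CLOCK_SCALING)
print(f"After core count: {left_after_cores:.2f}x still needed")
print(f"After core count and clocks: {left_after_clock:.2f}x still needed")
```

Under these assumptions, cache, memory bandwidth, and IPC together would still have to supply whatever the last line reports.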
 

The speculations come from similar articles:


AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores (wccftech.com)

The details are not that interesting. It is obvious that they have found a way to improve performance through significant architectural changes.

The question is when the market will see it. If it is 2023, Navi 21 will already be more than two years old!

Also, they are not at all optimistic about the shortages and scalper pricing.
 


256 MB on the IOD, none on the GCDs.


Hopper isn't even a consumer graphics part...
 
And optimized for mining

aka the legal way to get your own money printing press
 

2.2x? Ah, yes, for the unobtainium full die that never makes it into the GeForce stack. Gotcha. Where did that Volta thing go, anyway? :P

Meanwhile, the realistic gen-to-gen per-tier perf increase is and has always been 20-50%. 50% being the absolute jaw droppers, like Pascal.
 

Well, architectural improvements, more shaders, a die shrink, up the wattage to 450-500 watts, and call it a day.
 

That graph from WCCFtech is fanciful thinking. At best there will be a 50% increase in performance. There is no way they're going to get a 100%+ increase in performance over the previous generation. You can't increase wattage enough to get that kind of generational uplift.

Even with the 5870 (which was a doubling of the 4870) you didn't see a 100% increase over the previous generation. You saw (at best) a 50% increase in performance, but that was at 2560x1600, which meant a $1,000 USD monitor that very few had. A 40% increase at 1080p was the reality of that card. Even within the same generation, halving the GPU never produces half the performance, because there's always an issue with feeding the GPU enough data.
perfrel.gif
 
Firstly, the node shrink and architectural changes can account for some of that 50% (possibly, not proven).
Secondly, they have already shown that Infinity Cache works; they just need to improve or enlarge it. :)
 
Would it make sense to have cache only on the IOD? The impact depends on what packaging they use, but the IOD is always a hop away.
 
At this point, I believe buyers would be happy with a meager performance improvement, but actual availability and reasonable prices (e.g. $200 for FHD gaming) instead.
 

Historical trends say that it's not going to happen.

The 6900 XT is about 40% faster than the 5700 XT. 25% of that is from just the difference in clock speeds. There is no precedent for a 100%+ increase in performance (if that's what you're arguing for). If there was some simple way to do it, both nVidia and AMD (ATI) would have done it a long time ago.

Remember that even a 30% increase over a generation is an impressive feat these days. So, I do think a 50% increase is a realistic possibility.

With Infinity Cache (IFC), they're increasing the size because they're putting two dies together. From the speculation I've read, each die has two 64-bit memory controllers on it, and they use the IFC (128 MB per die) as the "crossbar" between the two. They already have a 58% hit rate according to AMD. I know CPU architectures are different, but even Tim Keller said there are diminishing returns for increasing hit rate in caches, as you have to increase the transistor count exponentially to get an ever smaller increase in hit rate. I'm sure the same applies to GPU caches, otherwise they would have done it by now. Likewise, in RDNA2, the IFC is there to reduce the hits to memory. So not only are you looking to increase the hit rate and save memory transfers, you're also looking to have a good hit rate so the two dies don't have to hit memory that often either. Maybe that's what they plan on using the other 40% for, the part that's not being as efficient.
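
To put the 58% figure in perspective, here is a minimal sketch of what a given hit rate means for DRAM traffic. It assumes a deliberately simple model in which every miss costs exactly one DRAM access and hits cost none; the function names are hypothetical, and the hit rates other than 0.58 are illustrative values, not rumors.

```python
# Fraction of memory requests that still reach DRAM for a given cache hit rate.
# Assumption: every miss costs one DRAM access, hits cost none.


def dram_traffic_fraction(hit_rate):
    """Share of requests that miss the cache and go to DRAM."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return 1.0 - hit_rate


def bandwidth_amplification(hit_rate):
    """Effective bandwidth relative to raw DRAM bandwidth under this model."""
    return 1.0 / dram_traffic_fraction(hit_rate)


# 0.58 is the hit rate quoted in the post; the other values are hypothetical.
for hit_rate in (0.58, 0.70, 0.80):
    print(
        f"hit rate {hit_rate:.0%}: "
        f"{dram_traffic_fraction(hit_rate):.0%} of requests reach DRAM, "
        f"amplification {bandwidth_amplification(hit_rate):.2f}x"
    )
```

The remaining share at a 58% hit rate is the "other 40%" mentioned above, and each further point of hit rate removes traffic only from that shrinking remainder.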
 
Tim Keller?!

There's a whole lineup planned, I expect the top end with more than one chip could double performance alone without the other upgrades.

It depends on many things, really. There will be two to four SKUs, I hope. I'm more optimistic than you, but it's far from proven. I wouldn't debate my stance too hard; it's rumours.
 

From what I've seen in die shots of Navi 22 and Navi 23, it looks like the cache is tied to the memory controller/memory bus, and they had the option to put less or more cache per "lane".

So from that point of view, it would make sense.

But that would mean the Infinity Fabric link between the I/O die and the chiplets is huge. Right now, on die, AMD states that it's 16 x 64b for Navi 21. That would probably mean at least 12 x 2 x 64b for Navi 31. Not undoable, but I wonder how expensive it will be to make with an interposer.

I think at 2 GHz, infinity cache Bandwidth is around 1.9 TB/s.
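
That figure can be sanity-checked as channels times width times clock. The sketch below assumes "64b" means 64 bytes per clock per channel, which the post does not spell out, and it applies the same 2 GHz clock to the guessed 12 x 2 x 64b link purely for comparison.

```python
# Peak cache bandwidth = channels * bytes per clock per channel * clock.
# Assumption: "64b" in the post means 64 bytes per clock per channel.


def cache_bandwidth_tb_s(channels, bytes_per_clock, clock_ghz):
    """Peak bandwidth in decimal terabytes per second."""
    bytes_per_second = channels * bytes_per_clock * clock_ghz * 1e9
    return bytes_per_second / 1e12


navi21 = cache_bandwidth_tb_s(16, 64, 2.0)
# Hypothetical Navi 31 link from the post: 12 x 2 x 64b at the same assumed 2 GHz.
navi31_guess = cache_bandwidth_tb_s(12 * 2, 64, 2.0)
print(f"16 x 64 at 2 GHz: {navi21:.2f} TB/s")
print(f"12 x 2 x 64 at 2 GHz: {navi31_guess:.2f} TB/s")
```

With these inputs the Navi 21 case comes out at roughly 2 TB/s, in the same ballpark as the estimate above.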
 

Plus, I think they're expecting a 40%+ efficiency increase? I'm very skeptical of the more than 50% increase, but it will be amazing if true. We wouldn't have seen improvements like that since the late 90s and 3Dfx SLI setups.
 
Hopefully they won't follow the nGredia recipe:
more expensive, lower yields, sold to the black market first, etc...
 
Even with the 5870 (which was a doubling of the 4870) you didn't see a 100% increase over the previous generation. You saw (at best) a 50% increase in performance, but that was at 2560x1600, which meant a $1,000 USD monitor that very few had. A 40% increase at 1080p was the reality of that card.

That's a 58.7% increase...
 