Sunday, December 19th 2021

AMD Radeon "Navi 3x" Could See 50% Increase in Shaders, Double the Cache Memory

AMD's next-generation Radeon "Navi 3x" line of GPUs could see a 50% increase in shaders and a doubling of Infinity Cache memory size, according to some educated guesswork and intelligence by Greymon55, a reliable source of GPU leaks. The Navi 31, Navi 32, and Navi 33 chips are expected to debut the new RDNA3 graphics architecture, and succeed the 6 nm optical shrinks of existing Navi 2x chips that AMD is rumored to be working on.

The top Navi 31 part allegedly features 60 workgroup processors (WGPs), or 120 compute units. Assuming an RDNA3 CU still holds 64 stream processors, you're looking at 7,680 stream processors, a 50% increase over Navi 21. The Navi 32 silicon features 40 WGPs, and exactly the same number of shaders as the current Navi 21, at 5,120. The smallest of the three, the Navi 33, packs 16 WGPs, or 2,048 shaders. There is a generational doubling in cache memory, with 256 MB on the Navi 31, 192 MB on the Navi 32, and 64 MB on the Navi 33. Interestingly, the memory sizes and bus widths are unchanged, but AMD could leverage faster GDDR6 memory types. 2022 will see the likes of Samsung ship GDDR6 chips with data-rates as high as 24 Gbps.
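The shader arithmetic above can be reproduced with a short sketch. It assumes, as the article does, two compute units per WGP and 64 stream processors per CU; the constant and function names are illustrative only, and the WGP counts are the rumored ones.

```python
# Rumored WGP counts from the article; the per-unit ratios are assumptions.
CUS_PER_WGP = 2        # each workgroup processor pairs two compute units
SPS_PER_CU = 64        # assumed unchanged for RDNA3, as the article does

RUMORED_WGPS = {"Navi 31": 60, "Navi 32": 40, "Navi 33": 16}
NAVI21_SHADERS = 5120  # current Navi 21 figure quoted in the article


def shader_count(wgps: int) -> int:
    """Stream processors implied by a WGP count under the assumptions above."""
    return wgps * CUS_PER_WGP * SPS_PER_CU


for chip, wgps in RUMORED_WGPS.items():
    shaders = shader_count(wgps)
    change = (shaders / NAVI21_SHADERS - 1) * 100
    print(f"{chip}: {wgps} WGPs -> {shaders} shaders ({change:+.0f}% vs Navi 21)")
```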
Source: Greymon55 (Twitter)

44 Comments on AMD Radeon "Navi 3x" Could See 50% Increase in Shaders, Double the Cache Memory

#1
davideneco
It's literally the same specifications as last time; he just made another tweet for likes.

Websites are desperate.
#2
Chomiq
And triple the price.
#3
Pumper
Chomiq said: And triple the price.
With 50% of current supply.
#4
AlwaysHope
When these hit the market, this may be my next GPU upgrade, unless Intel comes up with something before then.
Love the speculation about supply & prices into the future, so entertaining! :laugh:
#6
Jeager
Chomiq said: And triple the price.
Should only be x2.5, x0.5 for the 50% shader increase and x2 for the double cache :D
#7
Chrispy_
It only matters if people can buy it, otherwise they may as well just stop calling it Navi and start calling it CMP.
#8
ARF
Jeager said: Should only be x2.5, x0.5 for the 50% shader increase and x2 for the double cache :D
AMD doesn't want to bankrupt its GPU division, does it? :D
#9
Punkenjoy
ARF said: When? :D

7680 shaders, or double that, up to 15360 shaders?


AMD Next-Gen RDNA 3 & RDNA 4 GPU Rumors: Over 50% Performance Increase, Increased Radeon RX 7000 Pricing & Launch In 2H 2022 (wccftech.com)
We still have to see which case it will be: 256 MB per die (512 MB total) or 128 MB per die (256 MB total)? The fact that it's now MCM seems to send rumors all over the place.

I hope it's 256 MB per die, or at least 512 MB total if the cache ends up in a third I/O or connector die.


There are also no rumors on architectural changes, just a compute unit count. AMD uses its regular units to do raytracing, whereas Nvidia has dedicated units. We will see how RDNA3 performs in raytracing workloads. The increased performance (and maybe cache) will indeed help, but will it be enough to compete with Ampere in those workloads? We will see.

As for the performance increase, it could come from:

- 50% more cores
- Increased frequency due to better nodes (3GHz?)
- More Infinity Cache (and faster too, if 3GHz is reachable)
- Higher memory bandwidth
- Increased IPC

Is 2.5x possible? I do not know, but all of those boxes would need to be checked to achieve the 2.5x people are speculating about.
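As a rough sanity check on the list above, the sketch below multiplies the factors that have numbers attached, assuming they scale perfectly and independently (real GPUs do not). The 2.25 GHz baseline clock and all names in the code are illustrative assumptions, not figures from the post.

```python
import math

TARGET_SPEEDUP = 2.5  # the figure people are speculating about

# Hypothetical multipliers for the factors that have numbers attached.
known_factors = {
    "core count": 1.5,          # 50% more cores
    "clock speed": 3.0 / 2.25,  # rumored 3 GHz over an assumed 2.25 GHz baseline
}


def combined_speedup(multipliers):
    """Upper bound on speedup if every factor scaled perfectly and independently."""
    return math.prod(multipliers)


def remaining_factor(target, multipliers):
    """Extra speedup the unquantified factors (cache, bandwidth, IPC) must supply."""
    return target / combined_speedup(multipliers)


known = combined_speedup(known_factors.values())
needed = remaining_factor(TARGET_SPEEDUP, known_factors.values())
print(f"Cores and clocks alone: {known:.2f}x")
print(f"Still needed from cache, bandwidth and IPC: {needed:.2f}x")
```

Under these assumptions, cores and clocks give about 2.0x in the ideal case, leaving roughly 1.25x to come from cache, memory bandwidth and IPC combined.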
#10
ARF
Punkenjoy said: We still have to see which case it will be: 256 MB per die (512 MB total) or 128 MB per die (256 MB total)? The fact that it's now MCM seems to send rumors all over the place.

I hope it's 256 MB per die, or at least 512 MB total if the cache ends up in a third I/O or connector die.


There are also no rumors on architectural changes, just a compute unit count. AMD uses its regular units to do raytracing, whereas Nvidia has dedicated units. We will see how RDNA3 performs in raytracing workloads. The increased performance (and maybe cache) will indeed help, but will it be enough to compete with Ampere in those workloads? We will see.

As for the performance increase, it could come from:

- 50% more cores
- Increased frequency due to better nodes (3GHz?)
- More Infinity Cache (and faster too, if 3GHz is reachable)
- Higher memory bandwidth
- Increased IPC

Is 2.5x possible? I do not know, but all of those boxes would need to be checked to achieve the 2.5x people are speculating about.
The speculations come from similar articles:


AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores (wccftech.com)

The details are not that interesting. It is obvious that they have found a way with significant architectural improvements to make the performance better.

The question is when the market will see it. Because if it is 2023, then Navi 21 would be more than two years old already!

Also, they are not optimistic at all about the shortages and the scalper pricing.
#11
Oberon
Punkenjoy said: We still have to see which case it will be: 256 MB per die (512 MB total) or 128 MB per die (256 MB total)? The fact that it's now MCM seems to send rumors all over the place.

I hope it's 256 MB per die, or at least 512 MB total if the cache ends up in a third I/O or connector die.
256 MB on the IOD, none on the GCDs.
ARF said: The speculations come from similar articles:


AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores (wccftech.com)

The details are not that interesting. It is obvious that they have found a way with significant architectural improvements to make the performance better.

The question is when the market will see it. Because if it is 2023, then Navi 21 would be more than two years old already!

Also, they are not optimistic at all about the shortages and the scalper pricing.
Hopper isn't even a consumer graphics part...
#12
mechtech
And optimized for mining

aka the legal way to get your own money printing press
#13
Vayra86
ARF said: The speculations come from similar articles:


AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores (wccftech.com)

The details are not that interesting. It is obvious that they have found a way with significant architectural improvements to make the performance better.

The question is when the market will see it. Because if it is 2023, then Navi 21 would be more than two years old already!

Also, they are not optimistic at all about the shortages and the scalper pricing.
x2.2? Ah yes, of the unobtainium full die that never gets into the GeForce stack. Gotcha. Where'd that Volta thing go anyway? :P

Meanwhile, the realistic gen-to-gen per-tier perf increase is and has always been 20-50%. 50% being the absolute jaw droppers, like Pascal.
#14
ARF
Vayra86 said: x2.2? Ah yes, of the unobtainium full die that never gets into the GeForce stack. Gotcha. Where'd that Volta thing go anyway? :p

Meanwhile, the realistic gen-to-gen per-tier perf increase is and has always been 20-50%. 50% being the absolute jaw droppers, like Pascal.
Well, architectural improvements, more shaders, a die shrink, up the wattage to 450-500 watts, and call it a day.
#15
mastrdrver
ARF said: The speculations come from similar articles:


AMD & NVIDIA Next-Gen Flagship GPUs Detailed: RDNA 3 Radeon RX 7900 XT With 15360 Cores, Ada Lovelace GeForce RTX 4090 With 18432 Cores (wccftech.com)

The details are not that interesting. It is obvious that they have found a way with significant architectural improvements to make the performance better.

The question is when the market will see it. Because if it is 2023, then Navi 21 would be more than two years old already!

Also, they are not optimistic at all about the shortages and the scalper pricing.
That graph from WCCFtech is fanciful thinking. At best there will be a 50% increase in performance. There is no way they're going to get a 100%+ increase in performance over the previous generation. You can't increase wattage enough to get that kind of generational uplift.

Even when you had the 5870 (which was a doubling of the 4870), you didn't see a 100% increase over the previous generation. You only saw (at best) a 50% increase in performance, and that was at 2560x1600, which was on a $1,000 USD monitor that very few had. A 40% increase at 1080p was the reality of that card. Even within the same generation, a halving of the GPU power never produces half the performance. It's because there's always an issue with feeding the GPU enough data.
#16
TheoneandonlyMrK
mastrdrver said: That graph from WCCFtech is fanciful thinking. At best there will be a 50% increase in performance. There is no way they're going to get a 100%+ increase in performance over the previous generation. You can't increase wattage enough to get that kind of generational uplift.

Even when you had the 5870 (which was a doubling of the 4870), you didn't see a 100% increase over the previous generation. You only saw (at best) a 50% increase in performance, and that was at 2560x1600, which was on a $1,000 USD monitor that very few had. A 40% increase at 1080p was the reality of that card. Even within the same generation, a halving of the GPU power never produces half the performance. It's because there's always an issue with feeding the GPU enough data.
Firstly, the node shrink and architectural changes can account for some of that 50% (possibly not proven).
Second, they already provided proof that Infinity Cache works; they just need to improve/enlarge it. :)
#17
londiste
Oberon said: 256 MB on the IOD, none on the GCDs.
Would it make sense to only have cache on IOD? Impact depends on what packaging they use but IOD is always a hop away.
#18
bug
At this point, I believe buyers would be happy with a meager performance improvement, but actual availability and reasonable prices (e.g. $200 for FHD gaming) instead.
#19
mastrdrver
TheoneandonlyMrK said: Firstly, the node shrink and architectural changes can account for some of that 50% (possibly not proven).
Second, they already provided proof that Infinity Cache works; they just need to improve/enlarge it. :)
Historical trends say that it's not going to happen.

6900 XT is about 40% faster than 5700 XT. 25% of that is from just the difference in clock speeds. There is no precedent for a 100%+ increase in performance (if that's what you're arguing for). If there was some simple way to do it, both nVidia and AMD (ATI) would have done it a long time ago.

Remember that even a 30% increase over a generation is an impressive feat these days. So, I do think a 50% increase is a realistic possibility.

With Infinity Cache (IFC), they're increasing the size because they're putting two dies together. From the speculation I've read, each die has two 64-bit memory controllers and they use the IFC (128MB each die) as the "crossbar" between the two. They already have a 58% hit rate according to AMD. I know CPU architectures are different, but even Tim Keller said there is a diminishing return for increasing hit rate in caches, as you have to increase the transistor count exponentially to get an increase (though reduced) in hit rate. I'm sure the same applies to GPU caches, otherwise they would have already done it by now. Likewise, in RDNA2, the IFC is there to reduce the hits to memory. So not only are you looking to increase the hit rate and save memory transfers, but you're also looking to have a good hit rate so the two dies don't have to hit memory that often either. Maybe that's what they plan on using the other 40% for, the part that isn't as efficient.
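To make the hit-rate point concrete, here is a minimal sketch of what a given hit rate means for DRAM traffic. It uses an idealized model in which only cache misses consume DRAM bandwidth; the 512 GB/s figure assumes a 256-bit bus with 16 Gbps GDDR6, and the hit rates other than the 58% mentioned above are hypothetical. It does not model how hit rate grows with cache size, which is where the diminishing returns come from.

```python
def dram_traffic_fraction(hit_rate: float) -> float:
    """Share of memory requests that miss the cache and go out to DRAM."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in the range [0, 1)")
    return 1.0 - hit_rate


def effective_bandwidth(dram_gb_per_s: float, hit_rate: float) -> float:
    """Request bandwidth servable if only misses consume DRAM bandwidth (idealized)."""
    return dram_gb_per_s / dram_traffic_fraction(hit_rate)


DRAM_BANDWIDTH = 512.0  # GB/s, assumed: 256-bit bus with 16 Gbps GDDR6

for hit_rate in (0.58, 0.68, 0.78):  # 58% is the quoted figure; the rest are hypothetical
    misses = dram_traffic_fraction(hit_rate) * 100
    effective = effective_bandwidth(DRAM_BANDWIDTH, hit_rate)
    print(f"hit rate {hit_rate:.0%}: {misses:.0f}% of requests reach DRAM, "
          f"about {effective:.0f} GB/s effective")
```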
#20
TheoneandonlyMrK
mastrdrver said: Historical trends say that it's not going to happen.

6900 XT is about 40% faster than 5700 XT. 25% of that is from just the difference in clock speeds. There is no precedent for a 100%+ increase in performance (if that's what you're arguing for). If there was some simple way to do it, both nVidia and AMD (ATI) would have done it a long time ago.

Remember that even a 30% increase over a generation is an impressive feat these days. So, I do think a 50% increase is a realistic possibility.

With Infinity Cache (IFC), they're increasing the size because they're putting two dies together. From the speculation I've read, each die has two 64-bit memory controllers and they use the IFC (128MB each die) as the "crossbar" between the two. They already have a 58% hit rate according to AMD. I know CPU architectures are different, but even Tim Keller said there is a diminishing return for increasing hit rate in caches, as you have to increase the transistor count exponentially to get an increase (though reduced) in hit rate. I'm sure the same applies to GPU caches, otherwise they would have already done it by now. Likewise, in RDNA2, the IFC is there to reduce the hits to memory. So not only are you looking to increase the hit rate and save memory transfers, but you're also looking to have a good hit rate so the two dies don't have to hit memory that often either. Maybe that's what they plan on using the other 40% for, the part that isn't as efficient.
Tim Keller?!

There's a whole lineup planned, I expect the top end with more than one chip could double performance alone without the other upgrades.

It depends on many things, really. There'll be two to four SKUs, I hope. I'm more optimistic than you, but it's far from proven; I wouldn't defend my stance too hard, it's rumours.
#21
Punkenjoy
londiste said: Would it make sense to only have cache on IOD? Impact depends on what packaging they use but IOD is always a hop away.
From what I've seen in die shots of Navi 22 and Navi 23, it looks like the cache is tied to the memory controller/memory bus, and they had the option to put less or more cache per "lane".

So from that point of view, it would make sense.

But that would mean the Infinity Fabric link between the I/O die and the chiplet is huge. Right now, on die, AMD states that it's 16 x 64b for Navi 21. It would probably mean at least 12 x 2 x 64b for Navi 31. Not undoable, but I wonder how expensive it will be to make with an interposer.

I think at 2 GHz, Infinity Cache bandwidth is around 1.9 TB/s.
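The bandwidth estimate can be checked with a quick calculation. This sketch assumes "64b" means 64 bytes per channel per clock (an interpretation, not something stated in the post) and uses the 2 GHz clock from the comment; the function name is made up for illustration, and the Navi 31 channel count is the speculative 12 x 2 figure.

```python
def cache_bandwidth_bytes_per_s(channels: int, bytes_per_clock: int, clock_hz: float) -> float:
    """Peak bandwidth of a cache fabric with the given channel count and clock."""
    return channels * bytes_per_clock * clock_hz


CLOCK_HZ = 2.0e9          # 2 GHz, as in the comment
BYTES_PER_CLOCK = 64      # assumed: "64b" read as 64 bytes per channel per clock

navi21 = cache_bandwidth_bytes_per_s(16, BYTES_PER_CLOCK, CLOCK_HZ)
navi31 = cache_bandwidth_bytes_per_s(12 * 2, BYTES_PER_CLOCK, CLOCK_HZ)  # speculative

for name, value in (("Navi 21", navi21), ("Navi 31 (speculative)", navi31)):
    print(f"{name}: {value / 1e12:.3f} TB/s ({value / 2**40:.2f} TiB/s)")
```

At 2 GHz the Navi 21 figure works out to 2.048 TB/s in decimal units, or about 1.86 TiB/s in binary units, which is in line with the estimate above.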
#22
mastrdrver
TheoneandonlyMrK said: Tim Keller?!

There's a whole lineup planned, I expect the top end with more than one chip could double performance alone without the other upgrades.

It depends on many things, really. There'll be two to four SKUs, I hope. I'm more optimistic than you, but it's far from proven; I wouldn't defend my stance too hard, it's rumours.
Plus, I think they're expecting a 40%+ efficiency increase? I'm very skeptical of the more than 50% increase, but it will be amazing if true. We haven't seen that kind of improvement since the late 90s and 3Dfx SLI setups.
#23
Prima.Vera
Hopefully they won't follow the nGredia recipe:
more expensive, lower yields, sold to the black market first, etc...
#24
Oberon
mastrdrver said: Even when you had the 5870 (which was a doubling of the 4870), you didn't see a 100% increase over the previous generation. You only saw (at best) a 50% increase in performance, and that was at 2560x1600, which was on a $1,000 USD monitor that very few had. A 40% increase at 1080p was the reality of that card.
That's a 58.7% increase...