Monday, May 16th 2022

AMD Ryzen 7000 "Phoenix" APUs with RDNA3 Graphics to Rock Large 3D V-Cache

AMD's next-generation Ryzen 7000-series "Phoenix" mobile processors are all the rage these days. Bound for 2023, these chips feature a powerful iGPU based on the RDNA3 graphics architecture, with performance allegedly rivaling that of a GeForce RTX 3060 Laptop GPU—a popular performance-segment discrete GPU. What's more, AMD is also taking a swing at Intel in the CPU core-count game, by giving "Phoenix" a large number of "Zen 4" CPU cores. The secret ingredient pushing this combo, however, is a large cache.

AMD has used large caches to good effect both on its "Zen 3" processors, such as the Ryzen 7 5800X3D, where they're called 3D Vertical Cache (3D V-Cache), and on its Radeon RX 6000 discrete GPUs, where they're called Infinity Cache. The only known difference between the two is that the latter is fully on-die, while the former is stacked on top of existing silicon IP. It's now being reported that "Phoenix" will indeed feature a stacked 3D V-Cache.
The exact function of this cache isn't known, specifically whether it serves as a last-level cache for the CPU or the iGPU. AMD's APU architecture differs from that of Intel's processors with iGPUs. On the Intel chips, the L3 cache serves as a town square for the entire SoC, with each IP block contributing an L3 cache slice; together the slices make up a functionally contiguous cache that all IP blocks can address equally over the Ring Bus. On AMD APUs such as "Cezanne" or "Rembrandt," the L3 cache is part of the CCX (CPU core complex) and serves exclusively as the last-level cache for the CPU cores. The iGPU has its own LLC, and the Infinity Fabric interconnect is the ether binding all IP blocks on the silicon.

The obvious direction for development in future APUs could be a unification of the last-level cache for the CCX and iGPU, provided the cache is large enough for the function, and this can be accomplished with stacked cache. An RDNA2 GPU with performance rivaling the RTX 3060 Laptop GPU, the Radeon RX 6650M XT, based on the "Navi 23" silicon, has 32 MB of Infinity Cache. This means that, with some clever cache memory management, an LLC size in the neighborhood of 64 MB could prove feasible for the APU.
Source: Greymon55 (Twitter)

45 Comments on AMD Ryzen 7000 "Phoenix" APUs with RDNA3 Graphics to Rock Large 3D V-Cache

#26
Oberon
R0H1T: Yes, I missed that, but the point about pricing remains. The 5800X3D is expensive because of yields as well; that thing is harder to make and is apparently more "fragile"!
So "fragile" and hard to make that AMD more than doubled their monthly production targets after launch, from 20K units/month to 50K.
#27
john_
Funny that we will end up paying $500 for a.... high end APU in a few years and we will be considering the price fair.
#28
Valantar
john_: Funny that we will end up paying $500 for a.... high end APU in a few years and we will be considering the price fair.
If it delivers equivalent performance to a $300 CPU and a $300 GPU with better efficiency, that's a steal. If not, then it's kind of silly, but would fit some needs.
#29
SL2
Aquinus: Go look over at Phoronix and you'll be astonished at how much an improvement that 768MB of LLC on those EPYC chips makes. It's pretty amazing to be honest.
Well, that's off-topic, as that's not even my point.

Taking 64 cores and disabling 48 of them (faulty or not), just to get this monstrosity called 7373X, that's insane. It's unheard of.
That doesn't make it a bad product, but no one said it was to begin with.

I won't bother reading the Phoronix article again because it describes the increased performance through benchmarks, which is a whole different thing.
#30
Denver
My bet would be $400-450 for the top-of-the-line model, and even then it would have an absurd performance per dollar.
#32
Aquinus
Resident Wat-man
Mats: Well, that's off-topic, as that's not even my point.

Taking 64 cores and disabling 48 of them (faulty or not), just to get this monstrosity called 7373X, that's insane. It's unheard of.
That doesn't make it a bad product, but no one said it was to begin with.

I won't bother reading the Phoronix article again because it describes the increased performance through benchmarks, which is a whole different thing.
It shows that with fewer cores, that cache might make it perform just as well as the 24c variant without the extra cache which might justify the pricing scheme. That's mainly my point. Also, as far as you know those CCDs might not be fully usable anyways and would have otherwise been tossed, but that means that pricing has to remain consistent with the other products. It doesn't seem as insane to me as you're making it out to be.
#33
ModEl4
So not long after the 1536SP rumour comes the cache leak.
This is interesting: if the rumour of 1536 RDNA3 SPs and 64MB of V-cache is true, it has the potential to be up to 2.5X Rembrandt.
So actually higher than a 60W 3060M for the desktop DIY iteration, in an optimistic scenario.
But we are talking Q3 2023 for the DIY market; who knows what the competition is going to offer by then. For Intel mobile, however, the rumour is a 2560SP 14th gen processor (Q3 2023 at the latest).
#34
alan242
Aquinus: Go look over at Phoronix and you'll be astonished at how much an improvement that 768MB of LLC on those EPYC chips makes. It's pretty amazing to be honest.
Yep, IBM mainframe CPUs have been steadily increasing L3 cache - 64MB, 128MB and currently 256MB. They also have a couple of GB of L4 cache.

I'm surprised no-one has brought back cache DIMMs and populated them with eDRAM as L4 for servers or desktops.

"Cache is King"
#35
SL2
Aquinus: It shows that with fewer cores, that cache might make it perform just as well as the 24c variant without the extra cache which might justify the pricing scheme.
That goes without saying, but the fact that the 7373X costs more than the 7473X is what I don't understand.
#36
Valantar
alan242: Yep, IBM mainframe CPUs have been steadily increasing L3 cache - 64MB, 128MB and currently 256MB. They also have a couple of GB of L4 cache.

I'm surprised no-one has brought back cache DIMMs and populated them with eDRAM as L4 for servers or desktops.

"Cache is King"
Latencies over that kind of distance would be far too high - it wouldn't be much faster than DRAM.
Mats: Well, that's off-topic, as that's not even my point.

Taking 64 cores and disabling 48 of them (faulty or not), just to get this monstrosity called 7373X, that's insane. It's unheard of.
That doesn't make it a bad product, but no one said it was to begin with.

I won't bother reading the Phoronix article again because it describes the increased performance through benchmarks, which is a whole different thing.
Nothing weird about this - it's a special SKU for special server use cases. Some workloads don't scale across many threads but can make use of as much cache per core as you can possibly get; some software is licenced per core. If the workloads can make use of the configuration, and businesses are willing to pay, AMD gets a way to not waste defective chips.
#37
trsttte
alan242: Yep, IBM mainframe CPUs have been steadily increasing L3 cache - 64MB, 128MB and currently 256MB. They also have a couple of GB of L4 cache.

I'm surprised no-one has brought back cache DIMMs and populated them with eDRAM as L4 for servers or desktops.

"Cache is King"
IBM servers do that: they discarded L3 for a much bigger L2, and the cores are able to pool cache from different cores or sockets, and L4 DRAM/L2 even on different racks.

relevant timestamp t=770s

#38
SL2
Valantar: Nothing weird about this - it's a special SKU for special server use cases. Some workloads don't scale across many threads but can make use of as much cache per core as you can possibly get; some software is licenced per core. If the workloads can make use of the configuration, and businesses are willing to pay, AMD gets a way to not waste defective chips.
I feel like I can't really explain myself well enough.

1 - I never said it's a bad product.

2 - I never said it has no purpose.

3 - I just said that it's a crazy idea to begin with. Benchmarks won't change that because of 1 & 2.
#39
R0H1T
Oberon: So "fragile" and hard to make that AMD more than doubled their monthly production targets after launch, from 20K units/month to 50K.
And how many 5800X do they make?
alan242: eDRAM as L4 for servers or desktops.
Probably not as cost effective, efficient or fast as say a stack of HBM on package!
#40
AusWolf
Mussels: DDR5 is gunna smash that higher than we're used to, combining the higher bandwidth with the lower latency from the cache should be a real winner of a design
As far as I've seen, every DDR standard has doubled the standard speed of the previous one (DDR-400, DDR2-800, DDR3-1600, DDR4-3200).
If this trend continues, then mainstream DDR5 will settle at 6400 MT/s, which is about 102 GB/s at best on a dual-channel (128-bit) setup.
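
For reference, theoretical peak memory bandwidth is simply the transfer rate multiplied by the bus width. The Python sketch below is only an illustration: the function name and the 128-bit (dual-channel) default are assumptions made here, not something stated in the thread, and sustained real-world throughput is lower than these peaks.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfer_rate_mts: transfers per second in millions (the "6400" in DDR5-6400).
    bus_width_bits:    total data bus width; 128 assumes a dual-channel desktop
                       setup (an assumption, not a figure from the thread).
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # The doubling sequence listed in the comment, extended to DDR5-6400.
    for name, rate in [("DDR-400", 400), ("DDR2-800", 800), ("DDR3-1600", 1600),
                       ("DDR4-3200", 3200), ("DDR5-6400", 6400)]:
        print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

Under these assumptions DDR5-6400 works out to 102.4 GB/s, and each generation in the list doubles the previous figure.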
#41
Aquinus
Resident Wat-man
Mats: 3 - I just said that it's a crazy idea to begin with. Benchmarks won't change that because of 1 & 2.
FWIW, the 7373X has a base clock of 3.05 GHz versus 2.8 GHz on the 7473X. Boost on the former is 100 MHz higher than boost on the latter as well. The real question is whether cache hit ratios are any different with 8 fewer cores, because off the bat, performance should be better from a single-threaded perspective since that chip runs at higher clocks. For only $285 USD, it might make more sense to go with one or the other since that's a drop in the bucket, particularly if you're buying the platform for the I/O and demand lower latency over higher throughput.

I see what you're saying, but I suspect there is something we haven't factored in.
#42
SL2
Aquinus: For only $285 USD, it might make more sense to go with one or the other since that's a drop in the bucket, particularly if you're buying the platform for the I/O and demand lower latency over higher throughput.
Yeah, it's really nothing, but pricing could just as well have been $285 more for the 7473X. Or, like you say, maybe it's about clock speed.
#43
Valantar
Aquinus: FWIW, the 7373X has a base clock of 3.05 GHz versus 2.8 GHz on the 7473X. Boost on the former is 100 MHz higher than boost on the latter as well. The real question is whether cache hit ratios are any different with 8 fewer cores, because off the bat, performance should be better from a single-threaded perspective since that chip runs at higher clocks. For only $285 USD, it might make more sense to go with one or the other since that's a drop in the bucket, particularly if you're buying the platform for the I/O and demand lower latency over higher throughput.

I see what you're saying, but I suspect there is something we haven't factored in.
For workloads that scale better with cache size than core count, and especially software that's licenced per core, this would make perfect sense. That price difference is likely less than a single core licence for whatever software it's likely to run. Negligible in the long run - and especially if that 50% increase in L3/core matters to you (which it will to a lot of database processing and the like).
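
The 50% figure can be checked with numbers quoted earlier in this thread: 768 MB of L3, 16 active cores on the 7373X (64 minus the 48 disabled) and 24 on the 7473X. A rough Python sketch follows; the function and variable names are made up for illustration, and it assumes both SKUs carry the full 768 MB.

```python
TOTAL_L3_MB = 768  # LLC size quoted in the thread; assumed identical on both SKUs


def l3_per_core_mb(total_l3_mb: float, cores: int) -> float:
    """Average L3 capacity per active core, in MB."""
    return total_l3_mb / cores


per_core_7373x = l3_per_core_mb(TOTAL_L3_MB, 16)  # 64 cores with 48 disabled
per_core_7473x = l3_per_core_mb(TOTAL_L3_MB, 24)  # the 24c variant

# Relative increase in L3 per core going from the 24c part to the 16c part.
increase_pct = (per_core_7373x / per_core_7473x - 1) * 100

if __name__ == "__main__":
    print(f"7373X: {per_core_7373x:.0f} MB/core")
    print(f"7473X: {per_core_7473x:.0f} MB/core")
    print(f"Increase: {increase_pct:.0f}%")
```

That gives 48 MB per core against 32 MB per core, which is where both the 50% increase and the "48 MB L3 per core" remark come from.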
Mats: I feel like I can't really explain myself well enough.

1 - I never said it's a bad product.

2 - I never said it has no purpose.

3 - I just said that it's a crazy idea to begin with. Benchmarks won't change that because of 1 & 2.
I know you didn't, but you expressed shock and confusion as to its existence, while low core count, large cache server/datacenter/HPC CPUs like this have been around for quite a while. There was the 8c EPYC 7251 (which seemed mostly a "tons of PCIe for cheap and not a lot of power" play), and two 8c16t Zen2 EPYCs, one with 64MB L3 and one with 128MB. These are very odd from a consumer POV, but they're hardly unique, and they have their use case. AMD has just found a niche where it can deliver something Intel never did. And, to be clear, for the applications where this makes sense, the performance benefits will be very real.
#44
SL2
Valantar: I know you didn't, but you expressed shock and confusion as to its existence, while low core count, large cache server/datacenter/HPC CPUs like this have been around for quite a while. There was the 8c EPYC 7251 (which seemed mostly a "tons of PCIe for cheap and not a lot of power" play), and two 8c16t Zen2 EPYCs, one with 64MB L3 and one with 128MB. These are very odd from a consumer POV, but they're hardly unique, and they have their use case. AMD has just found a niche where it can deliver something Intel never did. And, to be clear, for the applications where this makes sense, the performance benefits will be very real.
More miscommunication, as nothing you post has anything to do with my point. The reason seems to be that you take my words too literally.

A similar situation would be if I called the 3090 Ti a monstrosity because of its high power consumption, and that caused someone to explain the GPU market or how Nvidia Ampere works.

When I wrote
Mats: 48 MB L3 per core ought to be enough for everyone. :D
..a while back I felt no confusion or shock, and the same goes for today. I don't want you to feel the need to explain and repeat more info about Milan-X, as I won't be buying anything like that.

If you just want to have the last word, well, I can't stop you. I have no use for it. ;)
#45
Oberon
R0H1T: And how many 5800x do they make?
It doesn't matter, because it's not about how many of those they make. Their capacity to make these chips is quite small at the moment, limited by the advanced packaging techniques required, and mostly dominated by Milan-X shipments. The fact that they more than doubled the amount they planned to make per month means that the process is much more successful than they anticipated.