
AMD "Strix Point" Zen 5 Monolithic Silicon has a 12-core CPU?

Reduced prices as the die area is much much lower. 4-core CPUs don't make any sense as TSMC N5 is pretty mature, so there is not much reason to disable two more working cores and reduce profit margins.

Now a 6-core Zen 4c CCD would make sense, and disabled units for Athlons.
I mean, as (FINALLY) Intel is raising the core count on their i3's, it would make sense for AMD to follow suit, be it with 4C Athlons and 6C R3's (also, while we're at it, why not make R5's go 8C and R7's go 12C, and reserve R9 for 16C).
APU's are... monolithic design...
We are considering hypothetical Zen 4c designs, so that's where the CCX/CCD lingo came from.
 
Reduced prices as the die area is much much lower. 4-core CPUs don't make any sense as TSMC N5 is pretty mature, so there is not much reason to disable two more working cores and reduce profit margins.

Now a 6-core Zen 4c CCD would make sense, and disabled units for Athlons.
That would be a good replacement for the Zen 2 based SKUs. Despite the anticipated clock speed handicap, they should be faster than those.
 
I guess that a monolithic APU with 12 Zen5c cores will have a bit lower clocks than the Zen5 cores in the normal dual chiplets. But since it will have a relatively powerful iGPU in it, the graphic performance won't be hindered by the cache being halved. Price will be the point to discuss because technologically that APU will be a marvelous product.
 
Actually, for an Athlon it makes a lot of sense. If memory doesn't fail me K10 Phenoms and Athlons were separated similarly.
My first Socket A cpu was a Dothan, which was the Athlon with less L2.

All this got me thinking though. AMD is the only vendor NOT doing asymmetric cores. Mobile devices have it, Intel has it, Apple has it. I think AMD is going to have to do something here, or they might just lose the core race they started. Maybe them saying they have no intention of an E core is really that they will do something like the C core. Curious which one has more cache anyway, an AMD C core or an Intel E core?
 
An AMD C core and Raptor Lake's Gracemont E core have the same amount of L2 cache: 1 MB per core. In Raptor Lake's case, this L2 is shared across 4 cores for a total of 4 MB per cluster. Of course, the C core has L3 as well.
 
IIRC these aren't likely to be 12 full "P-cores" to use Intel's existing nomenclature.
No one can give reliable information on the config. AMD is testing both 8P + 4E and 12P versions. AMD probably don't know what they'll ship yet this far out from release.
 
If Phoenix is any indication, then AMD really needs to focus on driver support for their APUs, as Phoenix has been a shitshow. It has been months since products with it started shipping and there is still no official support, only beta drivers from OEMs that barely work, and no indication that AMD will ever integrate full support into the main drivers.

So a mega APU sounds great, but what is the point if AMD can't deliver the support for it?
 
Never buy AMD products on first release is the simple rule. Wait 6 months at least and then see what the driver feedback is.
 
I searched the forum thread titles for "Phoenix problem" and found nothing.
 
You just didn't look hard enough, examples:




People have a lot of instability with those systems and, as an example, report crashing and being completely unable to use the AI tools in Photoshop and/or Lightroom, which is a big oof in my view, as part of the point of Phoenix was the hardware engine for AI stuff.

Never buy AMD products on first release is the simple rule. Wait 6 months at least and then see what the driver feedback is.

It has already been months since the first Phoenix products hit the market. I think they started to arrive by May? It was supposed to be earlier, I think, but AMD had delays.
 
Thanks.
I meant the forum here at TPU, searching the thread titles for the key phrase from my previous comment.
 
An AMD C core and Raptor Lake's Gracemont E core have the same amount of L2 cache: 1 MB per core. In Raptor Lake's case, this L2 is shared across 4 cores for a total of 4 MB per cluster. Of course, the C core has L3 as well.
Zen 4 and Zen 4c literally have the same IPC, while the P and E cores in 12th / 13th Gen have a 50%+ IPC difference (the P-core is 50% faster than the E-core at the same frequency).
 
No one can give reliable information on the config. AMD is testing both 8P + 4E and 12P versions. AMD probably don't know what they'll ship yet this far out from release.
True, this is all speculation, but monolithic products are mobile parts and I'm not sure 12P fits the 15-45W envelope in a viable way. At least not with today's heat density and manufacturing capabilities.

Historically, monolithic mobile-first parts have been clocked up to 65W TDPs as desktop APUs later, but their primary focus is on performance/Watt because cooling and battery life are always more important than outright performance in a laptop part. That's why I'm agreeing with the rumour that these are 4x Zen5 and 8x Zen5c. Maybe I'm wrong, but at least you have my reasoning now.
 
Zen 4 and Zen 4c literally have the same IPC, while the P and E cores in 12th / 13th Gen have a 50%+ IPC difference (the P-core is 50% faster than the E-core at the same frequency).
That's true, but the question was about the amount of cache per core for Zen 4c and Gracemont in Raptor Lake.
 
Zen 4c and Zen 4 mobile have the same cache. Gracemont is weird as it has shared L2 and L3. Raptor Lake Gracemont has 4MB per cluster of 4 cores, so 1MB each. As these are strictly for MT tasks, I suppose it is better as shared?

Zen 4c is just a space-optimized Zen 4 core, rather than a clockspeed-optimized one. L3 is halved like the mobile Zen 4 parts.
 
They are lower performance cores so sharing the L2 is good from two points of view:
  • higher capacity means more hits in L2
  • fewer misses to the much slower L3
Drawbacks compared to a private L2 are:
  • higher latency
  • more wiring (probably a 4 port crossbar) requires more area
  • lower bandwidth when more than one requester is active, i.e. when more than one core is accessing L2
Chips and Cheese did a deep dive on Gracemont in the 12900K, back when it had 2 MB of L2 per cluster. Let's compare it to an AMD SoC that also had 4-core clusters. The latencies are:

| Cache Level | Gracemont size (kiB) | Zen 2 size (kiB) | Gracemont latency (cycles) | Zen 2 latency (cycles) |
|---|---|---|---|---|
| L1D | 32 | 32 | 3 | 4 |
| L2 | 2048 | 512 | 17 | 12 |
| L3 | 30 MiB | 16384 | 74 | 44 |
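
To put rough numbers on the trade-off, here is a minimal Python sketch that weights the Gracemont latencies from the table above by hit rates. The hit rates are made-up placeholders, not measurements. The model also assumes that every load hits somewhere by L3 and that the table latencies are total load-to-use figures. `average_latency_cycles` is a hypothetical helper name.

```python
def average_latency_cycles(l1_lat, l2_lat, l3_lat, l1_hit, l2_hit):
    """Weighted average load latency in cycles.

    l1_hit is the fraction of loads served by L1D; l2_hit is the fraction
    of L1D misses served by L2. Everything else is assumed to hit in L3.
    """
    l1_miss = 1.0 - l1_hit
    return (l1_hit * l1_lat
            + l1_miss * l2_hit * l2_lat
            + l1_miss * (1.0 - l2_hit) * l3_lat)


# Latencies (cycles) from the table; the hit rates below are ASSUMED, not measured.
GRACEMONT = dict(l1_lat=3, l2_lat=17, l3_lat=74)

small_l2 = average_latency_cycles(**GRACEMONT, l1_hit=0.90, l2_hit=0.70)
large_l2 = average_latency_cycles(**GRACEMONT, l1_hit=0.90, l2_hit=0.90)

print(f"lower L2 hit rate:  {small_l2:.2f} cycles")
print(f"higher L2 hit rate: {large_l2:.2f} cycles")
```

With these placeholder hit rates the higher L2 hit rate gives the lower average, because fewer loads pay the 74-cycle L3 latency. Real hit rates depend entirely on the workload.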

Let's look at bandwidth from the cache in a multithreaded test.

| Cache Level | Gracemont size (kiB) | Zen 2 size (kiB) | Gracemont bandwidth (GB/s) | Zen 2 bandwidth (GB/s) |
|---|---|---|---|---|
| L1D | 32 | 32 | 499 | 1057 |
| L2 | 2048 | 512 | 208 | 535 |
| L3 | 30 MiB | 16384 | 61 | 321 |

Now this makes it clear why Intel chose a large, shared L2. Intel's L3 has poor bandwidth for a Gracemont cluster, but it doesn't matter because the L2 makes up for it.
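
One quick way to see this from the bandwidth table is to express each L3 figure as a fraction of the same chip's L2 figure. A minimal Python sketch, using only the numbers quoted above (the dictionary layout and function name are assumptions for illustration):

```python
# Multithreaded cache bandwidth in GB/s, copied from the table above.
BANDWIDTH_GBPS = {
    "Gracemont": {"L1D": 499, "L2": 208, "L3": 61},
    "Zen 2": {"L1D": 1057, "L2": 535, "L3": 321},
}


def l3_fraction_of_l2(core):
    """Return L3 bandwidth as a fraction of L2 bandwidth for one core type."""
    levels = BANDWIDTH_GBPS[core]
    return levels["L3"] / levels["L2"]


for core in BANDWIDTH_GBPS:
    print(f"{core}: L3 delivers {l3_fraction_of_l2(core):.0%} of L2 bandwidth")
```

That works out to roughly 29% for Gracemont against 60% for Zen 2, so a Gracemont cluster loses far more bandwidth whenever its working set spills out of L2.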

As an aside, Gracemont has a large L1 I-cache, which is something that the bigger cores should copy. This figure from Hirki et al.'s paper on Haswell's power consumption also shows the benefit of a larger L1.

[Image attachment: 1689949589247.png]
 
I reckon the production challenges of that are the main reason this is less common. Looking at the strict voltage and thermal requirements of the X3D chips, burying more cache inside the core itself would be an engineering challenge.

It seems that a Gracemont cluster is more like a 4-thread core than 4 individual cores. I am sure that it technically isn't, but I get FX-8350 vibes...
 
Sharing the L2 cache doesn't make it like Bulldozer. Bulldozer shared much more, including the front-end and the floating point execution units.

Apple has enormous L1 caches, and POWER6, the first 5 GHz CPU, had 64 KiB L1 caches. POWER6, unlike Apple, didn't have the advantage of current processes, but it did have the advantage of a high power budget. The first Zen had a 64 KiB L1 I-cache as well.
 