Thursday, August 10th 2023

AMD "Strix Point" Company's First Hybrid Processor, 4P+8E ES Surfaces

Going beyond previous reports that AMD is increasing the CPU core count of its mobile monolithic processors from the present 8-core/16-thread to 12-core/24-thread, we are learning that the next-gen processor from the company, codenamed "Strix Point," will in fact be the company's first hybrid processor. The chip is expected to feature two kinds of CPU cores, with "Zen 5" being the microarchitecture behind the performance cores, and "Zen 5c" behind the efficiency cores. An engineering sample featuring 4 P-cores and 8 E-cores surfaced on the web, thanks to Performancedatabases. A HWiNFO screenshot reveals the engineering sample's core configuration of 4x P-cores and 8x E-cores, with identical L1 cache sizes. Things get a little fuzzy with the detection of the L2 and L3 cache sizes.

We know from the current "Zen 4c" core design that it is essentially a compacted version of "Zen 4" designed for higher-density chiplets that have 16 cores, and that it has both the same ISA and IPC as "Zen 4." The only differences are that "Zen 4c" cores have less shared L3 cache at their disposal, are generally configured with lower clock speeds, and have higher energy efficiency than "Zen 4." "Zen 4c" cores are also 35% smaller in die area than "Zen 4." The company could develop "Zen 5c" CPU cores with similar design goals.
The "Strix Point" silicon could hence have two CCXs (CPU core complexes): one with the larger "Zen 5" P-cores and a certain amount of L3 cache, and another with the smaller "Zen 5c" cores and their own L3 cache. This would essentially be similar to "Renoir," which has two 4-core CCXs of "Zen 2" cores. The L1 cache sizes for both kinds of cores are identical, at 48 KB L1D and 32 KB L1I, and it's likely that both core types have 1 MB of dedicated L2 cache per core. The L3 cache sizes could vary between the two CCXs, with the P-core CCX having 16 MB (4 MB per core), and the E-core CCX 8 MB (1 MB per core).
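
As a quick sanity check on the per-core figures, the sketch below divides the speculated L3 size of each CCX by its core count. The function name and the dictionary layout are illustrative assumptions; the cache sizes are the unconfirmed figures from the paragraph above, not AMD specifications.

```python
# Speculated (unconfirmed) "Strix Point" CCX layout taken from the article text.
# The data structure and names are illustrative, not from AMD.
CCX_LAYOUT = {
    "P-core CCX (Zen 5)": {"cores": 4, "l3_mb": 16},
    "E-core CCX (Zen 5c)": {"cores": 8, "l3_mb": 8},
}


def l3_per_core_mb(cores: int, l3_mb: float) -> float:
    """Return the shared L3 capacity split evenly across the cores, in MB."""
    return l3_mb / cores


for name, ccx in CCX_LAYOUT.items():
    print(f"{name}: {l3_per_core_mb(**ccx):g} MB of L3 per core")
```

L3 is shared within a CCX, so the per-core number is only a rough measure of how much cache each core can expect under full load.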

It will be interesting to see how AMD handles the hybrid architecture from a software standpoint. Intel uses Thread Director, a hardware-based solution that's designed to send the right kind of compute workload to the right kind of CPU core. AMD could either try to develop its own version of Thread Director, or use a less sophisticated OS-based solution such as what it's doing with its multi-CCD client processors.
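
To make the OS-based option a little more concrete, here is a toy sketch of a placement policy that steers latency-sensitive foreground work to P-cores and everything else to E-cores. It is purely illustrative: the `Task` fields, the core numbering (0-3 as P-cores, 4-11 as E-cores), and the policy itself are assumptions, not a description of how Windows, Linux, or any AMD driver actually behaves.

```python
from dataclasses import dataclass

# Hypothetical core numbering for a 4P+8E part; real enumeration may differ.
P_CORES = frozenset(range(0, 4))
E_CORES = frozenset(range(4, 12))


@dataclass
class Task:
    name: str
    foreground: bool         # assumed hint available to the scheduler
    latency_sensitive: bool  # assumed hint available to the scheduler


def preferred_cores(task: Task) -> frozenset:
    """Pick a preferred core set. A toy policy, not any real scheduler's."""
    if task.foreground and task.latency_sensitive:
        return P_CORES
    return E_CORES


for task in (Task("game", True, True), Task("indexer", False, False)):
    print(task.name, "->", sorted(preferred_cores(task)))
```

On Linux, a core set like this could be applied to a process with `os.sched_setaffinity`, though a real scheduler would weigh far more signals than two boolean hints.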
Sources: Performancedatabases, IThome, VideoCardz

86 Comments on AMD "Strix Point" Company's First Hybrid Processor, 4P+8E ES Surfaces

#1
R0H1T
btarunr: AMD could either try to develop its own version of Thread Director, or use a less sophisticated OS-based solution such as what it's doing with its multi-CCD client processors.
I'd argue that a software based solution would be better, like process lasso, but the one baked in hardware would be faster & less flexible.
#2
_JP_
btarunr: or use a less sophisticated OS-based solution
If they learned anything from Bulldozer, they won't.
But then again, Windows 10/11 has features to enable and manage more types of processors, such as big.LITTLE approaches, compared to what Windows 7 could do.
#3
Camm
Considering how awful Intel's thread director has been, it might actually be better to rely on the OS scheduler?

I'd also hope these are monolithic rather than chiplet based. With the die size saving of 4c touted, 8 c cores shouldn't be much larger than 4 P cores.
#4
john_
I prefer AMD's hybrid approach to Intel's, but I am speculating that AMD chose this to avoid having to design its own hardware thread director like Intel. If they manage to build something like that, we might go the Intel way with a few P cores and a number of other less capable cores, clearly for marketing purposes (more cores in the same die area, more cores advertised, better sales). Intel is already moving to three types of cores with its next gen. AMD's denser cores aren't going to help much next year. They could have helped if they were ready for Alder Lake.
#5
Daven
This hybrid approach doesn’t make sense to me. If the e-cores have the same IPC as the p-cores, why not just use all the same cores and just let clock speeds adjust up and down like always? On battery, clocks are low. When plugged in, clocks are higher. Why have some cores that clock higher than otherwise identical cores that don’t?
#6
Camm
Daven: This hybrid approach doesn’t make sense to me. If the e-cores have the same IPC as the p-cores, why not just use all the same cores and just let clock speeds adjust up and down like always?
Die space as cache doesn't shrink well. Secondly cache is one of the most power hungry places on the die, so it saves efficiency. Furthermore, some tasks just don't need the cache, usually the same tasks scale well, so increased performance.
#7
Frick
Fishfaced Nincompoop
Daven: This hybrid approach doesn’t make sense to me. If the e-cores have the same IPC as the p-cores, why not just use all the same cores and just let clock speeds adjust up and down like always?
More cores/mm2, and power use. For many applications today 4 slower cores will be faster than 1 or even 2 faster cores. The approach is sensible, with the only downside being scheduling becomes even more important, but that'll be sorted out. Games don't really benefit, so far, but honestly who cares?
#8
Daven
Frick: More cores/mm2, and power use. For many applications today 4 slower cores will be faster than 1 or even 2 faster cores. The approach is sensible, with the only downside being scheduling becomes even more important, but that'll be sorted out. Games don't really benefit, so far, but honestly who cares?
So why not make all the cores 5c?
Camm: Die space as cache doesn't shrink well. Secondly cache is one of the most power hungry places on the die, so it saves efficiency. Furthermore, some tasks just don't need the cache, usually the same tasks scale well, so increased performance.
So why not make some SKUs with stacked cache and some without stacked cache? No need for ‘hybrid’ cores that are almost identical.
#9
ViperXZ
this is how hybrid (big/small - efficient) is really done, not like Intel's doing it
#10
konga
Calling the compact cores "efficiency" cores seems to be off the mark. I don't believe these are designed to be much more power efficient. Perhaps their more compact nature may make them slightly more power efficient, but more than anything, they seem to be designed to be space-efficient instead. They should offer a very similar level of performance to normal cores in most applications while taking up around half as much die area only.
#11
Frick
Fishfaced Nincompoop
Daven: So why not make all the cores 5c?
Because some stuff still benefits from bigger and faster cores, and beyond a certain number more cores might not increase performance. Depending on application 6 fast cores can be faster than 32 slow cores, but the opposite can also be true. For general purpose machines a hybrid approach makes sense, if nothing else because it'll help with power consumption/cooling.
#12
R0H1T
Daven: So why not make all the cores 5c?
Who says they can't or won't? I'm secretly hoping AMD releases a 32c monster for MSDT before their ultimate super Saiyan secret weapon 32c/128t zen6 to crush Intel :pimp:
#13
kondamin
base Clock 8800MHz, nice should make 10ghz on beefy cooling possible.
#14
R0H1T
Someone obviously fudged their numbers, boost is lower than base :shadedshu:
#15
Denver
That title conveys the wrong idea. In fact, both cores are high-performance.

What changes will be the maximum clock it can reach, but currently only one or two cores reach high clocks in laptops anyway.
#16
R0H1T
That's for zen5 so we don't really know how low xc cores will clock, neither for zen4 mind you.
#17
ToTTenTranz
Isn't Phoenix2 also a hybrid 2×Zen4 + 4×Zen4c solution?
The presence of Performance and Efficiency cores is mentioned in AMD's PPR for the Phoenix APUs, which is why it's been assumed that Phoenix2 has a hybrid design.

Unless that's a mistake in their programming reference guide, Strix Point shouldn't be AMD's first hybrid design.
#18
AnarchoPrimitiv
I'd argue that the Zen5C cores shouldn't be referred to as "e-cores" as they have the exact same IPC as the Zen5 cores, which should give them a good advantage over Intel since Intel's e-cores do have lower performance.

That being said, I wonder if it'd be possible for AMD to make a desktop chip with 32 Zen5C cores AND v-cache, so basically you get all the density of Zen5C and you remove the lack of L3 cache...the best of both worlds, right?
#19
persondb
AnarchoPrimitiv: I'd argue that the Zen5C cores shouldn't be referred to as "e-cores" as they have the exact same IPC as the Zen5 cores, which should give them a good advantage over Intel since Intel's e-cores do have lower performance.
That doesn't matter, we literally have no idea how high or low those cores will clock. I doubt it will be high at all. I am betting in the 2GHz to 3GHz range due to their far increased density.

Which then probably puts them in the same category of e-cores. As a lower performance core.

The thing really is that IPC never actually mattered, it's completely meaningless on its own (and also varies far too much). If you need to do scheduling then it doesn't matter if the core has the same IPC or not, the only thing that matters is the core performance.
#20
ViperXZ
the Zen5C will be clocked way lower cause the density is too high to clock it very high (also to make it efficient, like it's been used in data center). remember that Zen 3 and Zen 4 had "dead die space" to increase clocks compared to Zen 2 and Zen 1, even Zen1+ did this compared to Zen 1.
#21
Daven
Frick: Because some stuff still benefits from bigger and faster cores, and beyond a certain number more cores might not increase performance. Depending on application 6 fast cores can be faster than 32 slow cores, but the opposite can also be true. For general purpose machines a hybrid approach makes sense, if nothing else because it'll help with power consumption/cooling.
As you can see from some of the other posts, all the cores are fast. There are no efficiency cores in this design by the mainstream definition. The Zen ‘Xc’ cores are the same as the Zen ‘X’ cores. This is not a BIG.little design in the sense that there are less features in the little core versus BIG core.

I get the feeling AMD is just following along with the trend of hybrid CPUs as a marketing ploy. I do not like such shenanigans, however, since the Zen architecture is already efficient, we get all fast cores anyway and AMD gets to have the same trendy nomenclature as Intel and Arm. The proverbial have your cake and eat it too situation.
#23
persondb
Daven: As you can see from some of the other posts, all the cores are fast. There are no efficiency cores in this design by the mainstream definition. The Zen ‘Xc’ cores are the same as the Zen ‘X’ cores. This is not a BIG.little design in the sense that there are less features in the little core versus BIG core.
What you need to be a BIG.little design really is a performance difference. Phones had SoCs for years with two different clusters of A53 cores, one that has some high clocks and the other that has low clocks.

So it mostly boils down to clocks (we already know that cache is cut, so performance will likely be lower from that), which we don't have access to at this moment, so there is no way we can say that all the cores are fast.
#24
Assimilator
R0H1T: I'd argue that a software based solution would be better, like process lasso, but the one baked in hardware would be faster & less flexible.
It's going to be a dual solution. Hardware for the more fine-grained decisions, software (operating system) for the more coarse-grained ones.
AnarchoPrimitiv: I'd argue that the Zen5C cores shouldn't be referred to as "e-cores" as they have the exact same IPC as the Zen5 cores, which should give them a good advantage over Intel since Intel's e-cores do have lower performance.
Daven: The Zen ‘Xc’ cores are the same as the Zen ‘X’ cores.
No. In implementation the c-cores will have less L3 cache and be clustered more densely, which will undoubtedly negatively affect their performance characteristics. The fact that they're capability-identical to "big" Zen cores is irrelevant, unless you want to have a stupid e-peen war about whether this implementation is "better" than Intel's.
#25
Daven
persondb: What you need to be a BIG.little design really is a performance difference. Phones had SoCs for years with two different clusters of A53 cores, one that has some high clocks and the other that has low clocks.

So it mostly boils down to clocks (we already know that cache is cut, so performance will likely be lower from that), which we don't have access to at this moment, so there is no way we can say that all the cores are fast.
As far as I’m aware, ARM BIG.little cores are much different and not just lower clocks. For instance, A57 is an out-of-order pipeline and A53 is in-order. Can you point to product examples where BIG.little is all A53 or some other same cores with different clocks only?