Wednesday, January 5th 2022

AMD Readying 16-core "Zen 4" CCDs Exclusively for the Client Segment with an Answer to Intel E-cores?

AMD already declared the CPU core counts of its EPYC "Genoa" and "Bergamo" processors to top out at 96 and 128, respectively, a core-count believed to have been facilitated by the larger fiberglass substrate of the next-gen SP5 CPU socket, letting AMD add more 8-core "Zen 4" chiplets, dubbed CPU complex dies (CCDs). Until now, AMD has used the chiplet as a common component between its EPYC enterprise and Ryzen desktop processors, to differentiate CPU core counts.

A fascinating theory that hit the rumor-mill indicates that the company might leverage 5 nm (TSMC N5) to carve out larger CCDs with up to 16 "Zen 4" CPU cores. Half of these cores would be capped at a much lower power budget, essentially making them efficient-cores. This is a concept AMD appears to be carrying over from its 15-Watt class mobile processors, which see the CPU cores operate under aggressive power-management. These cores still turn out a reasonable amount of performance, and are functionally identical to the ones on 105 W desktop processors with a relaxed power budget.

Since the "fat" and "slim" cores are functionally identical to each other, AMD need not develop complex middleware like the Intel Thread Director, and can make do with OS scheduler-level optimizations that it can co-develop with Microsoft or the Linux community, much like it did for older versions of the "Zen" microarchitecture that featured multiple CCXs.
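
To illustrate why functionally identical cores keep scheduling simple, here is a minimal sketch in Python. It is not AMD's or any operating system's actual scheduler: the `Core` type, the per-core power budgets, and the `place_threads` helper are all hypothetical names chosen for illustration. The only idea it shows is that when every core runs the same instructions, the scheduler needs nothing more than a preference order, with the most demanding threads going to the cores that have the largest power budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    power_cap_watts: float  # hypothetical per-core power budget


def preferred_core_order(cores):
    """Sort cores so the largest power budget comes first (ties by id)."""
    return sorted(cores, key=lambda c: (-c.power_cap_watts, c.core_id))


def place_threads(thread_demands, cores):
    """Map thread name -> core id, heaviest thread on the 'fattest' core.

    thread_demands maps a thread name to a load estimate between 0 and 1.
    Because all cores are functionally identical, any thread may run on
    any core, so no capability check is needed, only an ordering.
    """
    if len(thread_demands) > len(cores):
        raise ValueError("this sketch places at most one thread per core")
    ordered_cores = preferred_core_order(cores)
    ordered_threads = sorted(thread_demands,
                             key=lambda name: (-thread_demands[name], name))
    return {name: core.core_id
            for name, core in zip(ordered_threads, ordered_cores)}


# Hypothetical 2 "fat" + 2 "slim" layout; the wattages are made up.
cores = [Core(0, 12.0), Core(1, 12.0), Core(2, 3.0), Core(3, 3.0)]
placement = place_threads({"game": 0.9, "browser": 0.2, "encoder": 0.7}, cores)
print(placement)
```

With these made-up numbers, the two heavy threads land on the high-budget cores and the light one falls through to a low-budget core. A real scheduler would also handle migration, more threads than cores, and changing load, none of which is modeled here.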

The theory also predicts that AMD might build on the 3D Vertical Cache technology. The next-gen CCD might feature two layers, the bottom layer with CPU cores and their dedicated L2 caches; and a top layer exclusively for a 64 MB 3D Vertical Cache serving as a shared L3 cache. In the "Zen 3" 3DV Cache CCD, the 64 MB SRAM is located above the region of the CCD that typically has its 32 MB L3 cache, a relatively cooler component than the CPU cores. On the new CCD, this SRAM could be located over the region that has the low-TDP cores, pushing the high-TDP "performance" cores to the periphery of the die, with structural silicon conducting heat from these cores to the surface.

This theory is way out there, but it's plausible because AMD doesn't have a formidable low-power CPU core architecture to rival "Gracemont," and because Intel's next-gen "Raptor Lake" chips are rumored to see the addition of more E-core clusters, making the "i9-13900K" a 24-core processor, beating AMD in the core-count game. If we were to nitpick, we'd point out that the low-TDP cores take as much valuable die real-estate and transistor-count as the high-TDP cores, and die-size (i.e. wafer volume) is a rather scarce resource these days. We'll find out in the second half of 2022.
Many thanks to TheoneandonlyMrK for the tip
Source: Wccftech

41 Comments on AMD Readying 16-core "Zen 4" CCDs Exclusively for the Client Segment with an Answer to Intel E-cores?

#1
MxPhenom 216
ASIC Engineer
I think we are in for some pretty awesome competition with Raptor Lake and Zen4 this year. Competition I don't think we have seen since Core 2 and Phenom days to be honest.
Posted on Reply
#2
Daven
Theorizing is moot at this point because Zen 4 is done and was just demoed at CES. Insiders and tech sites need to focus on finding leaks and verifying their accuracy. The Zen 4 config answer is out there and Lisa Su had it in her hands two days ago. Someone needs to slip into her wardrobe a suit jacket with holes in the pockets.
Posted on Reply
#3
Chrispy_
Keeping the P cores and E cores functionally identical is pretty much vital for heterogenous CPU architectures until the OS, scheduler, and (to some extent) applications themselves get a whole lot smarter.

Presumably the only things stopping AMD from making a 16P+16E product into a 32P product will be the power limits and cooling to such a small socket, as well as the desire not to cannibalise their Threadripper sales.
Posted on Reply
#4
n-ster
I wonder how well AMD evolved efficiency-wise. How would a 5800X fare against the 1700 at the same power usage?
Posted on Reply
#5
theGryphon
This is not a pipe dream at all but totally doable considering what AMD did already. Probably not Zen 4, but a Zen4+ level product. Could even be a single SKU as response to Intel 24 core...
Posted on Reply
#6
Arkz
MxPhenom 216: I think we are in for some pretty awesome competition with Raptor Lake and Zen4 this year. Competition I don't think we have seen since Core 2 and Phenom days to be honest.
There was competition back then? Phenom sucked. They've had good competition since Zen+ though.
Posted on Reply
#7
Aretak
Arkz: There was competition back then? Phenom sucked. They've had good competition since Zen+ though.
Nonsense. Phenom II chips offered excellent price/performance, especially when you could often unlock extra cores on them for free. They weren't competitive with high-end Nehalem and onwards, but for a mainstream system in 2009/10 they were a fantastic option.
Posted on Reply
#8
Sykobee
AMD is readying a Zen 4c core, which is a power optimised version of the Zen 4 core.

It was meant to be paired with the Zen 5 big core in the next generation design that borrows the efficiency core idea.

It may be that it will be ready far sooner than the Zen 5 core, and can be used in a high-end Zen 4 CCD utilising 3D cache (perhaps not at Zen 4 launch) in an 8+8 configuration.

Power optimised (lower max clock) cores can use smaller transistors, less L1 cache, perhaps even thinner vector units, to save die space.

Additionally, TSMC N5 is 1.8x denser for logic than N7/N6, but only 1.2x denser for SRAM (cache), so moving that onto a stacked N7 3D cache die makes a huge amount of sense, freeing up a lot of die space for slightly space/power optimised efficiency cores.

When you consider existing CPUs, we already see that the max all-core clock is far lower (primarily for power reasons) than the single/dual core turbo. So why have every core in the CCD able to reach that single core turbo speed? Once you realise this, AMD's plan makes a lot of sense. Intel's less so, as their efficiency cores are different designs, and AVX-512 has been disabled in the big cores as a result.
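
The density argument above can be put into rough numbers. This is only a back-of-the-envelope sketch: the 1.8x and 1.2x factors are the ones quoted in this comment, while the die areas (54 mm² of logic, 36 mm² of L3 SRAM) and the `n5_area` helper are hypothetical values picked to make the arithmetic easy to follow.

```python
# Density gains as quoted in the comment (N5 relative to N7/N6).
LOGIC_SCALING = 1.8
SRAM_SCALING = 1.2


def n5_area(logic_mm2, sram_mm2, sram_stays_on_die=True):
    """Estimate the N5 die area (mm^2) from N7 logic and SRAM areas."""
    area = logic_mm2 / LOGIC_SCALING
    if sram_stays_on_die:
        # SRAM shrinks far less than logic, so it dominates the N5 die.
        area += sram_mm2 / SRAM_SCALING
    return area


# Hypothetical N7 die: 54 mm^2 of logic plus 36 mm^2 of L3 SRAM.
with_l3 = n5_area(54.0, 36.0)
without_l3 = n5_area(54.0, 36.0, sram_stays_on_die=False)
print(f"N5 die with on-die L3:  {with_l3:.1f} mm^2")
print(f"N5 die with L3 stacked: {without_l3:.1f} mm^2")
```

Under these assumptions, the L3 would take up about half of the shrunken die, so moving it to a stacked cache die frees roughly that much area for additional cores.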
Posted on Reply
#9
TheoneandonlyMrK
I think this looks to be the best way to do big.LITTLE per se, but I am still not sold on big.LITTLE though; better power gating and frequency control should, to me, logically make them pointless.

We will see though.

@btarunr one point though: it's a schematic, it doesn't show die sizes. It's possible there's a difference in size and features still, technically. I'm sure they shaved some excess off, but then again those low-power cores beat a 5800X allegedly, so who knows.

Good times though, as someone else said. Competition, fantastic.
Posted on Reply
#10
_Flare
I personally think this theory is nonsense, because AMD won't make 3 different CCDs: one for Bergamo with pure Zen4c, one with pure normal Zen4, and this mixed one with expensive V-Cache.
And the Zen4c in Bergamo exists only because of the smaller area and the target market, which has workloads that don't need big caches.
Efficiency-wise the "normal" Zen4 can go to very low TDPs AND to high frequency, so there is no need for Zen4c cores here, because there are virtually no area restrictions for those platforms in their segments.

I think AMD could use Zen4c cores in SoCs like the 5G stations Intel targets with its 24-core Tremont and Gracemont offerings.
Posted on Reply
#11
TheoneandonlyMrK
_Flare: I personally think this theory is nonsense, because AMD won't make 3 different CCDs: one for Bergamo with pure Zen4c, one with pure normal Zen4, and this mixed one with expensive V-Cache.
And the Zen4c in Bergamo exists only because of the smaller area and the target market, which has workloads that don't need big caches.
Efficiency-wise the "normal" Zen4 can go to very low TDPs AND to high frequency, so there is no need for Zen4c cores here, because there are virtually no area restrictions for those platforms in their segments.

I think AMD could use Zen4c cores in SoCs like the 5G stations Intel targets with its 24-core Tremont and Gracemont offerings.
You're answering like you don't know they're still making Zen 2 & 3 cores plus all the new ones.

If you sell an enterprise part, and all these were in enterprise parts, you're obliged to support that part for sometimes 10+ years.

Plus I think they're up to at least 6 concurrent core designs in production now, so.
Posted on Reply
#12
RedBear
It kind of sounds like wishful thinking from AMD fans who are (rightly or wrongly) scared by Intel's hybrid approach.
If we were to nitpick, we'd point out that the low-TDP cores take as much valuable die real-estate and transistor-count as the high-TDP cores; and die-size (i.e. wafer volumes) are a rather scarce resource these days. We'll find out in the second half of 2022.
Hypothetically speaking, as long as they can keep the first place in benchmarks and reviews they could simply make few parts and ask for an arm and a leg for them. I'm looking forward to how they're going to price the 5800X3D in order to test this hypothesis.
Posted on Reply
#13
TheoneandonlyMrK
RedBear: It kind of sounds like wishful thinking from Intel fans who are (rightly or wrongly) scared by Amd's hybrid approach.

Hypothetically speaking, as long as they can keep the first place in benchmarks and reviews they could simply make few parts and ask for an arm and a leg for them. I'm looking forward to how they're going to price the i7 12800KS in order to test this hypothesis.
Fixed that for you. Ironic as shit, the real world, where both do the same shit for benches in completely different ways, but the 'tude's not warranted; you could be wrong.
Posted on Reply
#14
Wirko
Wirko: Having a CPU like the one Lisa Su showed off would be nice. The cores on the chiplet with 3D cache would act as "Performance" cores, and those on the chiplet without it would be "Efficient" cores.
So, after all, AMD kept chewing on that prototype in Lisa's hand, and then did their P+E thing this way.
Posted on Reply
#16
ratirt
Hmm. I don't know what to think about it honestly. Intel gives slower cores and now AMD gives slower cores and everyone is celebrating tech advancement.
Seems weird to me though. They can't pull off full speed cores with balanced power so they wrap their CPUs in a nice big.little scheme and go with it.

I don't know but that is my impression.
Posted on Reply
#17
Wirko
ratirt: Hmm. I don't know what to think about it honestly. Intel gives slower cores and now AMD gives slower cores and everyone is celebrating tech advancement.
Seems weird to me though. They can't pull off full speed cores with balanced power so they wrap their CPUs in a nice big.little scheme and go with it.

I don't know but that is my impression.
Price per transistor is not going down any longer with new nodes. Same for power density. It's possible that it never will again. So, with a limited number of transistors that can be stuck into a $500 chip, they're inventing new tricks like these.
Posted on Reply
#18
TheGuruStud
Arkz: There was competition back then? Phenom sucked. They've had good competition since Zen+ though.
Phenom II was good, though. I had a 940 at 4GHz forever. It never gave me an issue and ran games beautifully even in xfire.
Everything was behind, b/c no money thanks to a big blue criminal.
Posted on Reply
#19
Why_Me
ratirt: Hmm. I don't know what to think about it honestly. Intel gives slower cores and now AMD gives slower cores and everyone is celebrating tech advancement.
Seems weird to me though. They can't pull off full speed cores with balanced power so they wrap their CPUs in a nice big.little scheme and go with it.

I don't know but that is my impression.
The 'big-little scheme' seems to be working.

www.newegg.com/intel-core-i7-12700f-core-i7-12th-gen/p/N82E16819118359
Intel Core i7-12700F $329.99

www.newegg.com/amd-ryzen-7-5800x/p/N82E16819113665
AMD Ryzen 7 5800X $378.98

Posted on Reply
#20
ratirt
Why_Me: The 'big-little scheme' seems to be working.

www.newegg.com/intel-core-i7-12700f-core-i7-12th-gen/p/N82E16819118359
Intel Core i7-12700F $329.99

www.newegg.com/amd-ryzen-7-5800x/p/N82E16819113665
AMD Ryzen 7 5800X $378.98

It is working. Glad you find your CPUs working. It would have been hardly possible for them not to work. My concern is different. AMD will offer less for the same price. Didn't the Intel fans say "moar coars"? Everywhere you look. AMD will lower the performance of cores but increase the number of them but still the performance for those will be weaker and that is supposed to be an improvement.
These work because the software development must adjust. With monopoly for a product everyone has to adjust to the change either good or bad. You think with duopoly it's different?
The cores will perform worse to save power but there will be more of them. I'm just pointing it out, not saying this will not work. It will work but the point still stands. They will offer less for the same price, because I doubt the prices will go down for those chips.
Posted on Reply
#21
Tsukiyomi91
Good to see there's competition again between these two. But with how AMD shifted from being the darling of budget builders to "I don't care about you as long as I make a profit", I don't think this would impact Intel in a big way, since Alder Lake is still gonna be more "affordable" considering the lower-tier chipsets such as B660 and H670 offering PCIe Gen5 and DDR4 backward compatibility. Not sure how AMD will tackle those aspects, but something tells me that they're focusing more on PCIe Gen5 and DDR5, which will make them the more expensive brand (again) when it comes to building a new PC...
Posted on Reply
#22
Jism
I prefer to have full-blown high-end CPU cores from AMD and not a mix in between. I mean, if power is such an issue, I just hit power saving in Windows (which I do when I do low-priority stuff anyways).
ratirt: Hmm. I don't know what to think about it honestly. Intel gives slower cores and now AMD gives slower cores and everyone is celebrating tech advancement.
Seems weird to me though. They can't pull off full speed cores with balanced power so they wrap their CPUs in a nice big.little scheme and go with it.

I don't know but that is my impression.
It hasn't been a tech innovation at all. This has been in effect for mobile phones for quite some time. And in some use cases it works.

The idea is that you don't need a Ferrari engine to do your groceries. A simple one-litre, 3-cylinder engine could do the same task, but way more efficiently.

But I prefer all fast cores instead of that half mixed-up stuff. If power is an issue, hit the power saving feature and done.
Posted on Reply
#23
_Flare
Yeah, you are right, Mr.K, and I never want to offend anybody, but I want to understand the whole path of the decision AMD could take,
so with regard to the die diversity actually being produced concurrently, of course you are right, too.

Nevertheless, the ~10 year support/supply of these dies makes a decision on which design to produce necessary, and thus these will take a percentage of the available production lines at TSMC or elsewhere.
So, yes, I think 1 die of each would be sufficient: Zen4, Zen4c, some big monolithic APU, and maybe one small one like the Athlon 300.

The picture you show, where the whole L3 (maybe including L3-Control, L3-Tags and L2-ShadowTags) is only in the V-Cache layer: I find that problematic.
And the area the Zen4c cores take in your picture again seems unnecessary to me in the layout you chose,
because in the Zen3 layout below, full equally big cores can easily be placed where the L3 normally sits.
So for me it makes nearly no sense area-wise or efficiency-wise.
But obviously I could be totally wrong, of course.



Additionally, I think of a possible approach of a 5950X successor containing 1 CCD housing one Zen4 8-core complex, PLUS 1 Bergamo-style CCD housing 16 Zen4c cores,
making it a dual-CCD AM5 CPU with 24 cores and 48 threads.
The best thing about that approach would be that no "special hybrid CCD" is needed.
Posted on Reply
#24
TheoneandonlyMrK
ratirt: It is working. Glad you find your CPUs working. It would have been hardly possible for them not to work. My concern is different. AMD will offer less for the same price. Didn't the Intel fans say "moar coars"? Everywhere you look. AMD will lower the performance of cores but increase the number of them but still the performance for those will be weaker and that is supposed to be an improvement.
These work because the software development must adjust. With monopoly for a product everyone has to adjust to the change either good or bad. You think with duopoly it's different?
The cores will perform worse to save power but there will be more of them. I'm just pointing it out, not saying this will not work. It will work but the point still stands. They will offer less for the same price, because I doubt the prices will go down for those chips.
Yeah, except you're wrong: the cores perform better. Even Intel's E-cores are Skylake grade, and they are still equipped with bigger cores than ever before, so.
_Flare: Yeah, you are right, Mr.K, and I never want to offend anybody, but I want to understand the whole path of the decision AMD could take,
so with regard to the die diversity actually being produced concurrently, of course you are right, too.

Nevertheless, the ~10 year support/supply of these dies makes a decision on which design to produce necessary, and thus these will take a percentage of the available production lines at TSMC or elsewhere.
So, yes, I think 1 die of each would be sufficient: Zen4, Zen4c, some big monolithic APU, and maybe one small one like the Athlon 300.

The picture you show, where the whole L3 (maybe including L3-Control, L3-Tags and L2-ShadowTags) is only in the V-Cache layer: I find that problematic.
And the area the Zen4c cores take in your picture again seems unnecessary to me in the layout you chose,
because in the Zen3 layout below, full equally big cores can easily be placed where the L3 normally sits.
So for me it makes nearly no sense area-wise or efficiency-wise.
But obviously I could be totally wrong, of course.



Additionally, I think of a possible approach of a 5950X successor containing 1 CCD housing one Zen4 8-core complex, PLUS 1 Bergamo-style CCD housing 16 Zen4c cores,
making it a dual-CCD AM5 CPU with 24 cores and 48 threads.
The best thing about that approach would be that no "special hybrid CCD" is needed.
True, but those aren't pics created by me.
I just passed a link on.
The L3 cache is not on the chip; it's a V-Cache slice placed on top, apparently, but it's all rumours. Though I get your points; in all honesty I expected what your suggestions stated before this leak, but I think it's possible this could work well.

I do still prefer just big guns (cores) and no knives (E-cores), personally.
Posted on Reply
#25
Wirko
_Flare: The picture you show, where the whole L3 (maybe including L3-Control, L3-Tags and L2-ShadowTags) is only in the V-Cache layer: I find that problematic.
And the area the Zen4c cores take in your picture again seems unnecessary to me in the layout you chose,
because in the Zen3 layout below, full equally big cores can easily be placed where the L3 normally sits.
So for me it makes nearly no sense area-wise or efficiency-wise.
I find it problematic for another reason.
With V-Cache as we know it now, AMD can choose between two variants: a "normal" one-layer package and a "super-cached" two-layer package, considerably more expensive and with a thermal tradeoff. This affords them quite a lot of flexibility. I'm sure each of them will find its place in servers and HPC clusters.
Now with this new proposed configuration, only the latter can exist. That's unless AMD also puts a part of L3 cache, certainly slower, on the I/O die.
Posted on Reply