Wednesday, January 5th 2022

AMD Readying 16-core "Zen 4" CCDs Exclusively for the Client Segment with an Answer to Intel E-cores?
AMD already declared the CPU core counts of its EPYC "Genoa" and "Bergamo" processors to top out at 96 and 128, respectively, a core-count believed to have been facilitated by the larger fiberglass substrate of the next-gen SP5 CPU socket, letting AMD add more 8-core "Zen 4" chiplets, dubbed CPU complex dies (CCDs). Until now, AMD has used the chiplet as a common component between its EPYC enterprise and Ryzen desktop processors, to differentiate CPU core counts.
A fascinating theory that hit the rumor-mill indicates that the company might leverage 5 nm (TSMC N5) to carve out larger CCDs with up to 16 "Zen 4" CPU cores. Half of these cores would be capped at a much lower power budget, essentially making them efficient-cores. This is a concept AMD appears to be carrying over from its 15-Watt class mobile processors, whose CPU cores operate under aggressive power management. These cores still turn out a reasonable amount of performance, and are functionally identical to the ones on 105 W desktop processors with a relaxed power budget.
Since the "fat" and "slim" cores are functionally identical to each other, AMD need not develop complex middleware like the Intel Thread Director, and can make do with OS scheduler-level optimizations that it can co-develop with Microsoft or the Linux community, much like it did for older versions of the "Zen" microarchitecture that featured multiple CCXs.
The theory also predicts that AMD might build on the 3D Vertical Cache technology. The next-gen CCD might feature two layers, the bottom layer with CPU cores and their dedicated L2 caches; and a top layer exclusively for a 64 MB 3D Vertical Cache serving as a shared L3 cache. In the "Zen 3" 3DV Cache CCD, the 64 MB SRAM is located above the region of the CCD that typically has its 32 MB L3 cache, a relatively cooler component than the CPU cores. On the new CCD, this SRAM could be located over the region that has the low-TDP cores, pushing the high-TDP "performance" cores to the periphery of the die, with structural silicon conducting heat from these cores to the surface.
This theory is way out there, but it's plausible because AMD doesn't have a formidable low-power CPU core architecture to rival "Gracemont," and because Intel's next-gen "Raptor Lake" chips are rumored to see the addition of more E-core clusters, making the "i9-13900K" a 24-core processor, beating AMD in the core-count game. If we were to nitpick, we'd point out that the low-TDP cores take as much valuable die real-estate and transistor-count as the high-TDP cores, and die area (i.e. wafer volume) is a rather scarce resource these days. We'll find out in the second half of 2022.
Many thanks to TheoneandonlyMrK for the tip
Source:
Wccftech
41 Comments on AMD Readying 16-core "Zen 4" CCDs Exclusively for the Client Segment with an Answer to Intel E-cores?
Phenom II couldn't compete either due to cheap Q6600 and was slower as well.
Now I did see a poster above talking about a slimmed down "Zen 4C". Some quick optimizations could keep the uarchs fundamentally similar while drastically cutting down on the die area through elimination of some of the logic that's relatively expensive for relatively little return (like AVX512 on ADL) and does not have substantial impacts on feature support (unlike AVX512 on ADL). That I can believe and I would of course be surprised if such a thing was not in the works. The big.LITTLE thing actually does work out fairly well for consumers, since very few people need the full ST performance on more than 8 cores, while per-core performance is mostly irrelevant to the highly parallel workloads and more smaller cores crank out more throughput at the same cost to power/area, and the execution turned out to be less rocky than I and others had anticipated.
www.pccasegear.com/products/52253/amd-ryzen-7-5800x-processor
www.pccasegear.com/products/56718/intel-core-i7-12700f-processor
and if you're lucky you can find the 5800X on AUS Amazon for as low as $450 (last week actually)
and what is this! you're linking Hardware Unboxed? are you feeling ok? you hate those guys!
Yeah wrong, sorry dude, Phenom II destroyed Core 2 in both price and performance back then. It was only when Core i whatever came out that AMD had no real competition for Intel's high end, and it never recovered till Zen. But Phenom II came out late, against the end of the Core 2 lineup and the beginning of the Core series, so it never really looked great, but it was a good CPU if you couldn't afford a Core i7 920 etc.
@Why_Me you're calling HWUnboxed a shill, now that's f£#@ing ironic. You may not be, I don't know, but your actions are considerably more shill-like than HWUB's IMHO.
The only reason why Intel needed Gracemont is because their P-core is quite area inefficient. AMD don't have such problems. They have a much more area efficient P-core. As they said, "AMD is the only team that can deliver a performance core and an efficient core in the same core". ;) And I think that's a much better approach for the desktop, laptop and server markets. Developing a performance core and deriving a power efficient core from it.
I'm also not sure what you're on, you seem to think this is a fan theory.
I'm seeing you spouting fan theories about Intel's hybrid approach being better than this theory of AMD's.
I'll await release reviews before I'm that outspoken, but I am already of the opinion Intel's way has merits but also has a slight inadequacy.
And was done out of necessity.
To beat AMD in single core, Intel had to build a massive core. Are they winning by much? No.
And to have any chance of competing on core count and multicore loads they couldn't fit enough big cores in.
And then there's the power use on those big guns.
wccftech.com/alleged-amd-ryzen-7000-zen-4-16-core-8-core-desktop-cpus-spotted/
videocardz.com/newz/amd-ryzen-7000-zen4-raphael-8-core-and-16-core-engineering-samples-have-been-spotted
Zen 3
Alder Lake
1x Zen 3 + L2: ~3.2 mm²
4x Gracemont + L2: ~8.8 mm² / ~2.2 mm² per core
1x Golden Cove + L2: ~7 mm² / as Locuza mentions "with the black bar" ~7.4 mm²
Now let's put these sizes into perspective according to the core performance. Unfortunately I didn't find single-core benchmarks of each core capped at a certain power target, which would be most useful to see what's possible with highly scalable core designs. But ComputerBase made a detailed test where we can find some interesting numbers, based on the average results of Cinebench R15/20/23 and POV-Ray.
8x Golden Cove, w/ SMT, 3.9 GHz: 81%
8x Golden Cove, w/o SMT, 3.9 GHz: 62%
8x Gracemont, 3.9 GHz: 43%
So, one Golden Cove core (including SMT) offers almost 90% higher performance than Gracemont at ~4 GHz.
Comparing to Zen 3 is a little more difficult.
1x Zen 3, 3.6 GHz: 90%
1x Golden Cove, 3.6 GHz: 100%
Unfortunately this is without the impact of SMT. But since the 12900K is also ~10% faster in MT than the 5800X @ 3.6 GHz in this review, let's assume a comparable SMT speedup for Zen 3.
Now put all the numbers together, baseline is Gracemont:
1x Gracemont (w/o SMT): 100% performance, 2.2 mm² => ~45.5% perf/mm²
1x Golden Cove (w/ SMT): 188% performance, 7 mm² (7.4 mm²) => ~26.9% (25.4%) perf/mm²
1x Zen 3 (w/ SMT): 169% performance, 3.2 mm² => ~52.8% perf/mm²
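For anyone who wants to check the arithmetic, here is a rough Python sketch of the calculation above. The areas and benchmark percentages are the ones quoted in this post; the constant and function names are made up for illustration, and the Zen 3 figure rests on the same assumption as above (SMT speedup comparable to Golden Cove). Because it doesn't round the intermediate values, the results come out within rounding distance of the numbers listed, not identical to them.

```python
# Core + L2 area in mm², as quoted above (Gracemont is a 4-core cluster).
AREA_MM2 = {
    "Gracemont": 8.8 / 4,
    "Golden Cove": 7.0,
    "Zen 3": 3.2,
}

# ComputerBase averages at 3.9 GHz, 8 cores each.
GOLDEN_COVE_SMT_SCORE = 81
GRACEMONT_SCORE = 43

# Zen 3 vs. Golden Cove at 3.6 GHz. ASSUMPTION: the same ratio holds with SMT on.
ZEN3_VS_GOLDEN_COVE = 0.90


def relative_performance():
    """Per-core performance in percent, with Gracemont as the 100% baseline."""
    golden_cove = GOLDEN_COVE_SMT_SCORE / GRACEMONT_SCORE * 100
    return {
        "Gracemont": 100.0,
        "Golden Cove": golden_cove,
        "Zen 3": golden_cove * ZEN3_VS_GOLDEN_COVE,
    }


def perf_per_mm2(perf, area):
    """Divide relative performance by die area for each core type."""
    return {core: perf[core] / area[core] for core in perf}


if __name__ == "__main__":
    perf = relative_performance()
    density = perf_per_mm2(perf, AREA_MM2)
    for core in perf:
        print(f"{core}: {perf[core]:.0f}% perf, {density[core]:.1f}% perf/mm²")
```

Swapping 7.0 for 7.4 in `AREA_MM2` gives the "with the black bar" variant for Golden Cove.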
So, I was not quite correct. According to these numbers Zen 3 isn't "almost as area efficient as Intel's Gracemont". Actually Zen 3 is more area efficient than Intel's Gracemont and about twice as area efficient as Golden Cove. Of course, this is a rough comparison. So, take these numbers with a grain of salt. But it clearly shows why AMD don't need separate cores. Current Zen already is very power and area efficient. And Zen 4 will improve that by a significant margin.