Sunday, February 28th 2021

AMD "Genoa" Expected to Cram Up to 96 Cores, MCM Imagined

AMD's next-generation EPYC enterprise processor that succeeds the upcoming 3rd Gen EPYC "Milan," codenamed "Genoa," is expected to be the first major platform update for AMD's enterprise platforms since the 2017 debut of the "Zen" based "Naples." Implementing the latest I/O interfaces, such as DDR5 memory and PCI-Express gen 5.0, the chip will also increase CPU core counts by 50% over "Milan," according to ExecutableFix on Twitter, a reliable source of semiconductor-industry rumors. To enable the goals of new I/O and increased core counts, AMD will transition to a new CPU socket type, the SP5. This is a 6,096-pin land grid array (LGA), and the "Genoa" MCM package on SP5 is imagined to be visibly larger than SP3-generation packages.

With the added fiberglass substrate real estate, AMD is expected to add more CPU chiplets to the package, and ExecutableFix expects the chiplet count to be increased to 12. AMD is expected to debut the "Zen 4" microarchitecture in the enterprise space with "Genoa," with the CPU chiplets expected to be built on the 5 nm EUV silicon fabrication node. Assuming the chiplets still only pack 8 cores apiece, "Genoa" could cram up to 96 cores per socket, or up to 192 logical processors, with SMT enabled.
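The core-count arithmetic above can be checked with a short sketch. The "Genoa" figures are the rumored ones from the article; the "Milan" baseline of 8 chiplets with 8 cores each is not stated in the article and is filled in here as the known shipping configuration. The function name `socket_totals` is made up for illustration.

```python
# Chiplet and core figures: "Genoa" values are rumored; "Milan" 8 x 8 is the
# baseline assumed here and is not taken from the article.
CORES_PER_CHIPLET = 8   # article's assumption: "Zen 4" chiplets keep 8 cores
THREADS_PER_CORE = 2    # SMT enabled


def socket_totals(chiplets, cores_per_chiplet=CORES_PER_CHIPLET,
                  threads_per_core=THREADS_PER_CORE):
    """Return (cores, logical processors) for one socket."""
    cores = chiplets * cores_per_chiplet
    return cores, cores * threads_per_core


milan_cores, milan_threads = socket_totals(8)    # "Milan": 8 chiplets
genoa_cores, genoa_threads = socket_totals(12)   # "Genoa": 12 chiplets (rumored)
increase_pct = (genoa_cores - milan_cores) / milan_cores * 100

print(f"Milan: {milan_cores} cores / {milan_threads} threads")
print(f"Genoa: {genoa_cores} cores / {genoa_threads} threads")
print(f"Core-count increase: {increase_pct:.0f}%")
```

If AMD instead moved to 12-core chiplets, as some commenters speculate, the same 96-core total would be reached with 8 chiplets: `socket_totals(8, cores_per_chiplet=12)`.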
The sIOD die (I/O complex) is another component with major changes. AMD is expected to increase the memory bus width by 50%, with the processor now expected to come with a 12-channel DDR5 memory interface, natively supporting DDR5-5333. The PCI-Express lane budget appears unchanged, with up to 128 lanes per socket, but the interface moves to the latest PCI-Express gen 5.0, which is expected to double bandwidth over the current PCIe gen 4.0. In a bid to reduce the die-size of the sIOD, and more importantly its TDP, AMD might finally build it on newer silicon fabrication nodes, such as 7 nm. ExecutableFix expects the overall TDP of "Genoa" to be around 320 W, configurable up to 400 W.
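To put the I/O numbers in perspective, here is a rough sketch of theoretical peak bandwidth. It assumes 64-bit (8-byte) memory channels and the 128b/130b line encoding used by PCIe gen 4.0 (16 GT/s) and gen 5.0 (32 GT/s). The 8-channel DDR4-3200 baseline for "Milan" is not from the article and is included only for comparison. These are paper maximums, not measured throughput, and the function names are illustrative.

```python
def dram_peak_gbs(channels, mt_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s, assuming 64-bit channels."""
    return channels * mt_per_s * bytes_per_transfer / 1000


def pcie_peak_gbs(lanes, gt_per_s):
    """Theoretical peak PCIe bandwidth per direction in GB/s (128b/130b)."""
    return lanes * gt_per_s * (128 / 130) / 8


milan_mem = dram_peak_gbs(8, 3200)    # assumed baseline: 8-channel DDR4-3200
genoa_mem = dram_peak_gbs(12, 5333)   # rumored: 12-channel DDR5-5333
pcie4 = pcie_peak_gbs(128, 16)        # 128 lanes of PCIe gen 4.0
pcie5 = pcie_peak_gbs(128, 32)        # 128 lanes of PCIe gen 5.0

print(f"Memory: {milan_mem:.1f} GB/s -> {genoa_mem:.1f} GB/s "
      f"({genoa_mem / milan_mem:.2f}x)")
print(f"PCIe:   {pcie4:.1f} GB/s -> {pcie5:.1f} GB/s per direction "
      f"({pcie5 / pcie4:.1f}x)")
```

The 50% figure in the article refers to channel count alone (12 versus 8); the higher DDR5 transfer rate is what pushes the combined memory bandwidth gain well beyond that.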

AMD is expected to debut "Genoa" no earlier than mid-to-late 2022, as it has yet to monetize the 3rd Gen EPYC "Milan."
Sources: ExecutableFix (Twitter) 1, ExecutableFix (Twitter) 2, via Videocardz

26 Comments on AMD "Genoa" Expected to Cram Up to 96 Cores, MCM Imagined

#1
Mussels
Moderprator
Look at all these big juicy numbers
#4
windwhirl
evernessince wrote: What the heck, that thing is a monster.
Monster CPU to deal with monster tasks.
#5
bobbybluz
Mussels wrote: Look at all these big juicy numbers
I've got a 22c 44t Xeon coming tomorrow. Considering my age and a few other factors it may be the last CPU I ever buy. I have no clue as to what excuse I could find for buying a 96c CPU.
#6
Patriot
bobbybluz wrote: I've got a 22c 44t Xeon coming tomorrow. Considering my age and a few other factors it may be the last CPU I ever buy. I have no clue as to what excuse I could find for buying a 96c CPU.
Probably won't... but they will probably have 16-24c starting models that would give power savings over what you currently use.

Core count is just one reason to upgrade, consolidation of many boxes to few for power savings is another, and more of everything else... I/O memory... w/e.

I am just happy to see AMD adopting OAM for compute and OCP for NICs; it's nice to see open standards being used.
Frontier or El Capitan MI200 OAM + Genoa
#7
Gmr_Chick
Every time I see these chunky CPUs, I think of this song :D

#8
Mussels
Moderprator
I just can't help but think how much AMD has changed the server world

What, 5 years ago we had 8 core CPUs in the server space and now thanks to AMD that number's skyrocketing - and we all benefit in the background from faster, more power efficient servers
#9
watzupken
Mussels wrote: I just can't help but think how much AMD has changed the server world

What, 5 years ago we had 8 core CPUs in the server space and now thanks to AMD that number's skyrocketing - and we all benefit in the background from faster, more power efficient servers
I feel it is not just the server market that has benefited from AMD's push for more cores. For the longest time, Intel have been selling consumers dual and quad core processors. Anything more than that, be prepared to pay an absurd premium to Intel. That changed when AMD released Ryzen. This also applies to the professional and enterprise space.
#10
londiste
Mussels wrote: I just can't help but think how much AMD has changed the server world

What, 5 years ago we had 8 core CPUs in the server space and now thanks to AMD that number's skyrocketing - and we all benefit in the background from faster, more power efficient servers
EPYC still has 8-core models available.
5 years ago Intel had Broadwell-based Xeons? 4-24 cores.
Bulldozer/Piledriver Opterons had 4-16 cores.
#11
PooPipeBoy
Mussels wrote: I just can't help but think how much AMD has changed the server world

What, 5 years ago we had 8 core CPUs in the server space and now thanks to AMD that number's skyrocketing - and we all benefit in the background from faster, more power efficient servers
One thing I'm glad to see is that simultaneous multi-threading is no longer used for market segmentation. Now every consumer processor has SMT and you don't need to pay extra to get it. That's a great thing these days when SMT is more effective and has less performance overhead.
#12
Wirko
Have we seen any confirmation that it's the number of chiplets that's going up, not the number of cores on each chiplet?

Cramming 12 cores on a chiplet would make sense on 5 nm if the yield is good enough - and thanks to Apple's 1-year head start, it will probably be good enough. It would reduce the amount of (slow) chiplet-to-chiplet communication and the complexity of the (already giant) I/O die.
#13
TumbleGeorge
Why so many watts? 5nm is more power efficient. And for 96 core models maybe the base frequency will be a little reduced.
#14
Mussels
Moderprator
londiste wrote: EPYC still has 8-core models available.
5 years ago Intel had Broadwell-based Xeons? 4-24 cores.
Bulldozer/Piledriver Opterons had 4-16 cores.
24 to 96 is one hell of a change
#15
londiste
Mussels wrote: 24 to 96 is one hell of a change
Yes, but it is also 6 years.
24 in 2016, 32 in 2017, 64 in 2019, 96 in 2022.

Before that it was 18 in 2014, 15 in 2014 and 8 back in 2012.
AMD had Opterons since 2011-2012 that were 16 thread, 8 module CMT.
#16
DeathtoGnomes
I'm still waiting for the day when end-users can install extra chiplets themselves.
#17
londiste
DeathtoGnomes wrote: I'm still waiting for the day when end-users can install extra chiplets themselves.
Why would an end user do that? Chiplet requirements for connection and such are way too tight for any reasonable way of adding them.
But larger picture - you have had that possibility for a while but not necessarily for end-users in consumer space. Multi-socket servers are an implementation of exactly the same idea.

You can have 2x 64-core EPYCs in one system today. And if you happen to operate a nuclear power plant, 8x 28-core Xeons :D
#18
Aquinus
Resident Wat-man
PooPipeBoy wrote: One thing I'm glad to see is that simultaneous multi-threading is no longer used for market segmentation. Now every consumer processor has SMT and you don't need to pay extra to get it. That's a great thing these days when SMT is more effective and has less performance overhead.
Nah, they still do that. Just not as often. I just bought my wife an HP laptop with a 4700U in it and it's 8c/8t.
#19
Mussels
Moderprator
londiste wrote: Yes, but it is also 6 years.
24 in 2016, 32 in 2017, 64 in 2019, 96 in 2022.

Before that it was 18 in 2014, 15 in 2014 and 8 back in 2012.
AMD had Opterons since 2011-2012 that were 16 thread, 8 module CMT.
before that we went from 1 core to 4 cores in like 20 damn years
#20
Aquinus
Resident Wat-man
Mussels wrote: before that we went from 1 core to 4 cores in like 20 damn years
Yes, but older computers relied on there being just one core. AGP plus Multicore CPU anyone? Once we started seeing dual cores, we started seeing quads not too long after. Even AMD kept going up to 6c with the Phenom II chips near the end. It's the stretch to get beyond 4 cores that took a really long time before going mainstream and this was the time that AMD started flailing in the CPU market and progress essentially stagnated. It wasn't until recently that AMD made >4c realistic from a cost and power budget standpoint, but as soon as that happened it started becoming much more normal to see 6c and 8c chips not just in desktops, but in laptops too. You also have to consider power consumption numbers which have improved quite a bit over the last decade.

Either way, it's a little more complicated than just getting more cores. There have been process improvements, architectural improvements, improvements in the design software used to build these chips, etc. I'm just glad that it's normal to see 8c chips in laptops these days.
#21
pjl321
I'm disappointed that the CCD is remaining 8 cores and they are just increasing the number of chiplets rather than moving to 12 core CCDs like originally rumoured.
I wonder what this will mean for AM5 chips?
#22
TumbleGeorge
pjl321 wrote: I'm disappointed that the CCD is remaining 8 cores and they are just increasing the number of chiplets rather than moving to 12 core CCDs like originally rumoured.
I wonder what this will mean for AM5 chips?
Why not? More chiplets=more infinity fabrics=more latencies and more outer chiplet complexity=more pins=more price and glory for Intel? :D
#23
InVasMani
londiste wrote: Why would an end user do that? Chiplet requirements for connection and such are way too tight for any reasonable way of adding them.
But larger picture - you have had that possibility for a while but not necessarily for end-users in consumer space. Multi-socket servers are an implementation of exactly the same idea.

You can have 2x 64-core EPYCs in one system today. And if you happen to operate a nuclear power plant, 8x 28-core Xeons :D
Honestly, why couldn't they put multiple chips each on a substrate and fit them into the same socket and latch them? That would actually be great; in fact, I think it would be more ideal. You might have 4-8 of them all in a row or column running alongside the memory. Which could mean shorter traces to the CPU, how nice, right? Plus with bank groups on memory chips, maybe each chip substrate running closely alongside them taps into them directly. You could also mix and match the chips on the substrate, CPU, APU, or FPGA, to taste. So long as they side link and get registered as one physical chip, what's the problem? Perhaps they'd also need to sit on top of another substrate so that the OS itself sees them side by side as one physical chip entity, not sure.

The plausibility of it though is quite cool to think about. I've actually said similar before on these forums, but I hadn't quite envisioned it the way I just described to the same extent, and the more I think about it, to me that seems one of the most optimal ways to do it. I think mounting the GPUs closer to the CPU/memory would also be beneficial. Which, with GPUs going MCM, could be done the same way on the opposite side of the memory banks. You might even channel interleave them, GPU/CPU/GPU/CPU, along with the memory itself. If AMD were to do that, I'd suggest a black, red, and grey substrate perhaps to differentiate between CPU/GPU/FPGA chips.

What I've mentioned before is more of a return to the older arcade board setup with socketed ROM chips, which really this would be a condensed form of, but it would be clean with short memory traces and highly adaptable based on what sockets you integrate for what reasons.
#24
Mussels
Moderprator
pjl321 wrote: I'm disappointed that the CCD is remaining 8 cores and they are just increasing the number of chiplets rather than moving to 12 core CCDs like originally rumoured.
I wonder what this will mean for AM5 chips?
this isn't an official post, it's just a concept - so we may still see that
#25
pjl321
Mussels wrote: this isn't an official post, it's just a concept - so we may still see that
It certainly makes more sense to me to increase the core count within the CCD rather than add more chiplets, from a performance, latency, etc. standpoint. But maybe it is really hard to do, I don't know.
Copyright © 2004-2021 www.techpowerup.com. All rights reserved.
All trademarks used are properties of their respective owners.