Thursday, February 27th 2020

AMD Gives Itself Massive Cost-cutting Headroom with the Chiplet Design

At its 2020 IEEE ISSCC keynote, AMD presented two slides that detail the extent of the cost savings yielded by its bold decision to embrace the MCM (multi-chip module) approach not just for its enterprise and HEDT processors, but also for its mainstream desktop ones. By confining to 7 nm only those components that tangibly benefit from a cutting-edge fabrication process, namely the CPU cores, while the remaining components sit on a relatively inexpensive 12 nm I/O die, AMD maximizes its 7 nm foundry allocation, using it to produce small 8-core CCDs (CPU complex dies) that add up to its target core counts. With this approach, AMD is able to cram up to 16 cores onto its AM4 desktop socket using two chiplets, and up to 64 cores using eight chiplets on its SP3r3 and sTRX4 sockets.

In the slides below, AMD compares the cost of its current 7 nm + 12 nm MCM approach to that of a hypothetical monolithic die it would have had to build on 7 nm (including the I/O components). The slides suggest that a single-chiplet "Matisse" MCM (e.g. Ryzen 7 3700X) costs about 40% less than the double-chiplet "Matisse" (e.g. Ryzen 9 3950X). Had AMD opted to build a monolithic 7 nm die with 8 cores and all the I/O components of the I/O die, such a die would cost roughly 50% more than the current 1x CCD + IOD solution; a monolithic 7 nm die with 16 cores and I/O components would cost 125% more. AMD hence enjoys massive headroom for cost-cutting. The price of the flagship 3950X could be close to halved (from its current $749 MSRP), and AMD can turn up the heat on Intel's upcoming Core i9-10900K by significantly lowering the price of its 12-core 3900X from its current $499 MSRP. The company also enjoys more price-cutting headroom for its 6-core Ryzen 5 SKUs than it did with previous-generation Ryzen 5 parts based on monolithic dies.
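As a rough sanity check of those percentages, the figures can be normalized to the single-chiplet (1x CCD + IOD) package. The normalization, and reading the "125% more" figure against the 1-CCD baseline like the 8-core figure, are assumptions for illustration; absolute dollar costs were not disclosed.

```python
# Relative cost figures as reported from AMD's ISSCC slides, normalized so the
# single-chiplet (1x CCD + IOD) "Matisse" package = 1.0. Absolute costs unknown.

COST_1CCD_MCM = 1.0          # e.g. Ryzen 7 3700X (1x CCD + IOD)
COST_2CCD_MCM = 1.0 / 0.6    # the 1-CCD package costs ~40% less than the 2-CCD one
COST_MONO_8C = 1.0 * 1.5     # hypothetical monolithic 8-core + I/O: ~50% more
COST_MONO_16C = 1.0 * 2.25   # hypothetical monolithic 16-core + I/O: ~125% more (assumed baseline)

def chiplet_saving(mono: float, mcm: float) -> float:
    """Fraction of the hypothetical monolithic cost that the MCM design saves."""
    return 1.0 - mcm / mono

print(f"8-core:  MCM saves {chiplet_saving(COST_MONO_8C, COST_1CCD_MCM):.0%}")   # 33%
print(f"16-core: MCM saves {chiplet_saving(COST_MONO_16C, COST_2CCD_MCM):.0%}")  # 26%
```

Under these assumptions the chiplet approach saves about a third of the silicon cost at 8 cores, and about a quarter at 16 cores, which is the headroom the article refers to.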
Source: Guru3D

89 Comments on AMD Gives Itself Massive Cost-cutting Headroom with the Chiplet Design

#26
Vya Domus
londiste
@Vya Domus none of these processors possibly with the exception of 64-core EPYC are physically impossible to build. Financially not viable... probably :)
No, they really were impossible to build on that process in a monolithic format. For instance, the thermals would likely have been unmanageable due to density, so clocks would have been lower. Not to mention that yields would have been so bad there was no way they could have accumulated any real volume of chips, irrespective of cost. A year after Rome's launch there are still no large chips on TSMC's first-generation 7 nm process; that's not a coincidence.
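The yield point can be sketched with the classic Poisson die-yield model, Y = exp(-D·A): yield falls off exponentially with die area, so small chiplets fare far better than one large monolithic die on an immature process. The defect density and die areas below are illustrative assumptions, not AMD or TSMC figures.

```python
import math

# Assumed defect density for an early 7 nm process (illustrative, not TSMC data)
DEFECT_DENSITY = 0.4  # defects per cm^2

def die_yield(area_mm2: float, d0: float = DEFECT_DENSITY) -> float:
    """Fraction of defect-free dies under a simple Poisson defect model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-d0 * area_cm2)

ccd = die_yield(74)    # ~74 mm^2, roughly a Zen 2 CCD
mono = die_yield(250)  # hypothetical larger monolithic core + I/O die

print(f"74 mm^2 chiplet yield:     {ccd:.0%}")   # 74%
print(f"250 mm^2 monolithic yield: {mono:.0%}")  # 37%
```

Even under this simple model the small chiplet yields roughly twice as well, and the gap widens rapidly as the monolithic die grows toward reticle-limit sizes.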
Posted on Reply
#27
kapone32
bug
Both architectures have pros and cons, there's no need to keep bringing that up.
I doubt clocks are influenced by the number of chips, but latency is.

Still, I'm a little surprised no one is mentioning what looks to me like the #1 space saving measure AMD took: not wasting 30% die space on an IGP.
I was objectively responding to the question. I should have specified a single core.

That actually confirms my hope that they could make an 8/16 core APU.
Posted on Reply
#28
mtcn77
kapone32
Monolithic is better for gaming at 1080P.
Intel is quick because of ringbus+unified scheduler. AMD dilates those for better multitasking.
Posted on Reply
#29
kapone32
mtcn77
Intel is quick because of ringbus+unified scheduler. AMD dilates those for better multitasking.
The former is monolithic in design; the latter is MCM. As Bug said, though, there are pros and cons to both. New generations basically make it a moot point anyway.
Posted on Reply
#30
mtcn77
kapone32
The former is monolithic in design; the latter is MCM. As Bug said, though, there are pros and cons to both. New generations basically make it a moot point anyway.
AMD was faster in previous generations too, even though there wasn't competition in that regard between Bulldozer and Sandy Bridge. Any monolithic chip comparison is the same; it is a confounder classification. Fact is, you can increase L3 if you displace it to the I/O chip.
Such things improve task-switching performance. If you have a unified scheduler you load back the program runtime, afaik. There isn't any separate second-thread discretization.
Posted on Reply
#31
londiste
mtcn77
Fact is, you can increase L3 if you displace it to the io chip.
No, you can't. The primary reason for Ryzen 3000's increased memory latency is the chiplet design and having to go over IF to the IOD to reach the memory controller. This adds 7-8 ns of latency (compared to Ryzen 2000, not Intel) to memory access. L3 cache access latency is on the order of 10 ns...
Posted on Reply
#32
kapone32
mtcn77
AMD was faster in previous generations too, even though there wasn't competition in that regard between Bulldozer and Sandy Bridge. Any monolithic chip comparison is the same; it is a confounder classification. Fact is, you can increase L3 if you displace it to the I/O chip.
Such things improve task-switching performance. If you have a unified scheduler you load back the program runtime, afaik. There isn't any separate second-thread discretization.
Yes, both Intel and AMD used monolithic designs until Bulldozer, which was the first iteration of multi-chip design (rudimentary and latency-ridden) but showed the potential of multi-core, as even Bulldozer was faster than Sandy Bridge in multi-threaded apps. With Zen, AMD went even further down the MCM rabbit hole and found a way to make it viable (thanks, Jim Keller). Don't be surprised if we see 128-core consumer CPUs as a result, as the node shrinks and more chiplets can be added.
Posted on Reply
#33
mtcn77
londiste
No, you can't. The primary reason for Ryzen 3000's increased memory latency is the chiplet design and having to go over IF to the IOD to reach the memory controller. This adds 7-8 ns of latency (compared to Ryzen 2000, not Intel) to memory access. L3 cache access latency is on the order of 10 ns...
There are tags and associativity dividers to that search latency. Besides, you get successively better ST from that L3 increase, so it pays for itself. Same ordeal, ST vs MT performance. Ryzen is a dilation to that monolithiasis constipation.
Posted on Reply
#34
londiste
mtcn77
Besides, you get successively better ST from that L3 increase, so it pays itself out.
Do you get better ST from that L3 cache increase? We should see when Ryzen 4000 APUs get tested.

kapone32
Yes both Intel and AMD used monolithic designs until Bulldozer which was the first iteration of multi chip design (rudimentary and latency ridden) but showed the potential of multi core as even Bulldozer was faster than Sandybridge in multi threaded apps.
- Bulldozer was completely monolithic.
- Contemporary multi-chip design first probably goes to Pentium D.
Posted on Reply
#35
ARF
kapone32
But then how would we help AMD force price corrections for Intel and Nvidia products? That money will go into R&D (They are a tech company afterall). It will be interesting to see where AMD does in both the GPU and CPU space now that they are positive in terms of cash flow and revenue.
Well, but that R&D money... since 2015 AMD hasn't launched a top GPU (the R9 Fury X was the last one), and between 2011 and 2017 they weren't competitive in the CPU space either.

So, don't give them too much credit and trust that they use the money in a proper way.

Still waiting for an RTX 2080 Ti competitor; that card launched back in 2018, soon to be 2 years ago already.
Posted on Reply
#36
bug
ARF
Well, but that R&D money... since 2015 AMD hasn't launched a top GPU (the R9 Fury X was the last one), and between 2011 and 2017 they weren't competitive in the CPU space either.

So, don't give them too much credit and trust that they use the money in a proper way.

Still waiting for an RTX 2080 Ti competitor; that card launched back in 2018, soon to be 2 years ago already.
The past is past. These days AMD seems to be doing things right. They're now competitive enough on the desktop and in the server space. They still have to win mobile and GPU, let's just wait and see which one they focus on first. Cause they will ;)
Posted on Reply
#37
ppn
The second flag is that we know they can produce at least 251 mm² chips, and 430 mm² on 7 nm+. That's enough for 48 cores plus an IMC: by cutting the L3 cache in half, they could fit the memory controller on the same die as the cores and lower the latency significantly. The price would go up, but compared to what? This chiplet arrangement puts a copy of the L3 in every die that doesn't need to exist and causes tremendous latencies all over the place. The 64-core could be replaced by a 48-core single-CCX part with a single L3 and IMC at half the size and still perform as well.
Posted on Reply
#38
Mats
kapone32
Yes both Intel and AMD used monolithic designs until Bulldozer which was the first iteration of multi chip design..
What..

To name a few, Intel Smithfield (2005), Kentsfield (2006), Clarkdale (2010) and desktop Broadwell (2015) are MCM, although not in the same way as Ryzen 3000.

AMD Bulldozer, Piledriver, Ryzen 1000, Ryzen 2000 are all single die
Posted on Reply
#39
bug
kapone32
Yes both Intel and AMD used monolithic designs until Bulldozer which was the first iteration of multi chip design (rudimentary and latency ridden)
You must have missed Pentium D ;)
Back then AMD mocked Intel for not using true multi-core, but "gluing together" CPUs instead. Tbh, Pentium D was horrible.
Posted on Reply
#40
Mats
bug
Back then AMD mocked Intel for not using true multi-core, but "gluing together" CPUs instead. Tbh, Pentium D was horrible.
Nobody complained about Core 2 Quad, also not true multi-core.. not even AMD. :D
Posted on Reply
#41
bug
Mats
Nobody complained about Core 2 Quad, also not true multi-core.. not even AMD. :D
Yeah, well, probably because Core 2 Quad was far from terrible? ;)
Posted on Reply
#42
kapone32
londiste
Do you get better ST from that L3 cache increase? We should see when Ryzen 4000 APUs get tested.

- Bulldozer was completely monolithic.
- Contemporary multi-chip design first probably goes to Pentium D.
bug
You must have missed Pentium D ;)
Back then AMD mocked Intel for not using true multi-core, but "gluing together" CPUs instead. Tbh, Pentium D was horrible.
:toast: That is funny considering Intel said the same thing about Zen.
Posted on Reply
#43
ARF
bug
You must have missed Pentium D ;)
Back then AMD mocked Intel for not using true multi-core, but "gluing together" CPUs instead. Tbh, Pentium D was horrible.
I still use an Athlon 64 X2 to this day, and for normal everyday use it's adequate - office things, movies, even some gaming: Far Cry, the original Crysis...
Posted on Reply
#44
Mats
bug
Yeah, well, probably because Core 2 Quad was far from terrible? ;)
Exactly. Mocking the dual-die design of the Pentium D really ignores the giant elephant in the room called NetBurst.
Posted on Reply
#45
bug
ARF
I still use an Athlon 64 X2 to this day, and for normal everyday use it's adequate - office things, movies, even some gaming: Far Cry, the original Crysis...
Honestly, I feel any 2 GHz CPU is good enough to browse the web and edit documents, even if you go back to a single-core Athlon XP. Just look at how many people still use really, really old laptops to accomplish those tasks.
Posted on Reply
#46
mtcn77
londiste
Do you get better ST from that L3 cache increase? We should see when Ryzen 4000 APUs get tested.
It is all about cache hit rate. Higher cache levels run slower and use less power to do the same job. Past a certain size, a fully associative lookup becomes faster in a larger, slower, more associative cache.
I had this figured out back in the Broadwell-era Iris Pro lineup. Did you know Broadwell G was as quick as the 7700K in ST? Considering one is hall-of-fame material, the clock-frequency difference between them underscores the point.
Posted on Reply
#47
ARF
bug
Honestly, I feel any 2GHz CPU is good enough to browse the web and edit documents, even if you go back to a single core AthlonXP. Just look at how many people still use really, really old laptops to accomplish those tasks.
Athlon 64 X2 4400+ is a serious CPU, though. It's 2.2 GHz across both cores and has 2 MB of L2 cache.

It's like a dual core 4.4 GHz Pentium 4 X2.

Slower dual cores like Core 2 Duo at 1.5 GHz will be horrible for browsing.

You have heavy sites like Facebook and YouTube which, even if only for a moment during the initial loading of the content, use all the available resources of your CPU.
They can 100% stress even an 8-core Ryzen 7 or Core i9.
Posted on Reply
#48
Jism
Raendor
So, does this mean the 4000 series will be priced cheaper than current 3000 prices, or will they also discount the 3000 series further? I was thinking about getting a 3600 and moving my trusty 6700K to an HTPC, but if prices will get even better, I don't mind holding on a bit.
It's just showing how the MCM approach is cheaper for AMD to fabricate than the monolithic design of one big chip.

Cheaper production means more margin and thus more profit. Good for shareholders, and good for consumers and enterprises, because if Intel attempts a price cut, AMD can pretty much answer it and still make a profit.
Posted on Reply
#49
bug
ARF
Athlon 64 X2 4400+ is a serious CPU, though. It's 2.2 GHz across both cores and has 2 MB of L2 cache.
It's like a dual core 4.4 GHz Pentium 4 X2.
You don't have to tell me that, I used to rock a 4200+ ;)
ARF
Slower dual cores like Core 2 Duo at 1.5 GHz will be horrible for browsing.
If you're not talking mobile, I'm pretty sure a C2D @ 1.5 GHz will beat an X2 at ~2 GHz. Core was about IPC, first and foremost.
ARF
You have heavy sites like Facebook and YouTube which, even if only for a moment during the initial loading of the content, use all the available resources of your CPU.
They can 100% stress even an 8-core Ryzen 7 or Core i9.
Well, yeah, an ancient CPU will not give a first class experience. But throw in a script blocker (e.g. NoScript) so not everyone and their grandma will run scripts in your browser and the web becomes bearable again ;)
Posted on Reply
#50
TheGuruStud
Mats
Nobody complained about Core 2 Quad, also not true multi-core.. not even AMD. :D
I did. It was stupid expensive. I waited for Phenom II and OCed it to 4 GHz. I used that CPU for several years and it ran every game flawlessly (even with CrossFire) and didn't do too badly in multimedia either.
Posted on Reply