
AMD "Milan-X" Processor Could Use Stacked Dies with X3D Packaging Technology

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,231 (0.91/day)
AMD is in a constant process of processor development, and there are always new technologies on the horizon. Back in March of 2020, the company revealed that it is working on a new X3D packaging technology that integrates both 2.5D and 3D approaches to pack semiconductor dies together as tightly as possible. Today, we are finally getting some more information about the X3D technology, as we have the first codename of a processor featuring this advanced packaging technology. According to David Schor, AMD is working on a CPU that uses X3D tech with stacked dies, and it is called Milan-X.

The Milan-X CPU is AMD's upcoming product designed for data center usage. The rumors suggest that the CPU is designed for heavy bandwidth and presumably a lot of computing power. According to ExecutableFix, the CPU uses a Genesis-IO die to power the connectivity, which is an IO die from EPYC Zen 3 processors. While this solution is in the works, we don't know the exact launch date of the processor. However, we could hear more about it in AMD's virtual keynote at Computex 2021. For now, take this rumor with a grain of salt.


 
Joined
Dec 3, 2014
Messages
338 (0.10/day)
Location
Marabá - Pará - Brazil
System Name KarymidoN TitaN
Processor AMD Ryzen 7 5700X
Motherboard ASUS TUF X570
Cooling Custom Watercooling Loop
Memory 2x Kingston FURY RGB 16gb @ 3200mhz 18-20-20-39
Video Card(s) MSI GTX 1070 GAMING X 8GB
Storage Kingston NV2 1TB| 4TB HDD
Display(s) 4X 1080P LG Monitors
Case Thermaltake Core V71
Power Supply Corsair TX 600
Mouse Logitech G300S
Intel RN:
That glue is Illegal and i don't accept the result of this election #notmyglue #MakeProcessorsGreatAgain

Jokes aside, it looks cool. I hope it doesn't come at a latency cost or with huge power draw/temps, but it's for servers, so they can figure out a way to cool it.
 
Joined
Nov 23, 2020
Messages
540 (0.43/day)
Location
Not Chicago, Illinois
System Name Desktop-TJ84TBK
Processor Ryzen 5 3600
Motherboard Asus ROG Strix B350-F Gaming
Cooling ARCTIC Liquid Freezer II 120mm, Noctua NF-F12
Memory B-Die 2x8GB 3200 CL14, Vengeance LPX 2x8GB 3200 CL16, OC'd to 3333 MT/s C16-16-16-32 tRC 48
Video Card(s) PNY GTX 690
Storage Crucial MX500 1TB, MX500 500GB, WD Blue 1TB, WD Black 2TB, WD Caviar Green 3TB, Intel Optane 16GB
Display(s) Sceptre M25 1080p200, ASUS 1080p74, Apple Studio Display M7649 17"
Case Rosewill CRUISER Black Gaming
Audio Device(s) SupremeFX S1220A
Power Supply Seasonic FOCUS GM-750
Mouse Kensington K72369
Keyboard Razer BlackWidow Ultimate 2013
Software Windows 10 Home 64-bit, macOS 11.7.8
Benchmark Scores are good
That's cool.
I bet this is how we'll get 128+-core Threadrippers.
"Threadripper 6990X, 256 cores, 512 threads, up to 4TB of DDR5 RAM in 16-channel, PCIe 6.0, MSRP $6000"
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
Interesting, but I wonder how they will solve the cooling issue of these chips.


Will we see actual high clocks, or will it be a 256-core/32-chiplet 1 GHz low-voltage CPU? Can't wait to see how this will behave. Even if this should be taken with a grain of salt, we will see stacked chips in the near future.
 
Joined
Dec 30, 2010
Messages
2,098 (0.43/day)
Interesting, but I wonder how they will solve the cooling issue of these chips.


Will we see actual high clocks, or will it be a 256-core/32-chiplet 1 GHz low-voltage CPU? Can't wait to see how this will behave. Even if this should be taken with a grain of salt, we will see stacked chips in the near future.

It's stacked in such a way that all the heat transfers easily to where it should go, rather than getting stuck in the stack and burning any components around it.

They use traces to remove the heat from those areas. IBM even experimented with such traces, basically providing watercooling at the nm level.

Of course you can go for high clocks; it's just a nearly impossible task when working with 64 cores and beyond. For high clocks you need a less dense chip if you want to keep power consumption and/or cooling in check.
 
Joined
Feb 20, 2019
Messages
7,304 (3.86/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
Laws of physics still apply so this approach will likely be limited by thermal density, but four stacked dies at 1GHz using 10-15W each rather than one 4GHz die running at 105W is still an improvement and for stuff that scales well with core counts there are no immediate downsides other than cost.

Ramping up clockspeed requires more voltage, and since dynamic power is proportional to the square of the voltage, a fast and narrow solution will always be less efficient than a (fully-utilised) slow and wide solution. That's Datacenter 101.
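
To put rough numbers on the fast-and-narrow versus slow-and-wide argument, here is a minimal sketch using the textbook CMOS switching-power approximation P = C * V^2 * f. The capacitance and both voltages are made-up illustrative values, not measurements of any AMD part, and static (leakage) power is ignored.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Approximate CMOS switching power in watts: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# All three constants below are assumptions chosen for illustration only.
C_EFF = 1.0e-8      # effective switched capacitance per die, in farads
V_FAST = 1.30       # voltage assumed necessary to hold 4 GHz
V_SLOW = 0.70       # voltage assumed sufficient for 1 GHz

# Fast and narrow: one die at 4 GHz.
fast_total = dynamic_power(C_EFF, V_FAST, 4.0e9)

# Slow and wide: four dies at 1 GHz each (same aggregate cycles per second).
wide_total = 4 * dynamic_power(C_EFF, V_SLOW, 1.0e9)

print(f"fast and narrow: {fast_total:.1f} W")
print(f"slow and wide:   {wide_total:.1f} W")
print(f"ratio:           {fast_total / wide_total:.2f}x")
```

With the aggregate frequency held equal, the ratio reduces to (V_FAST / V_SLOW)^2, which is why the voltage needed for the higher clock dominates the comparison. This only holds for workloads that actually scale across the extra dies.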
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
It's stacked in such a way that all the heat transfers easily to where it should go, rather than getting stuck in the stack and burning any components around it.

They use traces to remove the heat from those areas. IBM even experimented with such traces, basically providing watercooling at the nm level.

Of course you can go for high clocks; it's just a nearly impossible task when working with 64 cores and beyond. For high clocks you need a less dense chip if you want to keep power consumption and/or cooling in check.
I have seen the patents on this technology. I am just not sure how close they are to mass production. I hope that you are right. But like Chrispy_ says, if they stack four 10-15 watt chips, it's only 40-60 watts per stack. When we see what low-power Zen APUs and the Apple M1 can do with such a low TDP, they can probably do something powerful. Also, it would mean these chiplets are different from the ones being used right now, as they would need to interface from the top and bottom.

But we will see; it's still unclear what these dies will be. It's not unlikely that it would be HBM or something else.
 
Joined
Feb 26, 2020
Messages
33 (0.02/day)
I remember seeing that image as part of a larger slide from this Anandtech article from last year.

That article seemed to refer to that diagram as four compute chiplets and the stacks being HBM memory, similar to a CPU version of the Vega/Radeon VII GPU designs. Cooling an HBM stack is probably less challenging than trying to cool a stack of CPU chiplets clocked to a high frequency.

Perhaps the HBM could act as a form of L4 cache, similar to what Intel did with Broadwell's eDRAM?
 

ip2k

New Member
Joined
May 26, 2021
Messages
1 (0.00/day)
That's cool.
I bet this is how we'll get 128+-core Threadrippers.
"Threadripper 6990X, 256 cores, 512 threads, up to 4TB of DDR5 RAM in 16-channel, PCIe 6.0, MSRP $6000"
What’s funny is that if they did launch such a CPU, they could probably get $60k for it and it would be price / perf / space competitive if they could make it work with not-too-exotic cooling that OEMs could easily integrate.

The latest and greatest 40-core Intel® Xeon® Platinum 8380 Processor (60M Cache, 2.30 GHz) is $6743.00 MSRP, or $168.57/core. The 9282 is $13012 for 56 cores ($232.35/core), but that's a previous-gen product, so forget it. 256 cores for $60k would be an easy sell for 2U2P and a game-changer for density. Doing it in 1U2P or smaller blade systems would be just incredible.
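
The per-core figures above are easy to re-run with a few lines of Python. This is only a sketch of the arithmetic: the two Xeon prices and core counts are the ones quoted in this post, and the 256-core entry is the purely hypothetical $60k part being discussed.

```python
def price_per_core(price_usd, cores):
    """Return the list price divided by the core count, in USD per core."""
    return price_usd / cores


# (price in USD, core count); the last entry is hypothetical, not a real product.
parts = {
    "Xeon Platinum 8380": (6743.00, 40),
    "Xeon Platinum 9282": (13012.00, 56),
    "hypothetical 256-core part": (60000.00, 256),
}

for name, (price, cores) in parts.items():
    print(f"{name}: ${price_per_core(price, cores):.2f}/core")
```

At $60k the hypothetical part works out to roughly $234 per core, about the same per-core price as the 9282, while packing more than four times the cores into one socket.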
 
Joined
Apr 12, 2013
Messages
6,750 (1.67/day)
Ramping up clockspeed requires more voltage, and since dynamic power is proportional to the square of the voltage, a fast and narrow solution will always be less efficient than a (fully-utilised) slow and wide solution. That's Datacenter 101.
The frequency/voltage curve for Ryzen is usually optimal in the 2~3 GHz range; different chips can have slightly different curves depending on the binning. So no, AMD can increase the clock speeds a fair bit above 1 GHz without increasing power consumption or heat too much. I'm still skeptical about the rumor, though, & don't see how Milan-X is anything but vaporware. Btw, what about everyone's favorite, HBM? o_O
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
The frequency/voltage curve for Ryzen is usually optimal in the 2~3 GHz range; different chips can have slightly different curves depending on the binning. So no, AMD can increase the clock speeds a fair bit above 1 GHz without increasing power consumption or heat too much. I'm still skeptical about the rumor, though, & don't see how Milan-X is anything but vaporware. Btw, what about everyone's favorite, HBM? o_O
You are right about the 2-3 GHz range; my 1 GHz was more of an exaggeration than a real point.

But I think this will happen in the near future. This is the next big thing. I do not think we will see a server CPU with HBM. But I could see a custom chip with as many cores as possible + HBM being put in an HPC chip for a future supercomputer doing massive AVX-512 calculations while remaining as power efficient as possible.
 