
AMD Instinct MI200 to Launch This Year with MCM Design

AleksandarK

Staff member
Joined
Aug 19, 2017
Messages
923 (0.72/day)
AMD is preparing the next generation of its compute-oriented flagship graphics card design, the Instinct MI200 GPU. It is the card of choice for the exascale Frontier supercomputer, which is expected to debut later this year at the Oak Ridge Leadership Computing Facility. With the supercomputer planned for the end of this year, the Instinct MI200 should launch either a bit before it or alongside it. Frontier is supposed to bring together AMD's next-generation Trento EPYC CPUs with Instinct MI200 GPU compute accelerators, and it seems AMD will use some new technologies to build it. While we do not know what the Trento EPYC CPUs will look like, the Instinct MI200 GPU appears to feature a multi-chip-module (MCM) design with the new CDNA 2 GPU architecture. With this being the only information about the GPU so far, we will have to wait a bit to find out more details.


 
Joined
May 2, 2017
Messages
3,654 (2.61/day)
Location
Norwegian, currently in Lund, Sweden
Processor AMD Ryzen 5 1600X
Motherboard Biostar X370GTN
Cooling Custom CPU+GPU water loop
Memory 16GB G.Skill TridentZ DDR4-3200 C16
Video Card(s) AMD R9 Fury X
Storage 500GB 960 Evo (OS ++), 500GB 850 Evo (Games)
Display(s) Dell U2711
Case NZXT H200i
Power Supply EVGA Supernova G2 750W
Mouse Logitech G602
Keyboard Lenovo Compact Keyboard with Trackpoint
Software Windows 10 Pro
Hm, I wonder what the specs of this will be. The MI100 does 23.1TF FP32/46.2TF Matrix FP32/184.6TF FP16/INT8/INT4 (I'm really curious how they manage that 4x FP16 rate in the same hardware!) with 120CUs at 1.5GHz. Logic says this will then be two 120CU dice at the same clock speed, though given the 300W TBP of the MI100 (let's say 260W of that is GPU power, with VRM losses and HBM2 accounting for the rest), cooling two of those dice in an MCM layout will be very, very difficult. There are two reasonable solutions: wider, lower-clocked dice (unlikely due to die size, cost and yields; Arcturus is already reportedly 750mm²!) or a significant jump in efficiency. My money's on the latter (not that RDNA developments are necessarily transferable to CDNA, but AMD's GPU division is definitely on an efficiency roll these days), but it'll nonetheless be very interesting to see what this ends up being.
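For anyone who wants to check the MI100 figure above, here is a rough sketch of the usual peak-throughput arithmetic. It assumes 64 FP32 lanes per CU and 2 FLOPs per lane per clock (one fused multiply-add); neither number is stated in the post, and the function name is just for illustration. The "two 120CU dice" line is the speculation from the post, not a confirmed MI200 spec.

```python
# Napkin math for peak FP32 throughput of a CU-based GPU.
# Assumptions (not from the post): 64 FP32 lanes per CU,
# 2 FLOPs per lane per clock (a fused multiply-add counts as two).

def peak_fp32_tflops(compute_units, clock_ghz, lanes_per_cu=64, flops_per_lane=2):
    """Return theoretical peak FP32 throughput in TFLOPS."""
    gflops = compute_units * lanes_per_cu * flops_per_lane * clock_ghz
    return gflops / 1000.0

# MI100 as described in the post: 120 CUs at roughly 1.5 GHz.
mi100 = peak_fp32_tflops(120, 1.5)

# Speculative MCM part: two such dice at the same clock.
mcm_guess = 2 * mi100

print(f"MI100 estimate: {mi100:.2f} TF FP32")
print(f"2 x 120 CU guess: {mcm_guess:.2f} TF FP32")
```

With the rounded 1.5GHz clock this lands just under the quoted 23.1TF, which is what you would expect if the real clock is slightly above 1.5GHz.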
 
Joined
Sep 1, 2020
Messages
272 (1.52/day)
Location
Bulgaria
5nm? When we're talking about a supercomputer, and especially a supercomputer for the USA, there are no manufacturing shortages.
 
Joined
May 2, 2017
Messages
3,654 (2.61/day)
Location
Norwegian, currently in Lund, Sweden
5nm? When we're talking about a supercomputer, and especially a supercomputer for the USA, there are no manufacturing shortages.
Sounds likely. TSMC has had 5nm in mass production since around late summer 2020, so it stands to reason that a supercomputer aiming for a late 2021 launch should be able to make use of that. To hit Frontier's target of 1.5EFlops (I'm assuming that's FP32?) at a theoretical 2x MI100 speed, that's 16234 GPUs. Some of that compute number will no doubt come from the CPUs, so let's say 15 000 GPUs for an easy, round number. Assuming that TSMC 5nm lives up to its promised 45% area reduction over 7nm and that the layout is identical to Arcturus, and using some napkin math based on the measurements of published Arcturus die shots (750mm², 1:1.24 aspect ratio), we have two ~410mm² dice per MI200, each measuring ~18.2*22.5mm. Plugging that into a wafer calculator gives us 256 dice per wafer, or 128 2-die MCM GPUs per wafer, though accounting for yields (assuming TSMC 5nm matches their 7nm at 0.09 defects/cm²) brings that down to 105. That means they need at least 143 wafers to outfit this supercomputer with GPUs. In the grand scheme of things, that isn't a lot, not when most high-end fabs are on the scale of >20 000 wafer starts/month. Producing the AICs is likely much more of a challenge than making the silicon itself.
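The napkin math above can be retraced in a few lines. This is only a sketch of the arithmetic under the post's own assumptions. The 16234 figure appears to correspond to two dice at the MI100's 46.2TF matrix FP32 rate (92.4TF per GPU), which is an inference from the numbers rather than something stated. The 105 good GPUs per wafer is taken as given from the wafer calculator result and is not recomputed here. All variable names are made up for illustration.

```python
import math

# Inputs taken from the thread.
TARGET_EFLOPS = 1.5        # Frontier target
TF_PER_DIE = 46.2          # MI100 matrix FP32 rate (inferred basis for 16234)
DICE_PER_GPU = 2           # speculative 2-die MCM
ARCTURUS_AREA_MM2 = 750.0  # reported Arcturus die size
AREA_REDUCTION = 0.45      # promised 5nm area reduction over 7nm
ASPECT_RATIO = 1.24        # long side / short side, from die shots
GOOD_GPUS_PER_WAFER = 105  # post's wafer calculator result after yield
ROUND_GPU_COUNT = 15_000   # post's round number after CPU contribution

# GPUs needed to reach the target (1 EFLOPS = 1e6 TFLOPS).
tf_per_gpu = TF_PER_DIE * DICE_PER_GPU
gpus_needed = math.ceil(TARGET_EFLOPS * 1e6 / tf_per_gpu)

# Shrunk die area and dimensions for the given aspect ratio.
die_area = ARCTURUS_AREA_MM2 * (1 - AREA_REDUCTION)
die_short = math.sqrt(die_area / ASPECT_RATIO)
die_long = die_short * ASPECT_RATIO

# Wafers needed for the rounded GPU count.
wafers = math.ceil(ROUND_GPU_COUNT / GOOD_GPUS_PER_WAFER)

print(f"GPUs needed: {gpus_needed}")
print(f"Die: {die_area:.1f} mm², about {die_short:.1f} x {die_long:.1f} mm")
print(f"Wafers needed: {wafers}")
```

Swapping in a different per-die rate or yield figure changes the wafer count proportionally, but it stays small next to the >20 000 wafer starts/month mentioned above.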
 
Joined
Mar 21, 2016
Messages
910 (0.50/day)
This is kind of how I envisioned things would shape up for AMD since Ryzen. I knew Ryzen had to be a top priority, along with getting debt under control, and that Radeon would have a lull period but would gradually become more competitive with Nvidia over time. If anything, they've been a bit more successful than I'd have thought 3 or 4 years back, because Intel has had some real roadblocks and stumbles that have actually helped AMD's situation. I really can't wait to see what will transpire with MCM designs for GPUs from AMD as well as Nvidia, and even Intel for that matter, which I can see taking a different approach than the others, though that's hard to say.
 