
AMD Patents Chiplet Architecture for Radeon GPUs

Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
Big question is, will it cost more.
The main reason for doing this is to reduce costs. So no. The interposer will obviously not be cheap, but given sufficient production volume the cost of that will make little difference compared to the savings of making smaller dice. See my calculations a few posts up for a rough estimation.
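For anyone who wants to play with the numbers, here is a rough Python sketch of the yield argument. It assumes a simple Poisson yield model and the usual gross-dies-per-wafer approximation. The wafer cost, defect density and die sizes are made-up placeholders, not AMD or TSMC figures, and interposer and packaging costs are left out entirely.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies per wafer (ignores scribe lines and edge exclusion)."""
    radius = wafer_diameter_mm / 2.0
    whole_area_term = math.pi * radius ** 2 / die_area_mm2
    edge_loss_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(whole_area_term - edge_loss_term)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


def cost_per_good_die(die_area_mm2, wafer_cost, defects_per_cm2):
    """Wafer cost spread over the dies expected to be defect-free."""
    good_dies = dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_cm2)
    return wafer_cost / good_dies


# Hypothetical inputs, chosen only to illustrate the trend.
WAFER_COST_USD = 10_000.0
DEFECTS_PER_CM2 = 0.1

monolithic_cost = cost_per_good_die(500.0, WAFER_COST_USD, DEFECTS_PER_CM2)
chiplet_cost = 4 * cost_per_good_die(125.0, WAFER_COST_USD, DEFECTS_PER_CM2)

print(f"One 500 mm^2 die:    ${monolithic_cost:.2f} of silicon per good die")
print(f"Four 125 mm^2 dies:  ${chiplet_cost:.2f} of silicon per good set")
```

The gap between the two figures is the budget available for the interposer and packaging before the chiplet approach stops paying off, under these assumed inputs.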
 
Joined
Feb 3, 2017
Messages
3,481 (1.32/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
Cost and chiplet design overhead are also a function of chiplet size and count.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
Cost and chiplet design overhead are also a function of chiplet size and count.
True. Designing a cutting-edge chip and getting it mass produced does after all cost from hundreds of millions of USD to billions of USD. If a chiplet design allows them to go from, say, small-medium-large-XL monolithic chips to small+medium chiplets in various combinations, that is a massive R&D and manufacturing savings even when accounting for the R&D needed for interposer development, advanced packaging technologies, etc.
 
Joined
Jan 14, 2019
Messages
9,897 (5.12/day)
Location
Midlands, UK
System Name Nebulon-B Mk. 4
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance EXPO DDR5-6000
Video Card(s) Sapphire Pulse Radeon RX 7800 XT
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2, 4 + 8 TB Seagate Barracuda 3.5"
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Windows 10 Pro
Benchmark Scores Cinebench R23 single-core: 1,800, multi-core: 18,000. Superposition 1080p Extreme: 9,900.
Judging by the 20 °C difference between edge temp and hotspot temp on my 5700 XT under load, I imagine it must be easier to cool a bunch of smaller dies than a single big one.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
Judging by the 20 °C difference between edge temp and hotspot temp on my 5700 XT under load, I imagine it must be easier to cool a bunch of smaller dies than a single big one.
That depends. Getting a single cold plate to make ideal contact with a collection of individual surfaces will always be more difficult than having it make contact with a single surface. Also, edge/hotspot temperature deltas like that are likely found on all high-powered chips; it's just rare for them to have a thermal reporting system that lets users see both. A smaller die is of course likely to pull less power and might have a smaller distance from edge to hotspot, but the difference isn't likely to be huge. The portion of the chip consuming the power will always be hotter than the surrounding regions.
 
Joined
Jul 8, 2019
Messages
169 (0.10/day)
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...
Hm? There are exactly the same number of chiplets in Ryzen 5000 as in Ryzen 3000. They reduced the number of CCXes (Core Complexes) per CCD (the chiplet, Core Complex Die) from 2 to 1 by doubling the number of cores per CCX, but there are still two CCDs + an IOD in anything with >8 cores and one CCD + an IOD in anything with ≤8 cores.
 
Joined
Jul 8, 2019
Messages
169 (0.10/day)
Hm? There are exactly the same number of chiplets in Ryzen 5000 as in Ryzen 3000. They reduced the number of CCXes (Core Complexes) per CCD (the chiplet, Core Complex Die) from 2 to 1 by doubling the number of cores per CCX, but there are still two CCDs + an IOD in anything with >8 cores and one CCD + an IOD in anything with ≤8 cores.

You are right. I thought they'd ditched the whole Infinity Fabric shtick and made a unified die. They actually didn't.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
You are right. I thought they'd ditched the whole Infinity Fabric shtick and made a unified die. They actually didn't.
That's only the APUs. AMD aren't going back to monolithic dice for CPUs, likely not ever. The MCM approach gives them low production costs, high yields, great binning flexibility, easy configurability, and a heap of other advantages. And latency is much improved too, even if monolithic chips are still better in that regard.
 
Joined
Apr 24, 2020
Messages
2,563 (1.75/day)
So they decreased the chiplet dependency on the new Ryzen 5000, and they want to introduce a similar thing to GPUs? Why? Haven't they learned about latency...

The Ryzen 5000 I/O die only has 50 GB/s to each chiplet. GPUs need 500 GB/s (10x the CPU bandwidth), but are allowed to have higher latency. The Infinity Fabric on AMD's CPUs needs major changes to be effective in a GPU architecture.

Nvidia's NVLink is closer to a proper chiplet design than anything AMD has made in their GPUs so far. The AMD MI100 Infinity Fabric link system is the right kind of approach, but only reaches 80 GB/s. Nvidia is pushing 600 GB/s with the latest generation of NVLink.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
The Ryzen 5000 I/O die only has 50 GB/s to each chiplet. GPUs need 500 GB/s (10x the CPU bandwidth), but are allowed to have higher latency. The Infinity Fabric on AMD's CPUs needs major changes to be effective in a GPU architecture.

Nvidia's NVLink is closer to a proper chiplet design than anything AMD has made in their GPUs so far. The AMD MI100 Infinity Fabric link system is the right kind of approach, but only reaches 80 GB/s. Nvidia is pushing 600 GB/s with the latest generation of NVLink.
IF can scale out much, much wider than its implementation in Ryzen though, so aggregate bandwidth shouldn't be a problem. But still, there's no mention of IF in the patent, so they might be using some other bus for this (or just keeping the patent intentionally vague, obviously).
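To put the "scale out wider" point in numbers, here is a back-of-the-envelope Python sketch using the 50 GB/s per-link and 500 GB/s target figures quoted above. It assumes bandwidth adds up linearly across parallel links, which real fabrics only approximate, and the function name is just for illustration.

```python
import math


def links_needed(target_gb_per_s, per_link_gb_per_s):
    """Smallest number of parallel links whose combined bandwidth meets the target."""
    return math.ceil(target_gb_per_s / per_link_gb_per_s)


PER_LINK_GB_PER_S = 50.0   # per-chiplet figure quoted in the thread
TARGET_GB_PER_S = 500.0    # GPU-class figure quoted in the thread

count = links_needed(TARGET_GB_PER_S, PER_LINK_GB_PER_S)
print(f"{count} links of {PER_LINK_GB_PER_S:.0f} GB/s "
      f"give {count * PER_LINK_GB_PER_S:.0f} GB/s aggregate")
```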
 
Joined
Feb 3, 2017
Messages
3,481 (1.32/day)
IF can scale out much, much wider than its implementation in Ryzen though, so aggregate bandwidth shouldn't be a problem. But still, there's no mention of IF in the patent, so they might be using some other bus for this (or just keeping the patent intentionally vague, obviously).
Sure, IF can scale. The problem isn't scalability; it is probably power at large bandwidth numbers :)
This is not unique to AMD either, Nvidia has the same problem with NVLink.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
Sure, IF can scale. The problem isn't scalability; it is probably power at large bandwidth numbers :)
This is not unique to AMD either, Nvidia has the same problem with NVLink.
Oh, absolutely. But given that AMD can handle a ton of IF links over relatively long distances through a PCB substrate in Threadripper with about 70W of power for those links + the IOD (including 8 memory controllers and a heap of PCIe), implementing a wide link setup through a silicon interposer for GPUs ought to be manageable in terms of power if we consider a total package power envelope of 250-300W.
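A quick way to sanity-check the power side is to multiply bandwidth by an energy-per-bit figure. The Python sketch below does that. The two pJ/bit values are placeholders picked only to show how the result scales; they are not measured figures for Infinity Fabric, NVLink, or any particular packaging technology.

```python
def link_power_watts(bandwidth_gb_per_s, picojoules_per_bit):
    """Power drawn by a link moving the given bandwidth at a fixed energy cost per bit."""
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return bits_per_second * picojoules_per_bit * 1e-12


BANDWIDTH_GB_PER_S = 500.0          # GPU-class figure quoted in the thread

# Hypothetical energy costs, for illustration only.
ASSUMED_PJ_PER_BIT = {
    "longer traces through an organic substrate": 2.0,
    "short traces through a silicon interposer": 0.5,
}

for label, pj_per_bit in ASSUMED_PJ_PER_BIT.items():
    watts = link_power_watts(BANDWIDTH_GB_PER_S, pj_per_bit)
    print(f"{label}: about {watts:.1f} W at {BANDWIDTH_GB_PER_S:.0f} GB/s")
```

Swapping in real pJ/bit numbers for a given interconnect, if you have them, shows how much of a 250-300W package budget the links would take.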
 
Joined
Mar 21, 2016
Messages
2,198 (0.74/day)
Chimp innovation at its finest... so advanced you'd swear it's bananas! This chimp copies that chimp who makes those chimps go chimpanzee OMG bananas over it!!!
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
So, AMD just copies every step Intel has already done, or planned to do. Yawn.
Ah, yes, because nobody talked about MCM GPUs before Intel ...

My guess: AMD, Nvidia and Intel have all been at work on this tech for 3+ years.
 
Joined
Mar 10, 2010
Messages
11,878 (2.30/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
So, AMD just copies every step Intel has already done, or planned to do. Yawn.
In what way? AMD are laying out a path to their version of a multi-die GPU, and Intel sure as shit were not doing multi-die GPUs before AMD.
Ponte Vecchio was for servers, not consumers.
It's an interesting angle, actually. From my reading, you have master and slave dies and massive bandwidth, but essentially one tile to rule them all and an I/O die in the interposer.

The first GPU does all the scheduling and the first vertex pass on the math, then hands out work. There may be an efficiency hit on the first designs with few tiles, but if it scales it could serve well as a path forward and be really effective across 8 or more tiles.
 
Joined
May 2, 2017
Messages
7,762 (3.04/day)
Location
Back in Norway
This solution is based on 12-inch wafers, and in the future the industry will move to 18-inch wafers, which means higher utilization of the fab, better pricing per wafer, and eventually better prices for the end user. Basically, many more dies per wafer. This die-per-wafer calculator shows the various options per wafer size: https://anysilicon.com/die-per-wafer-formula-free-calculators/
Hasn't that been "in the future" for like two decades now, with no real progress being made? Considering the massive fab expansions in the works currently (planned to be ready for mass production between this year and 2025-27), all of which are 300mm, it's going to be a long, long time until 450mm wafers take over high end fabs.
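For reference, the "more dies per wafer" claim can be checked with the common gross-dies-per-wafer approximation. The Python sketch below compares 300mm and 450mm wafers for a hypothetical 100 mm² die. It ignores scribe lines, edge exclusion and yield, and it may not match the exact formula the linked calculator uses.

```python
import math


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm):
    """Approximate die count: wafer area over die area, minus a term for partial edge dies."""
    radius = wafer_diameter_mm / 2.0
    whole_area_term = math.pi * radius ** 2 / die_area_mm2
    edge_loss_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(whole_area_term - edge_loss_term)


DIE_AREA_MM2 = 100.0  # hypothetical die size, for illustration only

dies_300 = gross_dies_per_wafer(DIE_AREA_MM2, 300.0)
dies_450 = gross_dies_per_wafer(DIE_AREA_MM2, 450.0)

print(f"300mm wafer: {dies_300} gross dies")
print(f"450mm wafer: {dies_450} gross dies ({dies_450 / dies_300:.2f}x)")
```

The wafer area grows by 2.25x, and the die count grows slightly more than that because proportionally fewer dies are lost at the edge.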
 