
AMD Files Patent for Chiplet Machine Learning Accelerator to be Paired With GPU, Cache Chiplets

Raevenlord

News Editor
AMD has filed a patent describing an MLA (Machine Learning Accelerator) chiplet design that can be paired with a GPU unit (such as RDNA 3) and a cache unit (likely a GPU-excised version of the Infinity Cache design that debuted with RDNA 2) to create what AMD is calling an "APD" (Accelerated Processing Device). The design would thus enable AMD to create a chiplet-based machine learning accelerator whose sole function would be to accelerate machine learning - specifically, matrix multiplication. This would enable capabilities not unlike those available through NVIDIA's Tensor cores.
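For context, the workload in question decomposes neatly into small fixed-size tiles, which is what makes it such a good fit for a dedicated unit. A minimal NumPy sketch of the tiling idea (illustrative only - the patent doesn't describe the actual datapath):

```python
import numpy as np

# Illustrative sketch only - not AMD's implementation. Dense matrix
# multiplication decomposes into small fixed-size tiles; tensor-style
# hardware computes many of these tile products in parallel.
def matmul_tiled(A, B, tile=4):
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):
        for j in range(0, N, tile):
            for k in range(0, K, tile):
                # Each small product below is the kind of fixed-size
                # operation a matrix engine executes in one pass.
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

A = np.random.rand(16, 16).astype(np.float32)
B = np.random.rand(16, 16).astype(np.float32)
assert np.allclose(matmul_tiled(A, B), A @ B, atol=1e-4)
```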

This could give AMD a modular way to add machine-learning capabilities to several of its designs through the inclusion of such a chiplet, and might be AMD's way of achieving hardware acceleration of a DLSS-like feature. It would avoid the shortcomings of implementing the same logic in the GPU die itself - an increase in overall die area, and with it increased cost and reduced yields - while at the same time enabling AMD to deploy it in products other than GPUs. The patent describes the possibility of different manufacturing technologies being employed across the chiplet-based design - harking back to the I/O dies in Ryzen CPUs, manufactured on a 12 nm process rather than the 7 nm one used for the core chiplets. The patent also describes acceleration of cache requests from the GPU die to the cache chiplet, and on-the-fly usage of the latter either as actual cache or as directly-addressable memory.
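That last dual-use point is essentially a mode switch on the same on-chiplet SRAM: hardware-managed cache in one configuration, software-managed local memory in the other. A toy Python sketch of the concept - all names here are hypothetical, as the patent doesn't define an interface:

```python
# Toy model of the dual-mode cache chiplet described in the patent: one
# on-chiplet SRAM pool serves either as hardware-managed cache or as
# directly-addressable memory. All names are hypothetical illustrations.
class CacheChiplet:
    def __init__(self, dram):
        self.dram = dram        # backing store: addr -> value
        self.cache = {}         # cache mode: hardware-managed copies
        self.scratch = {}       # scratchpad mode: flat local memory
        self.mode = "cache"

    def set_mode(self, mode):
        assert mode in ("cache", "scratchpad")
        self.cache.clear()      # repurposing the SRAM invalidates contents
        self.mode = mode

    def read(self, addr):
        if self.mode == "scratchpad":
            # Directly addressable: software manages placement itself.
            return self.scratch.get(addr, 0)
        if addr not in self.cache:              # miss: fill from DRAM
            self.cache[addr] = self.dram.get(addr, 0)
        return self.cache[addr]

chip = CacheChiplet(dram={0x100: 42})
print(chip.read(0x100))         # 42, filled into cache on first access
chip.set_mode("scratchpad")
chip.scratch[0x0] = 7           # software writes local memory directly
print(chip.read(0x0))           # 7
```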



View at TechPowerUp Main Site
 
LoL. Artificial intellect in a GPU? Will it be possible to talk to next-gen graphics cards in plain street language?
 
Interesting. Clearly GPUs are being left in the dust by dedicated accelerators for certain computations. But dedicated accelerators are inflexible, while GPUs are fully programmable and can run just about anything - and both have a major problem: memory bandwidth. This is a nice way of solving all of those problems.
 
With the purchase of Xilinx, I suspect AMD will start reducing the number of hardware-accelerated functions like encode/decode and machine learning, and will replace them with an FPGA section. Perfect for a chiplet implementation.
 
I'm loving this idea. I've been thinking about this for a while now. Add a little machine-learning hardware to an APU SoC or a GPU, and things could get really interesting for the general-use, gaming and GPU-computing markets alike. Like OGoc said, match this with FPGA tech and the potential becomes pretty obvious.
 

Chicklets Machine...
 
With the purchase of Xilinx, I suspect AMD will start reducing the number of hardware-accelerated functions like encode/decode and machine learning, and will replace them with an FPGA section. Perfect for a chiplet implementation.

Those Xilinx parts carry VLIW SIMD cores - probably more similarities to a GPU than you might think.

Yeah, there are some LUTs on those FPGAs, but the actual computational girth comes from these babies: https://www.xilinx.com/support/documentation/white_papers/wp506-ai-engine.pdf
 
I'm sorry, I just don't want them to add yet another reason to raise the prices of these already far too expensive graphics cards. I have 0 excitement for this.
 
I would love to see AMD's and Intel's lists of future secret-sauce chiplets. This isn't unexpected: what Arm, Apple and many others do with task-specific hardware, x86 will leverage with more heavy-hitting but adaptable circuitry.
Would be nice if AMD got a unified API for it too.
 
You get a chiplet, and you get a chiplet, and you get a chiplet! Chiplets for everyone! WOO!

In all seriousness, this just sounds like AMD doing more of the same thing they've been working towards for years. You have an I/O chiplet, and a CPU chiplet, and soon we'll have GPU chiplets and AI accelerator chiplets. We've already seen that this can scale well, so this should be an exciting prospect for future products. An APU with one of all of the above would be one hell of a chip.
 
Apparently the first attempt at an RT implementation didn't go well, and AMD is trying to solve it with another "glue". With another bump in cache size and a lean towards function-agnostic hardware, I can see wider adoption beyond just RT in gaming.
 
Apparently the first attempt at an RT implementation didn't go well, and AMD is trying to solve it with another "glue". With another bump in cache size and a lean towards function-agnostic hardware, I can see wider adoption beyond just RT in gaming.

AMD has had so many patents over the years that I've basically stopped paying attention to patents in general.

Remember "Super ALUs" ?? Yeah, they're not around. AMD decided against them for whatever reason. Maybe it wasn't as good as other techniques they got, or maybe they ran some simulations and it could have made things worse. Just wait for the whitepapers to come out.
 
The next AMD meme

MOAR CHIPLUTZ
 
Apparently the first attempt at an RT implementation didn't go well, and AMD is trying to solve it with another "glue".
This has zero to do with RT.
 
I'm sorry, I just don't want them to add yet another reason to raise the prices of these already far too expensive graphics cards. I have 0 excitement for this.
What are you talking about? This actually makes shit cheaper, because you're not making one big fat die like NVIDIA - and eventually you're going to need chiplets anyway, because you can't keep shrinking forever. AMD is just ahead of everyone and has been working towards this for years. You have nothing to worry about lol.

AMD has had so many patents over the years that I've basically stopped paying attention to patents in general.

Remember "Super ALUs" ?? Yeah, they're not around. AMD decided against them for whatever reason. Maybe it wasn't as good as other techniques they got, or maybe they ran some simulations and it could have made things worse. Just wait for the whitepapers to come out.
Clearly this is totally different and fits exactly into their future game plan. Not all patents are the same; some do have big implications lol.
 
What are you talking about? This actually makes shit cheaper, because you're not making one big fat die like NVIDIA - and eventually you're going to need chiplets anyway, because you can't keep shrinking forever. AMD is just ahead of everyone and has been working towards this for years. You have nothing to worry about lol.


Clearly this is totally different and fits exactly into their future game plan. Not all patents are the same; some do have big implications lol.
I agree, but I think the point of chiplets is that they're one of the few ways left to make cutting-edge nodes financially viable, and as time goes by this is only going to escalate. By 2 nm, EUV processing and rising mask costs will push the cost of a complete wafer up considerably, and advanced packaging technology isn't cheap packaging technology either. AMD were ahead of the game, but EMIB rules some of that gain out. Interesting times, what with others still stuck on monolithic designs - Apple, for example.
 
Ohhh... this sounds quite promising! Shame most machine-learning frameworks are written for CUDA. I hope that if this comes out, the bigger frameworks like TensorFlow and PyTorch make use of it.
 
Ohhh... this sounds quite promising! Shame most machine-learning frameworks are written for CUDA. I hope that if this comes out, the bigger frameworks like TensorFlow and PyTorch make use of it.
TensorFlow was made by Google for their own TPU hardware; it works on anyone's stuff.
https://github.com/ROCmSoftwarePlatform/tensorflow-upstream It's been supported via ROCm for a couple of years now.
PyTorch has been supported since ROCm 3.7; 4.0.1 is current. https://github.com/aieater/rocm_pytorch_informations

NVIDIA's stuff is definitely a bit more plug-and-play, and AMD's engineering support is only now ramping up; they have a long way to go to catch up.

There are a lot of interesting accelerators on the market now; it's a fun time.
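For anyone wanting to try it: ROCm builds of PyTorch expose the HIP backend through the familiar torch.cuda namespace, so CUDA-targeted scripts generally run unchanged. A quick sanity check, assuming a working ROCm install:

```python
import torch

# On a ROCm build of PyTorch, the HIP backend is exposed through the usual
# torch.cuda namespace, so CUDA-targeted scripts generally run unchanged.
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))        # reports the AMD GPU on ROCm
    x = torch.randn(1024, 1024, device="cuda")  # allocates on the AMD GPU
    y = x @ x                                    # matmul runs on the GPU
    print(y.shape)
else:
    print("No ROCm/CUDA device visible to this PyTorch build")
```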
 
This has zero to do with RT.

Bummer, I thought matrix multiplication sounded like a complex version of Fused Multiply-Add :D

The design would thus enable AMD to create a chiplet-based machine learning accelerator whose sole function would be to accelerate machine learning - specifically, matrix multiplication
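For what it's worth, that intuition is pretty close: every element of a matrix product is just a chain of fused multiply-adds over one row and one column. A plain-Python sketch:

```python
# A matrix multiply really is "just" FMAs: each output element is a
# chain of fused multiply-adds over one row of A and one column of B.
def matmul_fma(A, B):
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            acc = 0.0
            for k in range(K):
                acc = acc + A[i][k] * B[k][j]   # one FMA per step
            C[i][j] = acc
    return C

print(matmul_fma([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```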
 
Years ago Intel created the first chiplets - why didn't they patent the idea then? Maybe they didn't because of that do-nothing CEO they had? (I am referring to the CEO who was getting his noodle wet with an employee.)
 