
Pat Gelsinger Says 3D Stacked Cache Tech Coming to Intel

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,164 (7.57/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Intel CEO Pat Gelsinger, in the Q&A session of InnovatiON 2023 Day 1, confirmed that the company is developing 3D-stacked cache technology for its processors. The technology involves expanding the on-die last-level cache (L3 cache) of a processor with an additional SRAM die physically stacked on top, and bonded with the cache's high-bandwidth data fabric. The stacked cache operates at the same speed as the on-die cache, and so the combined cache size is visible to software as a single contiguous addressable block of cache memory.

AMD has used 3D-stacked cache to good effect on its processors. On client processors such as the Ryzen X3D series, the cache provides significant gaming performance uplifts, as the larger L3 cache makes more of the game's rendering data immediately accessible to the CPU cores; on server processors such as EPYC "Milan-X" and "Genoa-X," the added cache provides significant uplifts to memory-intensive compute workloads. Intel's approach to 3D-stacked cache will be different at the hardware level compared to AMD's, Gelsinger stated in his response. AMD's tech has been collaboratively developed with TSMC, and hinges on a TSMC-made SoIC packaging tech that facilitates high-density die-to-die wiring between the CCD and cache chiplet. Intel uses its own fabs for processor dies, and will have to use its own IP.
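
To illustrate why a larger L3 helps when a workload's hot data slightly exceeds the base cache, here is a toy sketch in Python. It is not a model of any real CPU: the cache is a fully associative LRU, the sizes are abstract units scaled to a 32 MB base L3 plus a 64 MB stacked die (the Ryzen 7 5800X3D's configuration), and the 64-unit working set, the uniform random access pattern, and the names `lru_hit_rate` and `make_trace` are all assumptions made for the example.

```python
from collections import OrderedDict
import random


def lru_hit_rate(capacity, accesses):
    """Return the fraction of accesses that hit a fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(accesses)


def make_trace(working_set, length, seed=0):
    """Hypothetical workload: uniform random accesses over `working_set` units."""
    rng = random.Random(seed)
    return [rng.randrange(working_set) for _ in range(length)]


if __name__ == "__main__":
    trace = make_trace(working_set=64, length=20_000)
    base = lru_hit_rate(32, trace)          # base L3 only (assumed 32 units)
    stacked = lru_hit_rate(32 + 64, trace)  # base + stacked die (assumed 96 units)
    print(f"base hit rate:    {base:.3f}")
    print(f"stacked hit rate: {stacked:.3f}")
```

In this toy setup the working set no longer fits in the base cache but fits comfortably once the stacked capacity is added, so nearly every access after warm-up becomes a hit. Real games have far less uniform access patterns, so actual gains vary by title.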



"When you reference V-Cache, you're talking about a very specific technology that TSMC does with some of its customers as well. Obviously, we're doing that differently in our composition, right? And that particular type of technology isn't something that's part of Meteor Lake, but in our roadmap, you're seeing the idea of 3D silicon where we'll have cache on one die, and we'll have CPU compute on the stacked die on top of it, and obviously using EMIB that Foveros we'll be able to compose different capabilities," Gelsinger said.

"We feel very good that we have advanced capabilities for next-generation memory architectures, advantages for 3D stacking, for both little die, as well as for very big packages for AI and high-performance servers as well. So we have a full breadth of those technologies. We'll be using those for our products, as well as presenting it to the Foundry (IFS) customers as well," he added.

Intel recently provided an architecture deep-dive into its upcoming "Meteor Lake" client processor, in which its Foveros packaging tech and tile-to-tile interconnects allow the various tiles (chiplets) to work like a single cohesive piece of silicon. In particular, Intel appears to have solved the latency issues of having the iGPU, CPU cores, and memory controllers on separate tiles.

 

Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,051 (4.65/day)
Location
Kepler-186f
if you can't beat'em...

what is the point of innovation these days when people just copy you lol
 
Joined
May 19, 2009
Messages
1,858 (0.33/day)
Location
Latvia
System Name Personal \\ Work - HP EliteBook 840 G6
Processor 7700X \\ i7-8565U
Motherboard Asrock X670E PG Lightning
Cooling Noctua DH-15
Memory G.SKILL Trident Z5 RGB Black 32GB 6000MHz CL36 \\ 16GB DDR4-2400
Video Card(s) ASUS RoG Strix 1070 Ti \\ Intel UHD Graphics 620
Storage 2x KC3000 2TB, Samsung 970 EVO 512GB \\ OEM 256GB NVMe SSD
Display(s) BenQ XL2411Z \\ FullHD + 2x HP Z24i external screens via docking station
Case Fractal Design Define Arc Midi R2 with window
Audio Device(s) Realtek ALC1150 with Logitech Z533
Power Supply Corsair AX860i
Mouse Logitech G502
Keyboard Corsair K55 RGB PRO
Software Windows 11 \\ Windows 10
if you can't beat'em...

what is the point of innovation these days when people just copy you lol

So why shouldn't someone use what works great? Why reinvent the wheel repeatedly?
 

Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,051 (4.65/day)
Location
Kepler-186f
Well smarty pants 3D stacked cache is TSMC's invention not AMD's.

Intel's approach is different BTW, and who cares, if it improves the product and improves efficiency like in AMD's case, bring it on.

I didn't know this, cool. Surprised it's not going to be a part of Meteor Lake; the writing was on the wall, Intel...

So why shouldn't someone use what works great? Why reinvent the wheel repeatedly?

Because of money. This is why Apple and Nvidia are walled gardens, so no one can use their inventions. As much as I love AMD, FSR3 Frame Gen is never going to match the dedicated tensor core DLSS 3.5 and DLSS4 frame gen in sheer quality, just no way no how. That's how you make bank.

TSMC is, I hope, making money off its 3D cache invention; if not, then they are a foolish company.
 
Joined
Mar 7, 2011
Messages
4,501 (0.90/day)
Well smarty pants 3D stacked cache is TSMC's invention not AMD's.

Intel's approach is different BTW, and who cares, if it improves the product and improves efficiency like in AMD's case, bring it on.
U.S. patent application number 17/129739 was filed with the patent office on 2021-12-02 for stacked dies for machine learning accelerator. This patent application is currently assigned to Advanced Micro Devices, Inc. The applicant listed for this patent is Advanced Micro Devices, Inc. So it's clearly owned by AMD, with just the manufacturing side being done by TSMC.

Also, one of the people on that patent is my classmate, and he currently seems to be employed by Intel.

There are older (now expired) patents from T.I. and Philips as well:
 

wolf

Better Than Native
Joined
May 7, 2007
Messages
8,147 (1.28/day)
System Name MightyX
Processor Ryzen 5800X3D
Motherboard Gigabyte X570 I Aorus Pro WiFi
Cooling Scythe Fuma 2
Memory 32GB DDR4 3600 CL16
Video Card(s) Asus TUF RTX3080 Deshrouded
Storage WD Black SN850X 2TB
Display(s) LG 42C2 4K OLED
Case Coolermaster NR200P
Audio Device(s) LG SN5Y / Focal Clear
Power Supply Corsair SF750 Platinum
Mouse Corsair Dark Core RBG Pro SE
Keyboard Glorious GMMK Compact w/pudding
VR HMD Meta Quest 3
Software case populated with Artic P12's
Benchmark Scores 4k120 OLED Gsync bliss
Nice, perhaps when it comes time to replace my 5800X3D-based system, both sides will have V-Cache-style, gaming-focused chips to choose from.
 
Joined
Jun 14, 2020
Messages
3,275 (2.05/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
if you can't beat'em...

what is the point of innovation these days when people just copy you lol
Just for clarification, 3d cache is 100% TSMC's, nothing to do with AMD. They offer that to every single one of their clients

Edit: OK, someone beat me to it.

U.S. patent application number 17/129739 was filed with the patent office on 2021-12-02 for stacked dies for machine learning accelerator. This patent application is currently assigned to Advanced Micro Devices, Inc. The applicant listed for this patent is Advanced Micro Devices, Inc. So it's clearly owned by AMD, with just the manufacturing side being done by TSMC.

Also, one of the people on that patent is my classmate, and he currently seems to be employed by Intel.

There are older (now expired) patents from T.I. and Philips as well:
No, just no.

TSMC even owns the name 3d.

 

TheLostSwede

News Editor
Joined
Nov 11, 2004
Messages
17,486 (2.40/day)
Location
Sweden
System Name Overlord Mk MLI
Processor AMD Ryzen 7 7800X3D
Motherboard Gigabyte X670E Aorus Master
Cooling Noctua NH-D15 SE with offsets
Memory 32GB Team T-Create Expert DDR5 6000 MHz @ CL30-34-34-68
Video Card(s) Gainward GeForce RTX 4080 Phantom GS
Storage 1TB Solidigm P44 Pro, 2 TB Corsair MP600 Pro, 2TB Kingston KC3000
Display(s) Acer XV272K LVbmiipruzx 4K@160Hz
Case Fractal Design Torrent Compact
Audio Device(s) Corsair Virtuoso SE
Power Supply be quiet! Pure Power 12 M 850 W
Mouse Logitech G502 Lightspeed
Keyboard Corsair K70 Max
Software Windows 10 Pro
Benchmark Scores https://valid.x86.fr/yfsd9w
Just for clarification, 3d cache is 100% TSMC's, nothing to do with AMD. They offer that to every single one of their clients

Edit: OK, someone beat me to it.


No, just no.

TSMC even owns the name 3d.

Reading comprehension is a thing. Nowhere on TSMC's site does it say 3D Cache.
TSMC owns the packaging technology that enabled AMD to create 3D V-Cache, but AMD owns the actual 3D V-Cache design.
The two clearly collaborated to make it work, based on TSMC's packaging technology, as without it, AMD couldn't have made it.
The two are not the same though and TSMC doesn't actually make any "products" as such, so why would they be making cache wafers on their own and then sell it to someone who might want it? It makes zero sense, as that's not how TSMC operates. Then they would also be making Arm based chips or whatever and selling it to whoever wants them and they do not.
TSMC is a foundry, they design tech that their customers can leverage, they don't design chips or parts of chips.
 
Joined
Jun 14, 2020
Messages
3,275 (2.05/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Reading comprehension is a thing. Nowhere on TSMC's site does it say 3D Cache.
TSMC owns the packaging technology that enabled AMD to create 3D V-Cache, but AMD owns the actual 3D V-Cache design.
The two clearly collaborated to make it work, based on TSMC's packaging technology, as without it, AMD couldn't have made it.
The two are not the same though and TSMC doesn't actually make any "products" as such, so why would they be making cache wafers on their own and then sell it to someone who might want it? It makes zero sense, as that's not how TSMC operates. Then they would also be making Arm based chips or whatever and selling it to whoever wants them and they do not.
TSMC is a foundry, they design tech that their customers can leverage, they don't design chips or parts of chips.
I'm sorry, I never meant to say that tsmc designs the cache. I'm saying the whole 3d stacked tech is theirs. Any company can design a 3d stacked chip and send it to tsmc for production.
 
Joined
Feb 11, 2020
Messages
239 (0.14/day)
The best part about the stacked cache in the Ryzen 3D parts is the power reduction achieved.
 
Joined
May 3, 2018
Messages
2,881 (1.21/day)
I'm sorry, I never meant to say that tsmc designs the cache. I'm saying the whole 3d stacked tech is theirs. Any company can design a 3d stacked chip and send it to tsmc for production.
AMD's V-Cache is their simpler embodiment of TSMC's 3D fabric technology called SoIC. That's not up for debate. The 3D stacking is TSMC's invention, pure and simple.
Anandtech did a deep dive into this in 2021.
 
Joined
Oct 1, 2006
Messages
4,930 (0.75/day)
Location
Hong Kong
Processor Core i7-12700k
Motherboard Z690 Aero G D4
Cooling Custom loop water, 3x 420 Rad
Video Card(s) RX 7900 XTX Phantom Gaming
Storage Plextor M10P 2TB
Display(s) InnoCN 27M2V
Case Thermaltake Level 20 XT
Audio Device(s) Soundblaster AE-5 Plus
Power Supply FSP Aurum PT 1200W
Software Windows 11 Pro 64-bit
AMD's V-Cache is their simpler embodiment of TSMC's 3D fabric technology called SoIC. That's not up for debate. The 3D stacking is TSMC's invention, pure and simple.
Anandtech did a deep dive into this in 2021.
You make it sound like everything AMD did was trivial.
If it were so simple, Intel would have achieved this long ago, not waited a couple of generations and let AMD "steal" their thunder.
Intel is making a big deal out of this, so I would assume there were some major engineering hurdles that they overcame.
 
Joined
Apr 18, 2019
Messages
2,325 (1.15/day)
Location
Olympia, WA
System Name Sleepy Painter
Processor AMD Ryzen 5 3600
Motherboard Asus TuF Gaming X570-PLUS/WIFI
Cooling FSP Windale 6 - Passive
Memory 2x16GB F4-3600C16-16GVKC @ 16-19-21-36-58-1T
Video Card(s) MSI RX580 8GB
Storage 2x Samsung PM963 960GB nVME RAID0, Crucial BX500 1TB SATA, WD Blue 3D 2TB SATA
Display(s) Microboard 32" Curved 1080P 144hz VA w/ Freesync
Case NZXT Gamma Classic Black
Audio Device(s) Asus Xonar D1
Power Supply Rosewill 1KW on 240V@60hz
Mouse Logitech MX518 Legend
Keyboard Red Dragon K552
Software Windows 10 Enterprise 2019 LTSC 1809 17763.1757


Both SK Hynix and Samsung believe they will be able to achieve a "100% yield" with HBM4 when they begin to manufacture it. Only time will tell if the reports hold water, so take the news with a grain of salt.


100% yield, impressive claims they are so confident :D

Clearly, interposer design and manufacturing has advanced greatly, and HBM is (potentially) seeing unfathomably-good yields.

Now, with all the big players getting in on 3D-stacking cache/memory, I'm wondering if 'consumers' will see HBM on CPUs and GPUs (again).


My bet:
The Generation after the immediately-inbound generation, we'll see HBM products in the consumer marketspace.
Not just Intel/AMD/nVidia, either; I'm thinking HBM-equipped SoCs for mobile devices, and considerations for future handheld PCs and Consoles.
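
As a rough illustration of why per-die yield matters so much for stacked memory, the sketch below compounds yield across a stack. It assumes defects on each die and at each bonding step are independent and that no known-good-die screening happens before stacking; the function name `stack_yield` and the example percentages are hypothetical, not figures from SK Hynix or Samsung.

```python
def stack_yield(die_yield: float, layers: int, bond_yield: float = 1.0) -> float:
    """Yield of a full stack under a simple independence assumption.

    Every one of `layers` dies must be good, and each of the
    `layers - 1` bonding steps between adjacent dies must succeed.
    """
    if not 0.0 <= die_yield <= 1.0 or not 0.0 <= bond_yield <= 1.0:
        raise ValueError("yields must be between 0 and 1")
    if layers < 1:
        raise ValueError("a stack needs at least one layer")
    return die_yield ** layers * bond_yield ** (layers - 1)


if __name__ == "__main__":
    # Hypothetical per-die yields, shown for 8-high and 12-high stacks.
    for die in (0.90, 0.95, 0.99, 1.00):
        print(f"die yield {die:.2f}: "
              f"8-high {stack_yield(die, 8):.3f}, "
              f"12-high {stack_yield(die, 12):.3f}")
```

Under this simplified model, even small per-die losses multiply quickly in tall stacks, which is why a near-perfect yield claim would be significant if it holds up.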
 
Joined
Oct 1, 2006
Messages
4,930 (0.75/day)
Location
Hong Kong
Processor Core i7-12700k
Motherboard Z690 Aero G D4
Cooling Custom loop water, 3x 420 Rad
Video Card(s) RX 7900 XTX Phantom Gaming
Storage Plextor M10P 2TB
Display(s) InnoCN 27M2V
Case Thermaltake Level 20 XT
Audio Device(s) Soundblaster AE-5 Plus
Power Supply FSP Aurum PT 1200W
Software Windows 11 Pro 64-bit



Clearly, interposer design and manufacturing has advanced greatly, and HBM is (potentially) seeing unfathomably-good yields.

Now, with all the big players getting in on 3D-stacking cache/memory, I'm wondering if 'consumers' will see HBM on CPUs and GPUs (again).


My bet:
The Generation after the immediately-inbound generation, we'll see HBM products in the consumer marketspace.
Not just Intel/AMD/nVidia, either; I'm thinking HBM-equipped SoCs for mobile devices, and considerations for future handheld PCs and Consoles.
The issue with HBM on consumer hardware is not a technical one. It is the cost and supply constraints.
HBM demand is higher than ever right now in accelerator cards, especially with the whole AI craze going on.
 
Joined
Apr 18, 2019
Messages
2,325 (1.15/day)
Location
Olympia, WA
System Name Sleepy Painter
Processor AMD Ryzen 5 3600
Motherboard Asus TuF Gaming X570-PLUS/WIFI
Cooling FSP Windale 6 - Passive
Memory 2x16GB F4-3600C16-16GVKC @ 16-19-21-36-58-1T
Video Card(s) MSI RX580 8GB
Storage 2x Samsung PM963 960GB nVME RAID0, Crucial BX500 1TB SATA, WD Blue 3D 2TB SATA
Display(s) Microboard 32" Curved 1080P 144hz VA w/ Freesync
Case NZXT Gamma Classic Black
Audio Device(s) Asus Xonar D1
Power Supply Rosewill 1KW on 240V@60hz
Mouse Logitech MX518 Legend
Keyboard Red Dragon K552
Software Windows 10 Enterprise 2019 LTSC 1809 17763.1757
The issue with HBM on consumer hardware is not a technical one. It is the cost and supply constraints.
HBM demand is higher than ever right now in accelerator cards, especially with the whole AI craze going on.
That's why I brought up what I did. That *has* been the issue.
If yields are that high, and interposers are popping up across products we'd never thought of; meaning, better manufacturing/assembly yields...
At least historically, the consumer segment (eventually) gets the scraps from well-developed and high-yielding new technologies.


However, you do have a point.
If AI/MI 'market demand' is unprecedentedly high (and it sustains that demand); you're right, we will not see 'consumer-facing' products.

I'd like to think even then, if those boisterous yields are accurate, it'll just take a couple generations to see the "trickle-down".

Edit:

If DRAM(HBM) can 'scale better' as process nodes shrink, HBM's performance may prove sufficient to overtake SRAM. see @user556 's reply, correcting my misunderstanding.
Or, at least become a hybridized 3D/MCM affair; with SRAM(of larger nodes) and/or HBM stacked and in interposer-connected modules.

Here's a great presentation from AMD, showing conceptualization and implementation of 3D stacked cache


My mspaint kit-bash of TechSpot's HBM4 article, and AMD's 3D Cache Presentation (not to scale :laugh:)
Hybrid3D_HBM-MCM.png

NVM. Hardwaretimes' article goes over it better.
 
Joined
Apr 18, 2019
Messages
2,325 (1.15/day)
Location
Olympia, WA
System Name Sleepy Painter
Processor AMD Ryzen 5 3600
Motherboard Asus TuF Gaming X570-PLUS/WIFI
Cooling FSP Windale 6 - Passive
Memory 2x16GB F4-3600C16-16GVKC @ 16-19-21-36-58-1T
Video Card(s) MSI RX580 8GB
Storage 2x Samsung PM963 960GB nVME RAID0, Crucial BX500 1TB SATA, WD Blue 3D 2TB SATA
Display(s) Microboard 32" Curved 1080P 144hz VA w/ Freesync
Case NZXT Gamma Classic Black
Audio Device(s) Asus Xonar D1
Power Supply Rosewill 1KW on 240V@60hz
Mouse Logitech MX518 Legend
Keyboard Red Dragon K552
Software Windows 10 Enterprise 2019 LTSC 1809 17763.1757
DRAM doesn't scale as well as SRAM. Both are hitting the wall right now but the densest DRAMs are a few generations behind the cutting edge nodes that SRAM is integral to.
So, basically. Everything is going 'stacked' or into modules; out of necessity.

Thanks for the input :cool:

Are there already implementations for mixed node Multiple Patterning monolithic die chips?
 
Joined
Jun 14, 2020
Messages
3,275 (2.05/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
You make it sound like everything AMD did was trivial.
If it were so simple, Intel would have achieved this long ago, not waited a couple of generations and let AMD "steal" their thunder.
Intel is making a big deal out of this, so I would assume there were some major engineering hurdles that they overcame.
You confuse the fabs with the design. Intel are losing to tsmc, they are not able to fabricate the 3d. And neither is amd.
 
Joined
Mar 7, 2010
Messages
984 (0.18/day)
Location
Michigan
System Name Daves
Processor AMD Ryzen 3900x
Motherboard AsRock X570 Taichi
Cooling Enermax LIQMAX III 360
Memory 32 GiG Team Group B Die 3600
Video Card(s) Powercolor 5700 xt Red Devil
Storage Crucial MX 500 SSD and Intel P660 NVME 2TB for games
Display(s) Acer 144htz 27in. 2560x1440
Case Phanteks P600S
Audio Device(s) N/A
Power Supply Corsair RM 750
Mouse EVGA
Keyboard Corsair Strafe
Software Windows 10 Pro
Wasn't Pat the one who said that their new CPUs would be the end of AMD? Or am I thinking of someone else?
 
Joined
Nov 4, 2005
Messages
11,960 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Joined
Apr 12, 2013
Messages
7,473 (1.77/day)
Well smarty pants 3D stacked cache is TSMC's invention not AMD's.

Intel's approach is different BTW, and who cares, if it improves the product and improves efficiency like in AMD's case, bring it on.
Wrong, as shown by others. Anyway, EMIB is supposedly better, at least today, than what AMD had a decade or even half a decade back, especially with HBM on Vega. It's essentially the same idea: you're "enhancing" cache instead of HBM, GDDRxx or whatever for main memory. Though in the future it is likely they'll go a similar route to Apple, with stacked/soldered LPDDRxx for higher bandwidth and lower latency, among other things.
 
Joined
Jun 18, 2021
Messages
2,534 (2.06/day)
You confuse the fabs with the design. Intel are losing to tsmc, they are not able to fabricate the 3d. And neither is amd.

AMD is not able to fabricate anything since they spun off their fab business into what is now GlobalFoundries ;)

Intel has their own stacked solution as well - Foveros - they simply weren't able to do anything particularly useful with it until now. The only product I know of using it was the Lakefield CPU, which was a commercial failure; it used Foveros to stack the compute die on top of the IO (akin to what AMD does with Infinity Fabric instead).


I don't get all the grandstanding trying to downplay AMD's use of TSMC's stacked packaging for 3D cache. Yes, AMD didn't come up with the entire thing, but what's the point of 3D stacking if no one comes up with interesting ways to use it? For anyone arguing it's "just a TSMC thing they offer to anyone", where's the Qualcomm SoC with stacked cache or stacked whatever? Or Apple's? Or MediaTek's? Or Nvidia's?
 
Joined
May 3, 2018
Messages
2,881 (1.21/day)
You make it sound like everything AMD did was trivial.
If it were so simple, Intel would have achieved this long ago, not waited a couple of generations and let AMD "steal" their thunder.
Intel is making a big deal out of this, so I would assume there were some major engineering hurdles that they overcame.
Sure, let's go with that BS. AMD's v-cache is a far simpler 2 layer implementation of what TSMC developed that allowed for up to 12 layers. Never said it was trivial.

Wrong, as shown by others. Anyway, EMIB is supposedly better, at least today, than what AMD had a decade or even half a decade back, especially with HBM on Vega. It's essentially the same idea: you're "enhancing" cache instead of HBM, GDDRxx or whatever for main memory. Though in the future it is likely they'll go a similar route to Apple, with stacked/soldered LPDDRxx for higher bandwidth and lower latency, among other things.
Sure believe what you want but nothing I said is wrong.
 
Joined
Dec 12, 2020
Messages
1,755 (1.24/day)
So will this be similar to the last-level cache Intel had with the infamous i7-5775C, but larger?

Why did Intel ever give up on further development of the i7-5775C?
 