Thursday, May 3rd 2018

TSMC to Bring 3D Stacked Wafers to Complex Silicon Designs, Such as GPUs

TSMC is close to adapting 3D stacked silicon wafers to complex silicon designs, such as graphics processors, using its new proprietary Wafer-on-Wafer (WoW) advanced packaging technology, which will be introduced with its 7 nm+ and 5 nm nodes. 3D stacked silicon fabrication is currently implemented only on "less complex" silicon designs, such as NAND flash, which don't run anywhere near as hot as complex ASIC designs, such as GPUs or CPUs. In its current form, TSMC has achieved 2-layer stacks, in which two silicon layers that are "mirror images" of each other (for perfect alignment) sandwich the bonding layers, through which the pins for the upper layer pass.

The bonding of the two layers is where the bulk of TSMC's innovations and "secret sauces" lie. For 3D NAND flash, multiple pancaked dies are wired out through their edges. You don't need as many pins to talk to a NAND flash die as you do to talk to, say, a GPU die. For complex dies, designers have to pass thousands of pins through the "bottom layer," the connecting substrate, and eventually to the "top layer." The bottom layer is hence bumped out on both sides, with the bottom side interfacing with the package substrate for both dies, and the top side serving as a sort of substrate for the top die. The vertical connections through the bottom layer are what TSMC calls "thru-silicon-vias," or TSVs.
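To see why edge wiring is enough for NAND flash but not for a GPU, it helps to compare how many connections fit along a die's edges with how many fit across its whole face. The following is a rough sketch only: the die size and the pitch are hypothetical round numbers chosen for easy arithmetic, not figures from TSMC or the Cadence post, and the function names are made up for illustration.

```python
def edge_connections(die_side_um: int, pitch_um: int) -> int:
    """Connections in a single row along all four edges of a square die.

    Corner overlap is ignored, so this is a slight overestimate.
    """
    per_edge = die_side_um // pitch_um
    return 4 * per_edge


def area_connections(die_side_um: int, pitch_um: int) -> int:
    """Connections in a full grid across the face of a square die."""
    per_side = die_side_um // pitch_um
    return per_side * per_side


# Hypothetical example: a 10 mm x 10 mm die with a 100 um pitch.
DIE_SIDE_UM = 10_000
PITCH_UM = 100

edge = edge_connections(DIE_SIDE_UM, PITCH_UM)
area = area_connections(DIE_SIDE_UM, PITCH_UM)
print(f"edge-wired: {edge} connections, area array: {area} connections")
```

Edge wiring grows with the perimeter while an area array grows with the surface, so under these assumed numbers the grid offers 25 times as many connections. That gap is the reason a design needing thousands of pins has to route them vertically through the bottom die.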
WoW (wafer on wafer) is different from package-on-package or PoP (the way SoCs and DRAM packages are mated inside cellphones), in which two complete packages are wired out either concentrically on the PCB, or with pins on top of the SoC package interfacing with the DRAM package. The DRAM package needs fewer pins than the SoC, so it's more convenient to have it on top. A WoW die stack sits inside a single package, and offers double the die area of a planar single-layer die. The bonding layers, TSMC's other key innovation, not only help mate the two dies, but also help with thermal conductivity. There's a division of labor between the two dies. The bottom layer has to bear the wiring of both dies, while the top layer has to dissipate heat from both dies. In this regard, the top layer gets some help from the fact that it has blank areas (where the bottom layer would normally have bumps to the package substrate).
Source: Cadence Blog
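The thermal side of that division of labor can be put in rough numbers. The sketch below assumes two identical dies and compares the heat flux through the package footprint when they sit side by side versus stacked. The wattage and area are hypothetical placeholders, and the model ignores the bonding layers, heat spreading, and hot spots entirely.

```python
def power_density_w_per_mm2(total_power_w: float, footprint_mm2: float) -> float:
    """Average heat flux through a given footprint."""
    return total_power_w / footprint_mm2


def planar_vs_stacked(power_per_die_w: float, die_area_mm2: float, layers: int = 2):
    """Return (planar, stacked) average power density for `layers` identical dies.

    Planar: the dies sit side by side, so the footprint grows with the die count.
    Stacked: the dies share one footprint, so all heat exits through a single die area.
    """
    total_power_w = layers * power_per_die_w
    planar = power_density_w_per_mm2(total_power_w, layers * die_area_mm2)
    stacked = power_density_w_per_mm2(total_power_w, die_area_mm2)
    return planar, stacked


# Hypothetical example: two 50 W dies of 100 mm^2 each.
planar, stacked = planar_vs_stacked(power_per_die_w=50.0, die_area_mm2=100.0)
print(f"planar: {planar} W/mm^2, stacked: {stacked} W/mm^2")
```

With these assumed values the stacked arrangement has to move twice the heat through each square millimeter of the top die, which is why the thermal conductivity of the bonding layers matters so much.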

8 Comments on TSMC to Bring 3D Stacked Wafers to Complex Silicon Designs, Such as GPUs

#1
Vayra86
Interesting. I'm very curious how these will handle the heat in the end, and whether any form of overclocking is still going to be in the cards. I'm getting the idea that the TDP allowance is going to lose a lot of headroom...
#2
BiggieShady
Vayra86 wrote: "Interesting, very curious how these will handle the heat in the end and whether any form of overclocking is still going to be in the cards. Getting the idea that the TDP allowance is going to lose a lot of headroom..."
Fill the whole bottom floor with cache, and put the rest of the CPU on top? ... Or, with GPUs, memory controllers, ROPs and cache on the bottom floor, shader cores on the top.
#3
blindwrite
It looks more like hybrid bonding and TSVs combined (both already largely proven). Nice to see support from EDA vendors.
#4
evernessince
BiggieShady wrote: "Fill the whole bottom floor with cache, and put the rest of the cpu on top? ... or with gpus, memory controllers, rops and cache on the bottom floor, shader cores on the top"
Latency is an issue. Cache requires extremely low latency, so you cannot put it on a separate die. AMD experiences up to 250 ns of latency die to die. For something as commonly used as cache, putting it on a separate die would completely destroy performance. There is a reason cache sits right next to the CPU cores. The same goes for GPUs.
#5
BiggieShady
evernessince wrote: "Latency is an issue. You cannot put cache on a separate die, which requires extremely low latency. AMD experiences up to 250ns of latency die to die. For something as commonly used as cache, putting it on a separate die would completely destroy performance. There is a reason cache is right next to the CPU. Ditto goes for GPUs as well."
That high? ... What's the point then? What wouldn't suffer from 250 ns of latency? RAM access from a separate NUMA node is in the range of 100 ns.
#6
Vayra86
BiggieShady wrote: "Fill the whole bottom floor with cache, and put the rest of the cpu on top? ... or with gpus, memory controllers, rops and cache on the bottom floor, shader cores on the top"
That's why I'm intrigued. This will vastly complicate any kind of overclocking. It's no longer a flat surface that gets direct cooling; temps will vary wildly throughout the chip, and you only need one hot spot to limit the potential of the entire thing.
#7
TheGuruStud
Vayra86 wrote: "That's why I'm intrigued, this will vastly complicate any kind of overclocking, it's no longer a flat surface that gets direct cooling, but temps will vary wildly throughout the chip and you only need one bit to limit potential of the entire thing."
Yeah, I doubt this can work for anything over 50 watts.
#8
Pap1er
TheGuruStud wrote: "Yeah, I doubt this can work for anything over 50 watts."
I assume that making a matrix of a few smaller chips with low power dissipation would make sense. Cooling a grid of chips might also be easier due to the larger dissipation area.
You know, like AMD already did with its latest CPUs...