
GDDR6x heat issues and rising TBP: why not HBM2?

I'd think it would make more sense for Nvidia than AMD because Nvidia is the one fighting higher and higher TBP.

According to this article (and the accompanying video):
https://www.extremetech.com/computing/289391-hbm2-vs-gddr6-new-video-compares-contrasts-memory-types

"Data rates on GDDR6 are much higher per-pin, but there are far fewer pins overall. The amount of area dedicated to the PHY (the circuitry required to implement the actual physical connection) and the power costs required. The PHY area is 1.5x – 1.75x larger, while the power cost can be 3.5x – 4.5x higher. (i.e. than HBM2)"

HBM2 has higher bandwidth and much less power overhead, so why keep beating the power-hungry horse that is GDDR?
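
Rough back-of-the-envelope math on the "higher per-pin rate, far fewer pins" point, using the RTX 3080 and Radeon VII as examples (public spec numbers; the script is just illustrative arithmetic, and the power multiplier is simply the article's figure applied naively):

[CODE]
# Peak bandwidth = per-pin data rate (Gb/s) x bus width (bits) / 8 bits-per-byte.
def peak_bw_gb_s(per_pin_gbps: float, bus_width_bits: int) -> float:
    return per_pin_gbps * bus_width_bits / 8

# GDDR6X on an RTX 3080: 19 Gb/s per pin over a 320-bit bus.
gddr6x = peak_bw_gb_s(19.0, 320)        # -> 760 GB/s
# HBM2 on a Radeon VII: 2 Gb/s per pin over 4 stacks x 1024 bits.
hbm2 = peak_bw_gb_s(2.0, 4 * 1024)      # -> 1024 GB/s

print(f"GDDR6X: {gddr6x:.0f} GB/s across 320 data pins")
print(f"HBM2:   {hbm2:.0f} GB/s across 4096 data pins")
# Similar bandwidth, but each GDDR6X pin runs ~10x faster, and the article's
# 3.5x-4.5x figure is the power-per-bit price paid for that.
[/CODE]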
 
exactly. Plus they fail too. sooo many dead AMD VEGA cards...
Interesting, every Vega card I had is still running fine
 
Interesting, every Vega card I had is still running fine
I'm not sure if it's just gaming, but any Vega / Radeon VII that was mining pretty much died within 2 years. Now, miners aren't known to be nice to video cards... so heat is probably the overall issue. After all, the HBM sits on the same package as the die. Lots of heat generated in a small area.
 
I'd think it would make more sense for Nvidia than AMD because Nvidia is the one fighting higher and higher TBP.

According to this article (and the accompanying video):
https://www.extremetech.com/computing/289391-hbm2-vs-gddr6-new-video-compares-contrasts-memory-types

"Data rates on GDDR6 are much higher per-pin, but there are far fewer pins overall. The amount of area dedicated to the PHY (the circuitry required to implement the actual physical connection) and the power costs required. The PHY area is 1.5x – 1.75x larger, while the power cost can be 3.5x – 4.5x higher. (i.e. than HBM2)"

HBM2 has higher bandwidth and much less power overhead, so why keep beating the power-hungry horse that is GDDR?

If you use HBM, you can't really separate the VRAM from the core anymore. It all goes as one unit, you need an interposer (if you're Intel then you might use additional expensive tech like EMIB to increase perf), package size gets even bigger (as if Ampere wasn't big enough), and stuff gets expensive/complex real fast. With manufacturing on the expensive processes we're used to now, companies try to reduce die size, not increase it.

Then there's the usual challenge of ensuring a level contact surface across the HBM stacks and the core.

Did you mean TDP?

Total Board Power, because on GDDR6X cards the memory draws a lot more power
 
I'm not sure if it's just gaming, but any Vega / Radeon VII that was mining pretty much died within 2 years. Now, miners aren't known to be nice to video cards... so heat is probably the overall issue. After all, the HBM sits on the same package as the die. Lots of heat generated in a small area.
If I was going to use a Vega card for mining, it was a necessity to re-paste it every 3-6 months due to the concentrated heat. Having said that, that was also the beauty of water-cooling Vega.
 
If you use HBM, you can't really separate the VRAM from the core anymore. It all goes as one unit, you need an interposer (if you're Intel then you might use additional expensive tech like EMIB to increase perf), package size gets even bigger (as if Ampere wasn't big enough), and stuff gets expensive/complex real fast. With manufacturing on the expensive processes we're used to now, companies try to reduce die size, not increase it.
Wouldn't that be mitigated by a chiplet-based approach? Heck, AMD used GF 12nm I/O dies paired with TSMC 7nm CCDs. An HBM-powered, MCM-style GPU may have slightly higher latency if integrated into the package vs. using an interposer, but that would solve the extra costs from interposer + expensive process + reduced yields. However, I reckon one of the most expensive changes would remain - a massive memory bus.
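
For a sense of scale on that "massive memory bus", here's a crude data-pin count (a sketch that ignores command/address/clock pins; the chip and stack counts are typical examples, not a specific SKU):

[CODE]
# Data-pin count only (ignores command/address/clock/power pins).
gddr6_chips = 12                       # a 384-bit card: 12 chips x 32 data pins
gddr6_data_pins = gddr6_chips * 32     # 384 pins, routed as PCB traces

hbm_stacks = 4                         # 4 stacks x 1024-bit interfaces
hbm_data_pins = hbm_stacks * 1024      # 4096 pins, only practical as microbumps

print(f"GDDR6, 384-bit: {gddr6_data_pins} data pins on the PCB")
print(f"HBM2, 4 stacks: {hbm_data_pins} data pins on the package")
# That ~10x pin count is why HBM needs an interposer or EMIB-class routing
# rather than plain organic-substrate or PCB fan-out.
[/CODE]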
 
Wouldn't that be mitigated by a chiplet-based approach? Heck, AMD used GF 12nm I/O dies paired with TSMC 7nm CCDs. An HBM-powered, MCM-style GPU may have slightly higher latency if integrated into the package vs. using an interposer, but that would solve the extra costs from interposer + expensive process + reduced yields. However, I reckon one of the most expensive changes would remain - a massive memory bus.

And how does that make packaging any smaller, easier or less expensive in any way? :confused:

The point of an interposer (fast but expensive) or EMIB (less expensive) is to accommodate bandwidth and latency needs. Kaby Lake-G used early EMIB for that reason. The substrate-based MCM we have right now (AM4) is crap compared to the kind of performance the GDDR6X GPUs need. AMD does not yet use chiplets and links over substrate for any GPU products, and AM5's chiplet "iGPU" performance is not worth talking about.

Maybe that will change when AMD brings MCM to RDNA and if they introduce new Fabric tech for AM5, but that still doesn't change the other packaging obstacles of HBM.
 
And how does that make packaging any smaller, easier or less expensive in any way? :confused:
Well, I was thinking that it would reduce the need for the following:

1. Massive individual dice
2. Building the VRAM on the same node as the GPU
3. Yield losses due to increased complexity
4. Overdependence on a single supplier.
The substrate-based MCM we have right now (AM4) is crap compared to the kind of performance GPUs need.
Can you quantify this? I realize that the use case is different, but I can't see how an on-package link would perform worse than PCB-mounted GDDR chips, while also helping to reduce costs.

Another aspect - board sizes would shrink considerably, as would costs, due to the lower number of traces for the memory chips.
A drop in the bucket, but not intangible.
 

But this is what you should be getting at today's cost. Look at what the Vega cards cost at the time & it should be cheaper now. GDDR6(X) must be cheaper, so bigger profit for the companies. Would most users now be happy to pay 800 to 1400 USD to switch to HBM, compared to what they are paying now for GDDR6(X)? This is why I have never switched. ...R9 Nano/Vega Nano user.

exactly. Plus they fail too. sooo many dead AMD VEGA cards...

That's because you have more on one package that can fail. The more you stuff & integrate into a single package & then overclock it, the more likely something will fail. Does this ring a bell? The latest processors from AMD with 3D-stacked memory had all overclocking removed from the BIOS.

EDIT: @OP Why not HBM3? Then you will not need to overclock it.
 
3. Yield losses due to increased complexity
4. Overdependence on a single supplier.

Wasn't Vega 56/64 dual sourced?

The HBM itself isn't getting magically cheaper or less complex either. It's still stacks connected with TSVs. Delicate/complex/expensive however you cut it; G6X is still just a normal-ass memory module.

Can you quantify this? I realize that the use case is different, but I can't see how an on-package link would perform worse than PCB-mounted GDDR chips, while also helping to reduce costs.

At the vaguest level, I don't think all these companies would invest so much in novel fast interconnects if ol' IFOP were a fast enough solution for everything.

As for cost, everything is expensive with HBM, so I doubt removing the interposer makes it cost-effective compared to G6 or G6X packages.

The current IFOP links haven't changed since they were introduced. They're narrow and were designed to be cost-effective (what was the quoted spec, like 40 GB/s at a reasonable Fabric clock?). As for latency, well, we all know at this point: it appears to be at parity with regular Fabric now, but only for 1 CCD; add a second CCD and it's still a big hit. And regular Fabric itself was also designed more for cost-effectiveness than for speed...
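
For scale, here's the toy math behind why over-substrate links like IFOP don't cut it, assuming the roughly 40 GB/s per-link figure above (actual IFOP bandwidth depends on Fabric clock, so treat this as order-of-magnitude only):

[CODE]
# How many IFOP-class links to match a GDDR6X card's memory bandwidth?
gpu_mem_bw_gb_s = 760     # e.g. RTX 3080-class memory bandwidth
ifop_link_gb_s = 40       # rough per-link figure quoted above

links_needed = gpu_mem_bw_gb_s / ifop_link_gb_s
print(f"~{links_needed:.0f} IFOP-class links to feed {gpu_mem_bw_gb_s} GB/s")
# ~19 links of over-substrate wiring is a non-starter, hence interposers,
# EMIB and other dense fan-out for anything that has to feed a big GPU.
[/CODE]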

In any case, AMD isn't the one using G6X, and Nvidia seems like the kind of company that would look to a new memory technology instead of MCM to solve G6X's problems.

Another aspect - board sizes would shrink considerably

Would they? The Vega and VII PCBs were regular size, except for the rare ones that looked like they reused the R9 Nano PCB. The GDDR6X FEs have some of the tiniest PCBs.

I'm not sure what kind of advantage a smaller board size offers. The cards still need to be honking big to be cooled, and plenty of GPUs from both camps already have small PCBs to allow flow-through. If board size were that much of an incentive, I figure more AIBs would already have taken the FE approach.

I feel like if G6X just came with a metal package (think Fermi or A12X IHS), most of these problems would be solved.
 
@tabascosauz
So your proposal would be that GDDR6X dies would have an IHS instead of being buried in a substrate?

No, the IHS suggestion is for G6X thermal issues. Someone already ghetto-modded a 3070 Ti with copper heatspreaders for all the G6X packages, to great effect. Same idea as "thermally enhanced" power stages (Intersil ISL99227B) and metal-package MOSFETs (IR DirectFET, etc.). When something has a plastic package, it's thermally resistive and it's easier to dissipate heat through the back (the PCB) instead.

"Buried in a substrate" refers to AMD's IFOP (Infinity Fabric On-Package), which is what connects the dies in their chiplet products; it doesn't have anything to do with HBM. HBM is usually connected with either an interposer or EMIB.
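
The IHS point boils down to junction-to-case thermal resistance. Here's a toy ΔT = P × Rθ calc; the resistance values are made-up placeholders, not GDDR6X datasheet numbers, purely to show why a metal lid or heatspreader helps:

[CODE]
# Temperature rise ~= power x thermal resistance (delta_T = P * R_theta).
module_power_w = 2.5                 # assumed per-module dissipation
r_theta = {
    "plastic package": 8.0,          # placeholder junction-to-case, C/W
    "metal lid / IHS ": 2.0,         # placeholder junction-to-case, C/W
}

for package, r in r_theta.items():
    print(f"{package}: ~{module_power_w * r:.0f} C rise, junction to heatsink")
# Lower junction-to-case resistance lets the cooler pull heat out of the top
# of the package instead of it soaking back into the PCB.
[/CODE]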
 
@tabascosauz
So that would help to eliminate the heat problems of GDDR6X parts, but it wouldn't reduce the power requirements vis-à-vis HBM2.

It's depressing that AMD has completely given up on HBM2 for the consumer market; at least they were willing to try something different for VRAM.
 