
NVIDIA Reportedly Having Issues with Samsung's HBM3 Chips Running Too Hot

TheLostSwede

News Editor
According to Reuters, NVIDIA is having some major issues with Samsung's HBM3 chips, as NVIDIA hasn't managed to finalise its validation of the chips. Reuters cites multiple sources familiar with the matter, and if the sources are correct, Samsung is having some serious issues with its HBM3 chips. Not only do the chips run hot, which is itself a big issue given that NVIDIA already has trouble cooling some of its higher-end products, but the power consumption is apparently not where it should be either. Samsung is said to have been trying to get its HBM3 and HBM3E parts validated by NVIDIA since sometime in 2023, according to Reuters' sources, which suggests that there have been issues for at least six months, if not longer.

The sources claim there are issues with both the 8- and 12-layer stacks of HBM3E parts from Samsung, suggesting that NVIDIA might only be able to source parts from Micron and SK Hynix for now, the latter of which has been supplying HBM3 chips to NVIDIA since the middle of 2022 and HBM3E chips since March of this year. It's unclear if this is a production issue at Samsung's DRAM fabs, a packaging-related issue or something else entirely. The Reuters piece goes on to speculate that Samsung has not had enough time to develop its HBM parts compared to its competitors and that it's a rushed product, but Samsung issued a statement to the publication that it's a matter of customising the product for its customers' needs. Samsung also said that it's in "the process of optimising its products through close collaboration with customers" without going into which customer(s). Samsung issued a further statement saying that "claims of failing due to heat and power consumption are not true" and that testing was going as expected.



 
Thanks to @P4-630 for the heads up.
 
Low quality post by bonehead123
sobs, tears, condolences....

poor nGreeddiya, wheh wheh wheh, gonna have to spend some of that jacket money to fix a problem that Sammy created for them, wheh, wheh, wheh....

Shoulda just stuck with TSMC... oh but wait, Apple beat them to it, wheh, wheh, wheh

 
Not like Apple has much of a choice… imagine them being forced to use Samsung fabs!!
 
LOL this has nothing to do with TSMC vs Samsung. Nvidia does not design their own HBM and they are not the only one sourcing HBM from Samsung. All these players making AI ASICs are sourcing HBM from 2-3 vendors.
 
Yeah imagine using the "fabs" of the biggest memory & NAND maker in the world :rolleyes:

I almost forgot how much Apple and Samsung get along! :p It’s one thing to buy flash memory and displays from them but entirely another to hand them the blueprints and allow them to control the manufacturing of the SOC which they directly compete against… :kookoo:
 
At this point, that's the only thing Apple ISN'T using from Samesung. They pretty much use screens, memory, RAM, and I'm pretty sure I'm forgetting something else related to Samesung.
 
Whatever the extra cost they will just happily push it downstream. lol
 
Probably explains why Samsung Electronics shuffled their leadership. Being second fiddle to SK Hynix is a bitter pill.
 
SK Hynix has always led HBM production, with better yields and volume. This is not new...
What is new is the sudden and dramatic divergence in fortunes between HBM powering the AI boom and your run-of-the-mill DRAMs, which are languishing.
 
Ol' Jensen probably trying to squeeze a price cut out of Samsung. Hey man your chips run hot, we can still work with them, but it'll cost us a pretty penny to mitigate heat, so you better slash pricing.
 
ah yes, the old "thermal density" problem is rearing its head again. Alongside probably a voltage/amperage curve problem.
 
Samsung and Intel should exchange information on how to manufacture power efficient chips........... :p
 
I suspect potential long-term degradation. What Samsung reports as working may, in DC environments with 24/7 usage over the next few years, show degradation and thus a growing number of damaged products and returns. You can only expect the highest-grade stuff when you pay 10 to 30g per unit.

The Vega 56 with a flashed 64 bios was known for that. Long term HBM would degrade.
 
AMD had the same problem with HBM2 on the Vega 64. Not too many working units these days.

Yeah, also I'd be very wary about buying any when they "fix" the issue too. Remember, as long as it lasts the warranty.
 
Trying to run the chips at too much clock speed, just slow them down lol.
 
That's not really the issue.
It is part of the problem, but SK Hynix has shaped up to be extremely efficient at producing HBM over time; Samsung and Micron simply failed to follow. They may even seem equal on specs, but not in efficiency/yields.


"According to Choi, the company's head of packaging and testing, SK Hynix's proprietary MR-MUF is a key technology in HBM packaging. MR-MUF reduces chip stacking pressure by 6%, increases productivity by fourfold by reducing process time, and improves heat dissipation by 45% compared to earlier technologies.

SK Hynix recently released an advanced MR-MUF that improves heat dissipation by 10% through the use of a new protective material while keeping the existing advantages of MR-MUF, Choi said. Advanced MR-MUF is an optimum solution for high stacking, and technology development for 16-high stacking is underway. The company plans to utilize advanced MR-MUF achieving 16-high HBM4, while preemptively reviewing hybrid bonding technology."


"SK hynix currently controls roughly 46% - 49% of HBM market, and its share is not expected to drop significantly in 2025, according to market tracking firm TrendForce. By contrast, Micron's share on HBM memory market is between 4% and 6%."

SK hynix Reports That 2025 HBM Memory Supply Has Nearly Sold Out (anandtech.com)
 
I imagine Jensen was watching SNL and he was like we need more COWBELL, er HBM.

 
Always fun to see the users who can't resist the opportunity for name calling and a good roast pop up, this seems pertinent to the topic though;
In separate statements after Reuters first published this report, Samsung said that "claims of failing due to heat and power consumption are not true," and that testing was "proceeding smoothly and as planned."
Media hungry as always to jump on any possible chance for coverage of a misstep. After all, at the top of their game, all eyes are on them and there's no doubt a crowd hungry to feast on a meal of schadenfreude, as this thread demonstrates.
 
Hey. We come here for tech power-ups, and to no surprise, we get tech power-ups here.
 