
NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory

AleksandarK
News Editor, Staff member
During its GTC 2022 keynote, NVIDIA announced the newest addition to its accelerator card family. Called the NVIDIA H100 accelerator, it is the company's most powerful creation ever. Built on TSMC's 4N (4 nm) process and packing 80 billion transistors, the H100 delivers some insane performance figures, according to NVIDIA. Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase over A100 Tensor Cores and a two-fold MMA (Matrix Multiply Accumulate) improvement. Additionally, new DPX instructions accelerate dynamic programming algorithms by up to seven times over the previous A100 accelerator. Thanks to the new Hopper architecture, the Streaming Multiprocessor (SM) structure has been optimized for better transfer of large data blocks.
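For readers unfamiliar with the term, the DPX instructions target the min/add (or max/add) recurrences at the heart of dynamic programming workloads such as Floyd-Warshall and Smith-Waterman. A minimal sketch of that recurrence pattern, in plain Python rather than actual DPX intrinsics, using Floyd-Warshall as a representative example:

```python
# Sketch of the min/add dynamic-programming recurrence that Hopper's DPX
# instructions accelerate in hardware, shown as plain CPU-side Python.
# Floyd-Warshall (all-pairs shortest paths) is a representative DP workload.
INF = float("inf")

def floyd_warshall(dist):
    """dist: n x n matrix of edge weights (INF where no direct edge)."""
    n = len(dist)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Core DP relaxation: add two terms, keep the minimum.
                # This fused add+min is the pattern DPX speeds up on H100.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```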

The full GH100 chip implementation features 144 SMs with 128 FP32 CUDA cores per SM, for 18,432 CUDA cores in the maximum configuration. The NVIDIA H100 GPU in the SXM5 board form factor features 132 SMs, totaling 16,896 CUDA cores, while the PCIe 5.0 add-in card has 114 SMs, totaling 14,592 CUDA cores. As much as 80 GB of HBM3 memory surrounds the GPU at 3 TB/s of bandwidth. Interestingly, the SXM5 variant carries a very large TDP of 700 W, while the PCIe card is limited to 350 W, a result of the better cooling solutions available for the SXM form factor. As far as performance figures are concerned, the SXM and PCIe versions each come with their own set of numbers. You can check out the performance estimates in various precision modes below, and read more about the Hopper architecture and what makes it special in this whitepaper published by NVIDIA.
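The quoted core counts follow directly from the per-SM figure; a quick sanity check of the arithmetic:

```python
# Sanity check of the CUDA-core counts quoted above:
# each Hopper SM carries 128 FP32 CUDA cores.
CORES_PER_SM = 128

configs = {
    "GH100 (full die)": 144,  # SM counts from NVIDIA's announcement
    "H100 SXM5": 132,
    "H100 PCIe": 114,
}

for name, sms in configs.items():
    print(f"{name}: {sms} SMs -> {sms * CORES_PER_SM} CUDA cores")
```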


View at TechPowerUp Main Site
 
I've become so numb, I can't feel you (referring to new launches) there...
 
yet no gpu availability for the average consumer

is this a fucking joke at this point?????????????????
 
N4 is an enhanced N5, with roughly 6% smaller die area via an optical shrink and lower complexity via mask-cost reduction, so it is essentially a cheaper N5-plus.
 
yet no gpu availability for the average consumer

is this a fucking joke at this point?????????????????
There's GPU availability now. The higher-end 3000 series is still inflated price-wise, but the 3070 Ti can now be had for under $900 and is in stock. What was the "supposed" launch price? $600, before all the OEMs said prices were going up regardless? We're almost there ...
 
yet no gpu availability for the average consumer

is this a fucking joke at this point?????????????????
No, you just aren't first priority and haven't been for some time.
 
yet no gpu availability for the average consumer

is this a fucking joke at this point?????????????????
I blame all the idiot companies putting "smart features" into toasters, flip flops, fridges, toilets and fragrance dispensers as much for the chip shortages as anyone.
 
Not really; the current generation uses TSMC 7 nm for the A100 and Samsung 8 nm for the RTX 3000 series, so we don't know yet what will happen with the RTX 4000 series.
It's 5 nm for Lovelace, at least for the higher-end 4070-4090 range. RDNA3 is also 5 nm at the high end, 5 nm and 6 nm for the mid-range, and 6 nm for the lower end.
 
yet no gpu availability for the average consumer

is this a fucking joke at this point?????????????????
The 30 series has been available for months. The problem has been the price.
 
It's weird that the market that was Nvidia's most profitable for a long time is not a priority. lol
Hasn't been first for nearly a decade. Profitability is a fickle thing.
 
It's 5 nm for Lovelace, at least for the higher-end 4070-4090 range. RDNA3 is also 5 nm at the high end, 5 nm and 6 nm for the mid-range, and 6 nm for the lower end.
I've seen the leak claiming the 4060-4090 are on 5 nm as well, with the AMD lineup split between 5 nm and 6 nm because the former is RDNA3 and the latter Navi 2x, so a different situation from NVIDIA's. All in all it's just a leak and we'll have to wait for official announcements.
 
I really hope this is not indicative of desktop Ada Lovelace power consumption, though "I have a baaad feeling about this". :wtf:
 
For -20% performance (48 vs. 60 FP64 Tensor TFLOPS), the power goes from 700 W (SXM, NVLink) down to 350 W (PCI-Express 5.0).
I really hate the increase in TDP, but it isn't only Nvidia; it seems to be industry-wide due to process advancements.
Regarding Ada Lovelace, I expect the $499-$449 part (cut-down AD104, 5 nm) to have around +15% performance/W vs. Navi 33 (6 nm).
I read somewhere that AD106 & AD107 are 6 nm, not 5 nm, but I don't know if it's true.
Edit: it seems too big for 4 nm (only -6% logic density scaling vs. 5 nm) for only 80 billion transistors. I'm way off with my calculations, nearly 100 mm² off, enough to house 240 MB of L3 cache, so surely I'm doing something wrong.
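One likely source of the discrepancy in the calculation above: peak logic density figures don't apply to a whole die, since SRAM and analog/IO scale much worse than logic. Working backwards from the widely reported GH100 die size (~814 mm², an assumption here, not from the article) shows how low the blended density actually is:

```python
# Back-of-the-envelope check of the die-size puzzle above.
# Dividing H100's 80 billion transistors by the widely reported ~814 mm^2
# GH100 die area gives the *blended* density actually achieved, which is
# far below peak N5/N4 logic density (SRAM and analog scale worse).
transistors = 80e9      # from NVIDIA's announcement
die_area_mm2 = 814.0    # widely reported GH100 die size (assumption)

blended_density = transistors / die_area_mm2 / 1e6  # MTr per mm^2
print(f"Implied blended density: {blended_density:.0f} MTr/mm^2")
```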
 
Have you ever heard of supply and demand?
Scalpers and massive mining operations hoarding cards? Yeah, heard about those.
 
"Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase compared to A100 Tensor Cores and a two-fold MMA (Matrix Multiply Accumulate) improvement."

Damn, I'll bet this performance monster can do 8K with no problem. I'm sure that the high end cards will also be reassuringly unaffordable, making any reviews academic.
 