
Unwrapping the NVIDIA B200 and GB200 AI GPU Announcements

The consumer version of that chip will probably be very different rather than a simple cut-down. Those datacenter GPUs are generally pretty bad for gaming.
Also, the x100 variants lack display outputs, as they are meant to be used as accelerators - even the PCIe variants.
 
FP4! I dread to think of the type I and type II errors that can occur with ultra-low-precision, nibble-sized AI inference. It is such a blunt tool. If it's a nail, it will work. If it's a screw, it won't. And will the "users" of the AI output have any clue?
4-bit inference is old hat at this point; it's commonly used to shrink LLM parameter sets enough to run on client GPUs. The networks are fine-tuned (i.e., re-trained) to operate at this precision, precisely to minimize the additional error.
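For anyone curious what that actually looks like, here's a minimal NumPy sketch of the basic idea: symmetric per-tensor round-to-nearest quantization into the signed 4-bit range [-8, 7]. Real toolchains use finer-grained grouping and calibration, and all the names here are just illustrative.

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7]."""
    scale = np.max(np.abs(weights)) / 7.0 + 1e-8  # map the largest weight to +/-7
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate fp32 weights for use in the matmul."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int4(w)
print(np.abs(w - dequantize(q, s)).max())  # worst-case rounding error per weight
```

The rounding error this introduces is exactly what the quantization-aware fine-tuning pass is trained to compensate for.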

A recent paper making waves proposes "1.58-bit" inference (i.e., single-digit ternary arithmetic, with every weight restricted to -1, 0, or +1).
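As I understand the scheme in that paper (BitNet b1.58, if that's the one meant), weights are scaled by their mean absolute value and then rounded to the nearest of {-1, 0, +1}. A rough sketch, with names of my own choosing:

```python
import numpy as np

def ternarize(weights: np.ndarray):
    """Absmean ternary quantization: each weight becomes -1, 0, or +1.
    log2(3) ~= 1.58 bits of information per weight, hence the name."""
    scale = np.mean(np.abs(weights)) + 1e-8  # avoid divide-by-zero
    t = np.clip(np.round(weights / scale), -1, 1).astype(np.int8)
    return t, scale

w = np.random.randn(4, 4).astype(np.float32)
t, s = ternarize(w)
print(t)  # matmuls against ternary weights reduce to additions/subtractions
```

The appeal is that with ternary weights the matrix multiply needs no multiplications at all, just adds and subtracts, which is why hardware people are paying attention.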
 
The biggest surprise is the use of the N4P node. I thought for sure Nvidia would be on 3nm by now, at least for chips costing $20k+.
This does not bode well for the RTX 5000 series. I very much doubt those will use 3nm either.
I don't know, I think they could still get some sizable gains out of reworking the architecture alone. The 780 Ti and 980 Ti shared the same TSMC 28nm node, yet GM200 easily bested GK110B. This launch could still be disappointing, but there is precedent for Jensen's team pulling a rabbit out of their collective hat while using the same lithography.

780 Ti:
[attached spec screenshot]

980 Ti:
[attached spec screenshot]
 