
NVIDIA H100 AI Performance Receives up to 54% Uplift with Optimizations

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,254 (0.92/day)
On Wednesday, the MLCommons team released the MLPerf 3.0 Inference results, which included an exciting submission from NVIDIA. Reportedly, NVIDIA has used software optimization to improve the already staggering performance of its latest H100 GPU by up to 54%. For reference, NVIDIA's H100 GPU first appeared in MLPerf 2.1 back in September of 2022. In the six months since, NVIDIA engineers worked on AI optimizations for the MLPerf 3.0 release and found that software optimization alone can raise performance by anywhere from 7% to 54%, depending on the workload. The workloads in the inference suite included RNN-T speech recognition, 3D U-Net medical imaging, RetinaNet object detection, ResNet-50 object classification, DLRM recommendation, and BERT 99/99.9% natural language processing.
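
The uplift figures quoted above are relative throughput gains between two submissions. A minimal sketch of that arithmetic follows, assuming a hypothetical helper `uplift_percent` and made-up throughput values chosen only for illustration (they are not taken from the MLPerf results):

```python
def uplift_percent(baseline: float, optimized: float) -> float:
    """Return the relative gain of `optimized` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (optimized - baseline) / baseline * 100.0


# Hypothetical throughputs in samples per second, for illustration only.
baseline_throughput = 1000.0
optimized_throughput = 1540.0

gain = uplift_percent(baseline_throughput, optimized_throughput)
print(f"{gain:.0f}% uplift")
```

The same function applies to any pair of throughput numbers measured on the same workload and scenario; comparing across different workloads would not be meaningful.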

What is interesting is that NVIDIA's submission is a bit modified. MLPerf has open and closed categories in which vendors can compete. The closed category requires submissions to run a model that is mathematically equivalent to the reference neural network, while the open category is flexible and allows vendors to submit results based on optimizations for their hardware. The closed submission aims to provide an "apples-to-apples" hardware comparison. Given that NVIDIA opted to use the closed category, the performance optimizations of other vendors such as Intel and Qualcomm are not accounted for here. Still, it is interesting that optimization can lead to a performance increase of up to 54% in NVIDIA's case with its H100 GPU. Another interesting takeaway is that some comparable hardware, like Qualcomm Cloud AI 100, Intel Xeon Platinum 8480+, and NeuChips's RecAccel N3000, failed to finish all the workloads. These cases are shown as "X" on the slides made by NVIDIA, stressing the need for proper ML system software support, which is NVIDIA's strength and a claim it markets extensively.



 
Joined
Jan 5, 2006
Messages
17,966 (2.68/day)
System Name AlderLake / Laptop
Processor Intel i7 12700K P-Cores @ 5Ghz / Intel i3 7100U
Motherboard Gigabyte Z690 Aorus Master / HP 83A3 (U3E1)
Cooling Noctua NH-U12A 2 fans + Thermal Grizzly Kryonaut Extreme + 5 case fans / Fan
Memory 32GB DDR5 Corsair Dominator Platinum RGB 6000MT/s CL36 / 8GB DDR4 HyperX CL13
Video Card(s) MSI RTX 2070 Super Gaming X Trio / Intel HD620
Storage Samsung 980 Pro 1TB + 970 Evo 500GB + 850 Pro 512GB + 860 Evo 1TB x2 / Samsung 256GB M.2 SSD
Display(s) 23.8" Dell S2417DG 165Hz G-Sync 1440p / 14" 1080p IPS Glossy
Case Be quiet! Silent Base 600 - Window / HP Pavilion
Audio Device(s) Panasonic SA-PMX94 / Realtek onboard + B&O speaker system / Harman Kardon Go + Play / Logitech G533
Power Supply Seasonic Focus Plus Gold 750W / Powerbrick
Mouse Logitech MX Anywhere 2 Laser wireless / Logitech M330 wireless
Keyboard RAPOO E9270P Black 5GHz wireless / HP backlit
Software Windows 11 / Windows 10
Benchmark Scores Cinebench R23 (Single Core) 1936 @ stock Cinebench R23 (Multi Core) 23006 @ stock
Does it play The Last of Us?..... :D
 
Joined
Sep 1, 2020
Messages
2,057 (1.52/day)
Location
Bulgaria
Screenshot_2023-04-06-19-58-02-93_40deb401b9ffe8e1df2f1cc5ba480b12.jpg
What is this element?
 
Joined
Nov 4, 2005
Messages
11,733 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.

Are the results not online yet? According to the official website, the Qualcomm AI100 was processing 124K images per second vs. 108K per second for the H100.
 
Joined
Sep 10, 2018
Messages
5,536 (2.67/day)
Location
California
System Name His & Hers
Processor R7 5800X/ R9 5950X Stock
Motherboard X570 Aorus Master/ROG Crosshair VIII Hero
Cooling Corsair h150 elite/ Corsair h115i Platinum
Memory 32 GB 4x8GB 4000CL15 Trident Z Royal/ 32 GB 3200 CL14 @3800 CL16 Team T Force Nighthawk
Video Card(s) Evga FTW 3 Ultra 3080ti/ Gigabyte Gaming OC 4090
Storage lots of SSD.
Display(s) LG G2 65/LG C1 48/ LG 27GP850/ MSI 27 inch VA panel 1440p165hz
Case 011 Dynamic XL/ Phanteks Evolv X
Audio Device(s) Arctis Pro + gaming Dac/ Corsair sp 2500/ Logitech G560/Samsung Q990B
Power Supply Seasonic Ultra Prime Titanium 1000w/850w
Mouse Logitech G502 Lightspeed/ Logitech G Pro Hero.
Keyboard Corsair K95 RGB Platinum/ Logitech G Pro
Joined
Apr 13, 2022
Messages
995 (1.30/day)
They should rename it to The Last of 8GB Cards.

Any card under 12 GB is already outdated by current consoles. Welcome to PC! The second-tier, red-headed stepchild of gaming land, with 1080p, 60 Hz, and mid or low details.
 
Joined
Jun 22, 2006
Messages
1,053 (0.16/day)
System Name Beaver's Build
Processor AMD Ryzen 9 5950X
Motherboard Asus ROG Crosshair VIII Hero (WI-FI) - X570
Cooling Corsair H115i RGB PLATINUM 97 CFM Liquid
Memory G.Skill Trident Z Neo 32 GB (2 x 16 GB) DDR4-3600 Memory - 16-19-19-39
Video Card(s) NVIDIA GeForce RTX 4090 Founders Edition
Storage Inland 1TB NVMe M.2 (Phison E12) / Samsung 950 Pro M.2 NVMe 512G / WD Black 6TB - 256M cache
Display(s) Alienware AW3225QF 32" 4K 240 Hz OLED
Case Fractal Design Design Define R6 USB-C
Audio Device(s) Focusrite 2i4 USB Audio Interface
Power Supply SuperFlower LEADEX TITANIUM 1600W
Mouse Razer DeathAdder V2
Keyboard Razer Cynosa V2 (Membrane)
Software Microsoft Windows 10 Pro x64
Benchmark Scores 3dmark = https://www.3dmark.com/spy/32087054 Cinebench R15 = 4038 Cinebench R20 = 9210
A finger to show the size, or if you want to have a bit of fun it's a tiny cock from someone in a leather jacket.
I see an entire humanoid figure holding a smartphone to take the picture and not just a finger as highlighted
 