
NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
37,666 (8.51/day)
Location
Hyderabad, India
Processor AMD Ryzen 7 2700X
Motherboard ASUS ROG Strix B450-E Gaming
Cooling AMD Wraith Prism
Memory 2x 16GB Corsair Vengeance LPX DDR4-3000
Video Card(s) AMD Radeon RX 5700 XT
Storage Western Digital Black NVMe 512GB
Display(s) Samsung U28D590 28-inch 4K UHD
Case Corsair Carbide 100R
Audio Device(s) Creative Sound Blaster Recon3D PCIe
Power Supply Antec EarthWatts Pro Gold 750W
Mouse Razer Abyssus
Keyboard Microsoft Sidewinder X4
Software Windows 10 Pro
NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, based on its next-generation "Volta" GPU architecture. Built around the 12 nm "GV100" silicon, the GPU is a multi-chip module with a silicon substrate and four HBM2 memory stacks. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized units which accelerate the matrix math behind neural-net training), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface with 900 GB/s of memory bandwidth. The 815 mm² GPU has a gargantuan transistor count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available a little later this year. HPE will develop three HPC rigs with the cards pre-installed.



 
Joined
Feb 2, 2013
Messages
19 (0.01/day)
Noob question: how would you calculate peak theoretical TFLOPs based on the listed specs?

Is it 5120 shaders x 2 x 1.37 GHz clock = ~14 TFLOPs? Or am I missing something?
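
For anyone who wants to check the arithmetic, here is a minimal sketch of that calculation. It assumes the usual convention that each CUDA core retires one fused multiply-add per clock, counted as 2 FLOPs, and it uses the 5,120 cores and roughly 1370 MHz from the announcement. The function name `peak_fp32_tflops` is made up for illustration and is not part of any NVIDIA tool.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Peak theoretical FP32 throughput in TFLOPS.

    Assumption: one fused multiply-add per core per clock, counted as 2 FLOPs.
    cores * FLOPs/clock * GHz gives GFLOPS, so divide by 1000 for TFLOPS.
    """
    return cuda_cores * flops_per_core_per_clock * clock_ghz / 1000.0


# Figures from the announcement: 5,120 CUDA cores at around 1370 MHz.
v100_pcie = peak_fp32_tflops(5120, 1.37)
# Same core count with the clock rounded down to 1.3 GHz.
v100_rounded = peak_fp32_tflops(5120, 1.30)

print(f"V100 PCIe at 1.37 GHz: {v100_pcie:.2f} TFLOPS")
print(f"V100 PCIe at 1.30 GHz: {v100_rounded:.2f} TFLOPS")
```

Rounding the clock down to 1.3 GHz lands closer to 13.3 than 14, so the 14 TFLOPs figure only works out with the full ~1.37 GHz clock.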
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,348 (0.85/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-9900K @ 5.0GHz
Motherboard Gigabyte Z390 AORUS Ultra
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 16GB DDR4-3200
Video Card(s) Zotac GeForce RTX 2080 Ti Triple Fan @ 2040MHz
Storage Samsung 970 PRO 512GB + Crucial MX300 512GB + OCZ Vertex 4 256GB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G933 Headset
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G900
Keyboard Logitech G910
Software Windows 10 Pro
Benchmark Scores FFXV: 19329
Joined
May 29, 2017
Messages
487 (0.54/day)
980 Ti to 1080 was a 50% TFLOPs increase for 30% extra performance. Now we really only have a 12 to 15 TFLOPs increase from the P6000 to the Tesla V100 with SXM2, so the V6000 would probably have 15.5-ish. That's an increase of just 30% in TFLOPs. If the GV104 die also gets more CUDA cores, it will have 30% more TFLOPs than a 1080. That basically puts it at 1080 Ti computing power, and even with optimizations it's likely only going to be 10% faster or so, maybe even less. Best-case scenario is going from 9.3 to 14 TFLOPs for the PCIe variants of the Tesla cards. Even though the P100 hasn't got the full 3840 CUDA cores, that still only gives GV104 13.5 TFLOPs, and we have seen diminishing returns from TFLOPs to framerate, so essentially that's still only 25% increased framerates over the 1080 Ti. Really, though, you should compare 3840 Pascal cores to 5120 Volta cores, which drops the improvement in framerate down to 20% if you compare a top-of-the-line Tesla Pascal that doesn't exist with the Volta Tesla cards.

10-20% increased framerates then. Hmmm....
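
A quick sketch of the percentage arithmetic behind those figures, using only the TFLOPs numbers quoted in this post (12, 15, 15.5, 9.3 and 14). The helper name `percent_increase` is hypothetical, and none of this says anything about real framerates; it only turns the quoted TFLOPs pairs into percentages.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# TFLOPs pairs as quoted in the post above (not measured values).
p6000_to_v100_sxm2 = percent_increase(12.0, 15.0)
p6000_to_v6000_guess = percent_increase(12.0, 15.5)
tesla_pcie_p100_to_v100 = percent_increase(9.3, 14.0)

print(f"P6000 -> V100 SXM2:       {p6000_to_v100_sxm2:.1f}%")
print(f"P6000 -> guessed V6000:   {p6000_to_v6000_guess:.1f}%")
print(f"Tesla PCIe P100 -> V100:  {tesla_pcie_p100_to_v100:.1f}%")
```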
 
Joined
Sep 15, 2007
Messages
3,697 (0.83/day)
Location
Police/Nanny State of America
System Name More hardware than I use :|
Processor 4.7 8350 - 4.2 4560K - 4.4 4690K
Motherboard Sabertooth R2.0 - Gigabyte Z87X-UD4H-CF - AsRock Z97M KIller
Cooling Mugen 2 rev B push/pull - Hyper 212+ push/pull - Hyper 212+
Memory 16GB Gskill - 8GB Gskill - 16GB Ballistix 1.35v
Video Card(s) Xfire OCed 7950s - Powercolor 290x - Oced Zotac 980Ti AMP! (also have two 7870s)
Storage Crucial 250GB SSD, Kingston 3K 120GB, Sammy 1TB, various WDs, 13TB (actual capactity) NAS with WDs
Display(s) X-star 27" 1440 - Auria 27" 1440 - BenQ 24" 1080 - Acer 23" 1080
Case Lian Li open bench - Fractal Design ARC - Thermaltake Cube (still have HAF 932 and more ARCs)
Audio Device(s) Titanium HD - Onkyo HT-RC360 Receiver - BIC America custom 5.1 set up (and extra Klipsch sub)
Power Supply Corsair 850W V2 - EVGA 1000 G2 - Seasonic 500 and 600W units (dead 750W needs RMA lol)
Mouse Logitech G5 - Sentey Revolution Pro - Sentey Lumenata Pro - multiple wireless logitechs
Keyboard Logitech G11s - Thermaltake Challenger
Software I wish I could kill myself instead of using windows (OSX can suck it too).
I heard retail availability is august 2018 :p
 
Joined
May 29, 2017
Messages
487 (0.54/day)
I heard retail availability is august 2018 :p
Maybe! :)

Apparently geforce volta is going to use gddr5(x) and/or gddr6 though, so maybe they'll arrive in march 2018. Still, according to my research looking at tflop to framerate increases, 1080 ti to volta xx80 is likely to only be 10-20% higher framerates, rather than 980 ti to 1080 20-30% increased framerates. Not too spectacular then unless nvidia used a LOT of magic.
 
Joined
Dec 16, 2012
Messages
3,450 (1.36/day)
Location
Jyväskylä, Finland
System Name Classified
Processor AMD Ryzen 5 2600
Motherboard Asus TUF B450 Plus Gaming
Cooling Custom loop by Alphacool
Memory G.Skill Value 16GB DDR4-2400
Video Card(s) Asus Radeon R9 290 Direct Cu II OC @ 1100/1325
Storage 2x256GB, 240GB & 480GB SSDs, 500GB & 2TB HDDs
Display(s) 2x 1920x1080 (23" & 22")
Case Corsair Carbide Air 740
Audio Device(s) Sound Blaster Z
Power Supply Seasonic Focus+ Gold 750W
Mouse Logitech G400s
Keyboard Dell keyboard
Software Windows 10 Pro
Benchmark Scores 19545 in Fire Strike with 290 Crossfire OC
That colour reminds me of those Vega cards. :D
 
Joined
Dec 14, 2009
Messages
7,400 (2.04/day)
Location
Glasgow - home of formal profanity
System Name Newer Ho'Ryzen
Processor Ryzen 3700X
Motherboard Asus Crosshair VI Hero
Cooling TR Le Grand Macho
Memory 16Gb G.Skill 3200 RGB
Video Card(s) RTX 2080ti MSI Duke @2Ghz ish
Storage Samsumg 960 Pro m2. 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Lian Li PC-V33WX
Audio Device(s) On Board
Power Supply Seasonic Prime TItanium 850
Software W10
Benchmark Scores Look, it's a Ryzen on air........ What's the point?
Maybe! :)

Apparently geforce volta is going to use gddr5(x) and/or gddr6 though, so maybe they'll arrive in march 2018. Still, according to my research looking at tflop to framerate increases, 1080 ti to volta xx80 is likely to only be 10-20% higher framerates, rather than 980 ti to 1080 20-30% increased framerates. Not too spectacular then unless nvidia used a LOT of magic.
Nvidia has much magic. Your calculations will be rendered poop. :D
 
Joined
Dec 14, 2009
Messages
7,400 (2.04/day)
Location
Glasgow - home of formal profanity
System Name Newer Ho'Ryzen
Processor Ryzen 3700X
Motherboard Asus Crosshair VI Hero
Cooling TR Le Grand Macho
Memory 16Gb G.Skill 3200 RGB
Video Card(s) RTX 2080ti MSI Duke @2Ghz ish
Storage Samsumg 960 Pro m2. 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Lian Li PC-V33WX
Audio Device(s) On Board
Power Supply Seasonic Prime TItanium 850
Software W10
Benchmark Scores Look, it's a Ryzen on air........ What's the point?
Just looking at past releases and with less tflops improvement than usual, it's not looking too good for volta. Vega (2.0) now has a chance of beating it.
They have made changes to the warp schedulers. It looks very much like a parallel system, so that will increase performance, theoretically, in line with how Async helps non-serial tasks.
Don't use hardware count to guess performance; on that metric, AMD should have been destroying Nvidia for years.
Nvidia has a very refined and streamlined architecture, and it reaps the rewards.
 
Joined
May 29, 2017
Messages
487 (0.54/day)
They have made changes to the warp schedulers. It looks very much like a parallel system, so that will increase performance, theoretically, in line with how Async helps non-serial tasks.
Don't use hardware count to guess performance; on that metric, AMD should have been destroying Nvidia for years.
Nvidia has a very refined and streamlined architecture, and it reaps the rewards.
Unless all that new stuff really makes a huge difference, it's likely we'll see similar TFLOPs/framerate ratios. New Nvidia stuff basically only seems to make it run well now and not degrade massively over time, whereas AMD seems to degrade less but is not as strong in the now. Honestly I prefer AMD's methods, since an RX 480/580 will now beat a 1060 on average, not just because of drivers but also because more and more games use DX12, Vulkan, etc., so AMD stuff lasts relatively long in the same price category. DiRT 4, btw, really seems to favour AMD, since an RX 580 is now nearly as good as a 1070 in that game. FineWine technology for the win!
 
Joined
Mar 10, 2014
Messages
1,668 (0.80/day)
980 Ti to 1080 was a 50% TFLOPs increase for 30% extra performance. Now we really only have a 12 to 15 TFLOPs increase from the P6000 to the Tesla V100 with SXM2, so the V6000 would probably have 15.5-ish. That's an increase of just 30% in TFLOPs. If the GV104 die also gets more CUDA cores, it will have 30% more TFLOPs than a 1080. That basically puts it at 1080 Ti computing power, and even with optimizations it's likely only going to be 10% faster or so, maybe even less. Best-case scenario is going from 9.3 to 14 TFLOPs for the PCIe variants of the Tesla cards. Even though the P100 hasn't got the full 3840 CUDA cores, that still only gives GV104 13.5 TFLOPs, and we have seen diminishing returns from TFLOPs to framerate, so essentially that's still only 25% increased framerates over the 1080 Ti. Really, though, you should compare 3840 Pascal cores to 5120 Volta cores, which drops the improvement in framerate down to 20% if you compare a top-of-the-line Tesla Pascal that doesn't exist with the Volta Tesla cards.

10-20% increased framerates then. Hmmm....
The P6000 is GP102, which has a considerably smaller die than GV100 and can thus manage much higher clocks within its power envelope. You should really compare GV100 to GP100 products, like the Tesla P100s (9.8 - 10.6 TFLOPs) or the Quadro GP100 (10.3 TFLOPs).
 
Joined
May 29, 2017
Messages
487 (0.54/day)
The P6000 is GP102, which has a considerably smaller die than GV100 and can thus manage much higher clocks within its power envelope. You should really compare GV100 to GP100 products, like the Tesla P100s (9.8 - 10.6 TFLOPs) or the Quadro GP100 (10.3 TFLOPs).
Did that too.
 
Joined
Mar 10, 2014
Messages
1,668 (0.80/day)
Ahh, my bad. Note to myself: do not skim posts from forums...

Well, if we assume that the V6000 is a full GV102 like the P6000 is a full GP102: the "marketed"* figure for the P6000 is 12 TFLOPs, so its clock speed is 12000 GFLOPs/(60*64*2) =~ 1.56 GHz. If the V6000 is a full GV102, then at that clock speed the "marketed"* figure would be 2*84*64*1.56 GHz =~ 16,800 GFLOPs, i.e. 16.8 TFLOPs. And consider this: GV100 is a huge, fat die (815 mm²), and it still keeps almost the same clocks in the same power envelope as GP100 with its smaller 610 mm² die. We just don't have enough information about the rest of the Volta family to know how much higher clocks can go when you can give them more power.

*Nvidia's marketed TFLOPs are calculated from the stated boost clock, which is actually lower than what the card runs at under normal 3D loads like gaming. I.e., 1.56 GHz is a very low frequency for the Pascal architecture.
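
The back-of-the-envelope estimate above can be sketched in a few lines. Everything here follows the assumptions in this post: 64 CUDA cores per SM, 2 FLOPs per core per clock, a 60-SM P6000 marketed at 12 TFLOPs, and a hypothetical 84-SM "V6000" running at the same clock. The V6000 and GV102 are speculation in this thread, not announced products, and the function names are made up for illustration.

```python
CORES_PER_SM = 64              # assumption from the post: 64 CUDA cores per SM
FLOPS_PER_CORE_PER_CLOCK = 2   # one fused multiply-add counted as 2 FLOPs


def implied_clock_ghz(marketed_tflops: float, sm_count: int) -> float:
    """Clock speed (GHz) implied by a marketed TFLOPS figure."""
    gflops = marketed_tflops * 1000.0
    return gflops / (sm_count * CORES_PER_SM * FLOPS_PER_CORE_PER_CLOCK)


def projected_tflops(sm_count: int, clock_ghz: float) -> float:
    """Peak TFLOPS for a chip with sm_count SMs at clock_ghz."""
    return sm_count * CORES_PER_SM * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


# P6000: 60 SMs (3840 cores), marketed at 12 TFLOPs.
p6000_clock = implied_clock_ghz(12.0, 60)

# Hypothetical V6000: 84 SMs (5376 cores) at the same clock as the P6000.
v6000_tflops = projected_tflops(84, p6000_clock)

print(f"Implied P6000 clock:        {p6000_clock:.4f} GHz")
print(f"Hypothetical V6000 at that: {v6000_tflops:.1f} TFLOPS")
```

Since the clock is held constant, the result is just the SM ratio (84/60 = 1.4) applied to the 12 TFLOPs starting point.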
 