
NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
46,362 (7.68/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, based on its next-generation "Volta" GPU architecture. Built on the 12 nm "GV100" silicon, the GPU is a multi-chip module with a silicon substrate and four HBM2 memory stacks. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized units which accelerate the matrix math used in neural-net training), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface with 900 GB/s of memory bandwidth. The 815 mm² GPU has a gargantuan transistor count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available a little later this year. HPE will develop three HPC rigs with the cards pre-installed.



 
Joined
Feb 2, 2013
Messages
19 (0.00/day)
Noob question: how would you calculate peak theoretical TFLOPs based on the listed specs?

Is it 5120 shaders x 2 x 1.3 GHz clock = 14 TFLOPs? Or am I missing something?
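
For reference, the usual back-of-envelope formula is cores x 2 FLOPs per clock (one fused multiply-add counts as two operations) x clock speed. A minimal Python sketch, assuming the 5,120 CUDA cores and ~1370 MHz from the news post; the function name is made up for illustration, and real sustained throughput will be lower than this theoretical peak:

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPs.

    Assumes every core retires one fused multiply-add (2 FLOPs) per clock.
    """
    return cuda_cores * flops_per_clock * clock_ghz / 1000.0


# Clock from the news post (around 1370 MHz)
print(round(peak_fp32_tflops(5120, 1.37), 2))  # 14.03

# Rounded-down clock used in the question above
print(round(peak_fp32_tflops(5120, 1.30), 2))  # 13.31
```

So the 14 TFLOPs figure corresponds to the ~1.37 GHz clock; rounding the clock down to 1.3 GHz gives about 13.3 instead.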
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,562 (0.59/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-12900K @ Default
Motherboard Gigabyte Z690 AORUS Elite AX
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 32GB DDR4-3200
Video Card(s) EVGA GeForce RTX 3080 FTW3 ULTRA @ Default
Storage Samsung 970 PRO 512GB + Crucial MX500 2TB x3 + Crucial MX500 4TB + Samsung 980 PRO 1TB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G935 Headset
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G903 Lightspeed
Keyboard Logitech G915
Software Windows 11 Pro
Benchmark Scores FFXV: 19329

Deleted member 172152

Guest
980 Ti to 1080 was a 50% TFLOPs increase for 30% extra performance. Now we really only have an increase from 12 to 15 TFLOPs going from the P6000 to the Tesla V100 with SXM2, so the V6000 would probably have 15.5-ish. That's an increase of just 30% in TFLOPs. If the GV104 die also gets more CUDA cores, it will have 30% more TFLOPs than a 1080. That basically puts it at 1080 Ti computing power, and even with optimizations it's likely only going to be 10% faster or so, or maybe even less. The best-case scenario is going from 9.3 to 14 TFLOPs for the PCIe variants of the Tesla cards, even though the P100 hasn't got the full 3840 CUDA cores. That still only gives GV104 13.5 TFLOPs, and we have seen diminishing returns from TFLOPs to framerate, so essentially that's still only 25% increased framerates over the 1080 Ti. Really, though, you should compare 3840 Pascal cores to 5120 Volta cores, which drops the framerate improvement to 20% if you compare the top-of-the-line Tesla Pascal that doesn't exist against the Volta Tesla cards.

10-20% increased framerates then. Hmmm....
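
The TFLOPs percentages can be sanity-checked with a few lines of Python. This is only a sketch using the figures quoted in this post (the 15.5 TFLOPs V6000 number is a guess, not a spec); the helper name and labels are made up for illustration:

```python
def pct_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


# (old TFLOPs, new TFLOPs) pairs taken from the post above
pairs = {
    "P6000 -> V100 SXM2": (12.0, 15.0),
    "P6000 -> guessed V6000": (12.0, 15.5),
    "P100 PCIe -> V100 PCIe": (9.3, 14.0),
}

for label, (old, new) in pairs.items():
    print(f"{label}: +{pct_increase(old, new):.0f}%")
# P6000 -> V100 SXM2: +25%
# P6000 -> guessed V6000: +29%
# P100 PCIe -> V100 PCIe: +51%
```

These are raw TFLOPs deltas only; how much of that turns into framerate is the speculative part.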
 
Joined
Sep 15, 2007
Messages
3,944 (0.65/day)
Location
Police/Nanny State of America
Processor OCed 5800X3D
Motherboard Asucks C6H
Cooling Air
Memory 32GB
Video Card(s) OCed 6800XT
Storage NVMees
Display(s) 32" Dull curved 1440
Case Freebie glass idk
Audio Device(s) Sennheiser
Power Supply Don't even remember
I heard retail availability is august 2018 :p
 

Deleted member 172152

Guest
I heard retail availability is august 2018 :p
Maybe! :)

Apparently geforce volta is going to use gddr5(x) and/or gddr6 though, so maybe they'll arrive in march 2018. Still, according to my research looking at tflop to framerate increases, 1080 ti to volta xx80 is likely to only be 10-20% higher framerates, rather than 980 ti to 1080 20-30% increased framerates. Not too spectacular then unless nvidia used a LOT of magic.
 

Keullo-e

S.T.A.R.S.
Joined
Dec 16, 2012
Messages
11,011 (2.66/day)
Location
Finland
System Name 4K-gaming
Processor AMD Ryzen 7 5800X up to 5.05GHz
Motherboard Gigabyte B550M Aorus Elite
Cooling Custom loop (CPU+GPU, 240 & 120 rads)
Memory 32GB Kingston HyperX Fury @ DDR4-3466
Video Card(s) PowerColor RX 6700 XT Fighter OC/UV
Storage ~4TB SSD + 6TB HDD
Display(s) Acer 27" 4K120 IPS + Lenovo 32" 4K60 IPS
Case Corsair 4000D Airflow White
Audio Device(s) Asus TUF H3 Wireless
Power Supply EVGA Supernova G2 750W
Mouse Logitech MX518
Keyboard Roccat Vulcan 121 AIMO
VR HMD Oculus Rift CV1
Software Windows 11 Pro
Benchmark Scores It runs Crysis remastered at 4K
That colour reminds me of those Vega cards. :D
 

the54thvoid

Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
12,450 (2.38/day)
Location
Glasgow - home of formal profanity
Processor Ryzen 7800X3D
Motherboard MSI MAG Mortar B650 (wifi)
Cooling be quiet! Dark Rock Pro 4
Memory 32GB Kingston Fury
Video Card(s) Gainward RTX4070ti
Storage Seagate FireCuda 530 M.2 1TB / Samsumg 960 Pro M.2 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Asus Prime AP201
Audio Device(s) On Board
Power Supply be quiet! Pure POwer M12 850w Gold (ATX3.0)
Software W10
Maybe! :)

Apparently geforce volta is going to use gddr5(x) and/or gddr6 though, so maybe they'll arrive in march 2018. Still, according to my research looking at tflop to framerate increases, 1080 ti to volta xx80 is likely to only be 10-20% higher framerates, rather than 980 ti to 1080 20-30% increased framerates. Not too spectacular then unless nvidia used a LOT of magic.

Nvidia has much magic. Your calculations will be rendered poop. :D
 

Deleted member 172152

Guest
Nvidia has much magic. Your calculations will be rendered poop. :D
Just looking at past releases and with less tflops improvement than usual, it's not looking too good for volta. Vega (2.0) now has a chance of beating it.
 

the54thvoid

Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
12,450 (2.38/day)
Location
Glasgow - home of formal profanity
Processor Ryzen 7800X3D
Motherboard MSI MAG Mortar B650 (wifi)
Cooling be quiet! Dark Rock Pro 4
Memory 32GB Kingston Fury
Video Card(s) Gainward RTX4070ti
Storage Seagate FireCuda 530 M.2 1TB / Samsumg 960 Pro M.2 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Asus Prime AP201
Audio Device(s) On Board
Power Supply be quiet! Pure POwer M12 850w Gold (ATX3.0)
Software W10
Just looking at past releases and with less tflops improvement than usual, it's not looking too good for volta. Vega (2.0) now has a chance of beating it.

They have made changes to the warp schedulers; it looks very much like a parallel system, so that should theoretically increase performance, in line with how async helps non-serial tasks.
Don't use hardware counts to guess performance; on that metric, AMD should have been destroying Nvidia for years.
Nvidia has a very refined and streamlined architecture, and it reaps rewards.
 

Deleted member 172152

Guest
They have made changes to the warp schedulers; it looks very much like a parallel system, so that should theoretically increase performance, in line with how async helps non-serial tasks.
Don't use hardware counts to guess performance; on that metric, AMD should have been destroying Nvidia for years.
Nvidia has a very refined and streamlined architecture, and it reaps rewards.
Unless all that new stuff really makes a huge difference, it's likely we'll see similar TFLOPs/framerate ratios. New Nvidia stuff basically only seems to make it not degrade massively over time and run well now, whereas AMD seems to degrade less and not be a huge fan of the now. Honestly, I prefer AMD's methods, since an RX 480/580 will now beat a 1060 on average, not just because of drivers, but also because more and more games use DX12, Vulkan, etc., so AMD stuff lasts relatively long in the same price category. DiRT 4, btw, really seems to favour AMD, since an RX 580 is now nearly as good as a 1070 in that game. FineWine technology for the win!
 
Joined
Mar 10, 2014
Messages
1,793 (0.49/day)
980 Ti to 1080 was a 50% TFLOPs increase for 30% extra performance. Now we really only have an increase from 12 to 15 TFLOPs going from the P6000 to the Tesla V100 with SXM2, so the V6000 would probably have 15.5-ish. That's an increase of just 30% in TFLOPs. If the GV104 die also gets more CUDA cores, it will have 30% more TFLOPs than a 1080. That basically puts it at 1080 Ti computing power, and even with optimizations it's likely only going to be 10% faster or so, or maybe even less. The best-case scenario is going from 9.3 to 14 TFLOPs for the PCIe variants of the Tesla cards, even though the P100 hasn't got the full 3840 CUDA cores. That still only gives GV104 13.5 TFLOPs, and we have seen diminishing returns from TFLOPs to framerate, so essentially that's still only 25% increased framerates over the 1080 Ti. Really, though, you should compare 3840 Pascal cores to 5120 Volta cores, which drops the framerate improvement to 20% if you compare the top-of-the-line Tesla Pascal that doesn't exist against the Volta Tesla cards.

10-20% increased framerates then. Hmmm....

P6000 is GP102, which has a considerably smaller die than GV100 and can thus manage much higher clocks within its power envelope. You should really compare GV100 to GP100 products, like the Tesla P100s (9.8 - 10.6 TFLOPs) or the Quadro GP100 (10.3 TFLOPs).
 

Deleted member 172152

Guest
P6000 is GP102, which has a considerably smaller die than GV100 and can thus manage much higher clocks within its power envelope. You should really compare GV100 to GP100 products, like the Tesla P100s (9.8 - 10.6 TFLOPs) or the Quadro GP100 (10.3 TFLOPs).
Did that too.
 
Joined
Mar 10, 2014
Messages
1,793 (0.49/day)
Ahh, my bad. Note to myself: do not skim posts from forums...

Well, if we assume the V6000 is a full GV102 like the P6000 is a full GP102: the "marketed"* figure for the P6000 is 12 TFLOPs, so its clock speed is 12000/(60*64*2) =~ 1.56 GHz. If the V6000 is a full GV102 at that clock speed, its "marketed"* figure would be 2*84*64*1.56 =~ 16.8 TFLOPs. And consider this: GV100 is a huge, fat die (815 mm²), and it still keeps almost the same clocks in the same power envelope as GP100 with its smaller 610 mm² die. We just don't have enough information about the rest of the Volta family to know how much higher the clocks can go when you can give them more power.

*Nvidia's marketed TFLOPs are calculated from the stated boost clock, which is actually lower than what the card runs at under normal 3D loads like gaming. I.e., 1.56 GHz is a very low frequency for the Pascal architecture.
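
The same arithmetic as a small Python sketch, in case anyone wants to plug in other numbers. The 84-SM "full GV102" is purely my assumption from above, not an announced part, and the function names are made up for illustration:

```python
CORES_PER_SM = 64              # FP32 cores per SM, as used in the math above
FLOPS_PER_CORE_PER_CLOCK = 2   # one fused multiply-add counts as 2 FLOPs


def implied_clock_ghz(marketed_tflops: float, sm_count: int) -> float:
    """Back out the clock that a marketed TFLOPs figure implies."""
    flops_per_clock = sm_count * CORES_PER_SM * FLOPS_PER_CORE_PER_CLOCK
    return marketed_tflops * 1000.0 / flops_per_clock


def marketed_tflops(sm_count: int, clock_ghz: float) -> float:
    """Theoretical TFLOPs for a given SM count and clock."""
    return sm_count * CORES_PER_SM * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


# P6000: 12 TFLOPs marketed, 60 SMs
p6000_clock = implied_clock_ghz(12.0, 60)
print(p6000_clock)  # 1.5625

# Hypothetical V6000: 84 SMs (assumed) at the same clock
print(round(marketed_tflops(84, p6000_clock), 1))  # 16.8
```

Since the clock is held constant, this is just 12 x 84/60; any clock headroom on a smaller Volta die would push the number higher.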
 