NVIDIA GA100 Scalar Processor Specs Sheet Released

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,404 (0.82/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-9900K @ 5.0GHz
Motherboard Gigabyte Z390 AORUS Ultra
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 32GB DDR4-3200
Video Card(s) Zotac GeForce RTX 2080 Ti Triple Fan @ 2040MHz
Storage Samsung 970 PRO 512GB + Crucial MX300 512GB + OCZ Vertex 4 256GB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G935 Headset
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G900
Keyboard Logitech G915
Software Windows 10 Pro
Benchmark Scores FFXV: 19329
Joined
Nov 24, 2017
Messages
727 (0.79/day)
Location
Asia
Processor Intel Core i5 4590
Motherboard Gigabyte Z97x Gaming 3
Cooling Intel Stock Cooler
Memory 8GB(2x4GB) DDR3-800MHz [1600MT/s]
Video Card(s) XFX RX 560 4GB
Storage Transcend SSD370S 128GB; Toshiba DT01ACA100 1TB
Display(s) Samsung S20D300 20" 768p TN
Case Delux DLC-MV888
Audio Device(s) Realtek ALC1150
Power Supply Corsair VS450
Mouse A4Tech N-70FX
Software Windows 10 Pro
Benchmark Scores BaseMark GPU : 250 Point
400W??? Isn't Nvidia supposed to be efficient??
 
Joined
Mar 10, 2010
Messages
7,611 (2.04/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R7 3800X@4.350/525/ Intel 8750H
Motherboard Crosshair hero7 @bios 2703/?
Cooling 360EK extreme rad+ 360$EK slim all push, cpu Monoblock Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in two sticks./16Gb
Video Card(s) Sapphire refference Rx vega 64 EK waterblocked/Rtx 2060
Storage Samsung Nvme Pg981, silicon power 1Tb samsung 840 basic as a primocache drive for, WD2Tbgrn +3Tbgrn,
Display(s) Samsung UAE28"850R 4k freesync, LG 49" 4K 60hz ,Oculus
Case Lianli p0-11 dynamic
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Iksu force fx
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
There is no difference. A100 is just the Tesla name; it uses a GA100.
I don't know about no difference: one's cut down and the price will vary, yet they carry the same name. Weird.
 
Joined
Dec 22, 2011
Messages
3,131 (1.01/day)
System Name Zimmer Frame Rates
Processor Intel i7 920 @ Stock speeds baby
Motherboard EVGA X58 3X SLI
Cooling True 120
Memory Corsair Vengeance 12GB
Video Card(s) Palit GTX 980 Ti Super JetStream
Storage Of course
Display(s) Crossover 27Q 27" 2560x1440
Case Antec 1200
Audio Device(s) Don't be silly
Power Supply XFX 650W Core
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 10
Benchmark Scores Epic

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,404 (0.82/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-9900K @ 5.0GHz
Motherboard Gigabyte Z390 AORUS Ultra
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 32GB DDR4-3200
Video Card(s) Zotac GeForce RTX 2080 Ti Triple Fan @ 2040MHz
Storage Samsung 970 PRO 512GB + Crucial MX300 512GB + OCZ Vertex 4 256GB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G935 Headset
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G900
Keyboard Logitech G915
Software Windows 10 Pro
Benchmark Scores FFXV: 19329
I don't know about no difference: one's cut down and the price will vary, yet they carry the same name. Weird.
The different one will be GA102: no HBM, but just as many CUDA cores, more or less.

But there won't be two different 100s; technically, that is what the 102 is.
 
Joined
Nov 24, 2017
Messages
727 (0.79/day)
Location
Asia
Processor Intel Core i5 4590
Motherboard Gigabyte Z97x Gaming 3
Cooling Intel Stock Cooler
Memory 8GB(2x4GB) DDR3-800MHz [1600MT/s]
Video Card(s) XFX RX 560 4GB
Storage Transcend SSD370S 128GB; Toshiba DT01ACA100 1TB
Display(s) Samsung S20D300 20" 768p TN
Case Delux DLC-MV888
Audio Device(s) Realtek ALC1150
Power Supply Corsair VS450
Mouse A4Tech N-70FX
Software Windows 10 Pro
Benchmark Scores BaseMark GPU : 250 Point
Compared to what exactly?
Compared to AMD. Nvidia's 12nm GPUs have the same efficiency as AMD's 7nm GPUs; as a result, Nvidia's 7nm GPUs should be more efficient.
 
Joined
Dec 22, 2011
Messages
3,131 (1.01/day)
System Name Zimmer Frame Rates
Processor Intel i7 920 @ Stock speeds baby
Motherboard EVGA X58 3X SLI
Cooling True 120
Memory Corsair Vengeance 12GB
Video Card(s) Palit GTX 980 Ti Super JetStream
Storage Of course
Display(s) Crossover 27Q 27" 2560x1440
Case Antec 1200
Audio Device(s) Don't be silly
Power Supply XFX 650W Core
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 10
Benchmark Scores Epic
Compared to AMD. Nvidia's 12nm GPUs have the same efficiency as AMD's 7nm GPUs; as a result, Nvidia's 7nm GPUs should be more efficient.
So you're comparing a 10.3 billion transistor Navi to a 54 billion transistor, 40GB HBM2 HPC AI compute monster.

Got ya.
 
Last edited:
Joined
Oct 28, 2012
Messages
432 (0.16/day)
Processor AMD Ryzen 3700x
Motherboard asus ROG Strix B-350I Gaming
Cooling cooler master masterliquid 240
Memory Gskill Aegis 2x 8GB DDR4
Video Card(s) zotac amp GTX 1660 super
Storage WD sn550 1To/WD ssd sata 1To /Samsung 960 evo 256 Gb/Seagate 2To/WD book 4 To back-up
Display(s) LG 25UM58
Case Ncase M1
Audio Device(s) sennheiser HD58X
Power Supply bequiet SFX L power 500w
Mouse Logitech M590
Keyboard Master Key Mx
Software win 10 pro
400W??? Isn't Nvidia supposed to be efficient??
For what it's supposed to be, the perf/watt ratio is actually great. A single DGX A100 rack can replace several old racks.
From this:
1589478488398.png


To this:

1589478529647.png
 
Joined
Dec 22, 2011
Messages
3,131 (1.01/day)
System Name Zimmer Frame Rates
Processor Intel i7 920 @ Stock speeds baby
Motherboard EVGA X58 3X SLI
Cooling True 120
Memory Corsair Vengeance 12GB
Video Card(s) Palit GTX 980 Ti Super JetStream
Storage Of course
Display(s) Crossover 27Q 27" 2560x1440
Case Antec 1200
Audio Device(s) Don't be silly
Power Supply XFX 650W Core
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 10
Benchmark Scores Epic
A picture paints a thousand words, thank you.
 
Joined
Jan 8, 2017
Messages
5,033 (4.05/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
But there won't be two different 100s; technically, that is what the 102 is.
But this one has an entire GPC disabled, due to horrendous yields I presume, and probably because enabling it would throw even that eye-watering 400W TDP out the window. There has to be one fully enabled chip, right? One would assume there would be different 100s.

To be honest, this is borderline Thermi 2.0: a great compute architecture that can barely be implemented in actual silicon due to power and yields. These aren't exactly Nvidia's brightest hours in terms of chip design. It seems like they bit off more than they could chew; the chip was probably cut down in a last-minute decision.

Suffice to say, I doubt we'll see the full 8192 shaders in any GPU this generation. I doubt they could realistically fit that in a 250W power envelope, and it seems like GA100 runs at 1.4 GHz, no change from Volta and probably none from Turing. Let's see: 35% more shaders than Volta, but 60% more power and the same clocks. It's not shaping up to be the "50% more efficient and 50% faster per SM" some hoped for.
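
The arithmetic in that last paragraph can be checked with a quick sketch. It uses only the ratios quoted in the post (35% more shaders, 60% more power, equal clocks) and treats shader count times clock as a crude proxy for FP32 throughput. That proxy is an assumption that ignores any per-SM architectural change, and the function name is hypothetical.

```python
def relative_perf_per_watt(shader_ratio, clock_ratio, power_ratio):
    """Crude FP32-per-watt ratio: (shaders x clock) as a throughput proxy, divided by power."""
    return (shader_ratio * clock_ratio) / power_ratio


# Ratios quoted in the post, GA100 relative to V100.
ga100_vs_v100 = relative_perf_per_watt(
    shader_ratio=1.35,  # 35% more shaders
    clock_ratio=1.00,   # same clocks
    power_ratio=1.60,   # 60% more power
)

print(f"Proxy FP32 per watt, GA100 vs V100: {ga100_vs_v100:.2f}x")
```

Under this proxy the result lands below 1.0, which is the poster's point; it says nothing about tensor or mixed-precision workloads.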
 
Last edited:
Joined
Dec 22, 2011
Messages
3,131 (1.01/day)
System Name Zimmer Frame Rates
Processor Intel i7 920 @ Stock speeds baby
Motherboard EVGA X58 3X SLI
Cooling True 120
Memory Corsair Vengeance 12GB
Video Card(s) Palit GTX 980 Ti Super JetStream
Storage Of course
Display(s) Crossover 27Q 27" 2560x1440
Case Antec 1200
Audio Device(s) Don't be silly
Power Supply XFX 650W Core
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 10
Benchmark Scores Epic
Well, for starters they can scrap the FP64 hardware, which delivers about what the 5700 XT offers in FP32, so that is a bonus. With the TU102 at 18.6 billion transistors, I'd suggest they have wiggle room. Just a thought.
 
Joined
Oct 4, 2017
Messages
421 (0.43/day)
Location
France
System Name White Rose ( https://imgur.com/gallery/l7Lg4Wj )
Processor RYZEN 7 2700
Motherboard ROG STRIX B450-i
Cooling NOCTUA NH-L12S
Memory Patriot Viper Steel DDR4 4000Mhz 16Go PVS416G400C9K
Video Card(s) ASUS STRIX 1080Ti OC
Storage XPG SX8200 Pro 512 go NVMe + SAMSUNG 850 EVO 500GB
Display(s) SAMSUNG U28D590D 4K 28''
Case Nouvolo Steck
Power Supply CORSAIR SF600
Mouse Logitech G203 Prodigy
Keyboard Ajazz ak33
Software Windows 10 1909
Some of what you're saying is wrong; it takes up quite a lot of die space relatively, hence Nvidia's large die sizes, which are added to by the requirements of extra cache resources and hardware needed to keep the special units busy.
I'm afraid you are wrong. The myth that larger die sizes are correlated with fixed-function hardware has already been debunked. I'm trying to find the source (it might be TPU, AnandTech, or YouTube), but it might take time, so I will link it here ASAP.

There is no real correlation between die size increase and fixed-function hardware, as the latter takes up relatively little die space. More likely than not, the larger die size in Turing is explained by the fact that it has more SMs.

This is further backed up by GA100, which has a larger die than GV100 (826 mm^2 vs 815 mm^2) but a significantly lower Tensor Core count (432 vs 640). So it is pretty obvious that fixed-function hardware is not responsible for the die size expansion!

The other reason being because they can, and to make more money. It's not rocket science, just business; people should have chosen with their wallets.
This was exactly my point: the only tangible argument that justifies higher prices for Turing (other than the increased silicon size) is that the lack of competition allows them to do so.
 

M2B

Joined
Jun 2, 2017
Messages
212 (0.19/day)
Location
Iran
Processor Intel Core i5-8600K @4.9GHz
Motherboard MSI Z370 Gaming Pro Carbon
Cooling Cooler Master MasterLiquid ML240L RGB
Memory XPG 8GBx2 - 3200MHz CL16
Video Card(s) Asus Strix GTX 1080 OC Edition 8G 11Gbps
Storage 2x Samsung 850 EVO 1TB
Display(s) BenQ PD3200U
Case Thermaltake View 71 Tempered Glass RGB Edition
Power Supply EVGA 650 P2
These aren't exactly Nvidia's brightest hours in terms of chip design
These are exactly Nvidia's brightest hours in terms of chip design.
The A100 packs 54 billion transistors, 2.5 times as many as a V100, and those transistors aren't there for nothing.
You can't just compare SM counts and base stupid assumptions upon that. The A100 is clearly a much more efficient solution for what it's been designed for.
 
Joined
Mar 10, 2010
Messages
7,611 (2.04/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R7 3800X@4.350/525/ Intel 8750H
Motherboard Crosshair hero7 @bios 2703/?
Cooling 360EK extreme rad+ 360$EK slim all push, cpu Monoblock Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in two sticks./16Gb
Video Card(s) Sapphire refference Rx vega 64 EK waterblocked/Rtx 2060
Storage Samsung Nvme Pg981, silicon power 1Tb samsung 840 basic as a primocache drive for, WD2Tbgrn +3Tbgrn,
Display(s) Samsung UAE28"850R 4k freesync, LG 49" 4K 60hz ,Oculus
Case Lianli p0-11 dynamic
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Iksu force fx
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
I'm afraid you are wrong. The myth that larger die sizes are correlated with fixed-function hardware has already been debunked. I'm trying to find the source (it might be TPU, AnandTech, or YouTube), but it might take time, so I will link it here ASAP.

There is no real correlation between die size increase and fixed-function hardware, as the latter takes up relatively little die space. More likely than not, the larger die size in Turing is explained by the fact that it has more SMs.

This is further backed up by GA100, which has a larger die than GV100 (826 mm^2 vs 815 mm^2) but a significantly lower Tensor Core count (432 vs 640). So it is pretty obvious that fixed-function hardware is not responsible for the die size expansion!

This was exactly my point: the only tangible argument that justifies higher prices for Turing (other than the increased silicon size) is that the lack of competition allows them to do so.
We disagree, so be it.
 
Joined
Jan 8, 2017
Messages
5,033 (4.05/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
These are exactly Nvidia's brightest hours in terms of chip design.
Why is almost 20% of the chip disabled then? That's great design, right?

You can't just compare SM counts and base stupid assumptions upon that.
Comparing SM counts and power is a totally legit way of inferring efficiency; how else would you do it? The SMs aren't the same, but that's the point: efficiency wouldn't come just from the node.

those transistors aren't there for nothing.
Guess what buddy, some of them are for nothing, I'd say about 8-9 billion give or take.

Let's face reality, they couldn't enable the entire chip because of power constraints. Making a chip like that isn't desirable, it's painfully obvious they've missed their target by miles.
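
For reference, the "almost 20%" and "8-9 billion" figures can be reproduced roughly as follows. The 8192-shader and 54-billion-transistor numbers are the ones quoted in this thread; the 6912 enabled shaders is an assumption taken from NVIDIA's public A100 spec, not from the thread. The estimate is deliberately naive: it spreads every transistor evenly over the shaders and ignores cache, memory controllers and NVLink.

```python
FULL_SHADERS = 8192           # full GA100 die, as quoted in the thread
ENABLED_SHADERS = 6912        # assumption: A100 figure from NVIDIA's public spec
TOTAL_TRANSISTORS_BN = 54.0   # billions, as quoted in the thread

# Share of the shader array that is fused off on A100.
disabled_fraction = 1 - ENABLED_SHADERS / FULL_SHADERS

# Naive estimate: pretend all transistors are distributed evenly across shaders.
idle_transistors_bn = TOTAL_TRANSISTORS_BN * disabled_fraction

print(f"Disabled shaders: {disabled_fraction:.1%}")
print(f"Naively idle transistors: {idle_transistors_bn:.1f} billion")
```

By this estimate the disabled share is closer to 16% than 20%, and since a good part of the die is not shader logic, the idle transistor count is likely an overestimate.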
 
Last edited by a moderator:

M2B

Joined
Jun 2, 2017
Messages
212 (0.19/day)
Location
Iran
Processor Intel Core i5-8600K @4.9GHz
Motherboard MSI Z370 Gaming Pro Carbon
Cooling Cooler Master MasterLiquid ML240L RGB
Memory XPG 8GBx2 - 3200MHz CL16
Video Card(s) Asus Strix GTX 1080 OC Edition 8G 11Gbps
Storage 2x Samsung 850 EVO 1TB
Display(s) BenQ PD3200U
Case Thermaltake View 71 Tempered Glass RGB Edition
Power Supply EVGA 650 P2
Why is almost 20% of the chip disabled then? That's great design, right?

Comparing SM counts and power is a totally legit way of inferring efficiency, how else would you do it, smart ass? The SMs aren't the same, but that's the point, efficiency wouldn't come just from the node.

Guess what buddy, some of them are for nothing, I'd say about 9 billion give or take.
Look at this clueless person acting like he really knows how to design GPUs better than a $200 billion company that has been designing GPUs for ages.
So, based on your logic, the Vega 56 is a more efficient GPU than AMD's latest and greatest 5700 XT, because it has more TFLOPS and many more compute units, and consumes a similar amount of power, right?
Based on the density figures, I think Nvidia is using TSMC's high-density version of its 7nm node, not the high-performance one, and that was not the case with previous generations.
They could have just used the normal high-performance version and scaled up the GV100 chip, but they clearly needed more density for their design goals.
What I'm saying is that you have to see how the chip performs in applications that actually matter and base efficiency figures upon that, not just on some raw numbers.
 
Last edited:
Joined
Jan 8, 2017
Messages
5,033 (4.05/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
Look at this clueless person acting like he really knows how to design GPUs better than a $200 billion company that has been designing GPUs for ages.
So, based on your logic, the Vega 56 is a more efficient GPU than AMD's latest and greatest 5700 XT, because it has more TFLOPS and many more compute units, and consumes a similar amount of power,
:roll:

Because you do know how to design a GPU, right? Sorry, your GPU architect badge must have fallen off.

So, based on your logic, the Vega 56 is a more efficient GPU than AMD's latest and greatest 5700 XT, because it has more TFLOPS and many more compute units, and consumes a similar amount of power,
Nope, that's based on your logic. Your understanding of what I said was obviously severely limited.

First of all, Vega 56 uses more power and runs at lower clocks. A legendary GPU architect like yourself would know that a larger processor at lower clocks runs more efficiently, because power scales roughly linearly with shader count, whereas a change in clocks incurs a change in voltage, and power does not scale linearly with that. In other words, if, let's say, we have a GPU with N/2 shaders at 2 GHz, it will generally consume more power than a GPU with N shaders at 1 GHz.
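
The N/2-at-2-GHz versus N-at-1-GHz comparison can be sketched with the textbook dynamic power relation (power proportional to units x voltage squared x frequency). The linear voltage/frequency curve below is entirely made up for illustration, as are the function names; real chips have their own curves and also leak static power, which this toy model ignores.

```python
def volts_at(freq_ghz):
    """Hypothetical voltage/frequency curve: 0.8 V at 1 GHz, 1.0 V at 2 GHz."""
    return 0.6 + 0.2 * freq_ghz


def dynamic_power(shaders, freq_ghz):
    """Relative dynamic power: proportional to shaders * V^2 * f (arbitrary units)."""
    v = volts_at(freq_ghz)
    return shaders * v * v * freq_ghz


N = 4096  # arbitrary shader count; only the ratio matters

wide_slow = dynamic_power(N, 1.0)         # N shaders at 1 GHz
narrow_fast = dynamic_power(N // 2, 2.0)  # N/2 shaders at 2 GHz

# Both configurations have the same shaders * clock product, i.e. the same raw throughput.
print(f"Narrow/fast uses {narrow_fast / wide_slow:.2f}x the power of wide/slow")
```

With this assumed curve the narrow, fast design burns noticeably more power for the same shader-clock product; a steeper voltage curve would widen the gap further.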

Let's combine that with how Navi works: the RX 5700 XT runs at considerably higher voltages and clocks and has far fewer shaders, and yet it generates a similar amount of FP32 compute with less power. It's obviously way more efficient architecturally, but as I already mentioned, I am sure a world-renowned GPU architect such as yourself knew all that.

On the other hand, Volta and Ampere run at pretty much the same frequency and likely similar voltages, since TSMC's 7nm doesn't seem to change that in any significant manner (in fact, all 7nm CPUs/GPUs up until now seem to run at the same or even higher voltages). GA100 has 35% more shaders than V100 but also consumes 60% more power. It doesn't take much to see that efficiency isn't that great. It's not that hard to infer these things; don't overestimate their complexity.

Yes, I am sure that when you factor in Nvidia's novel floating-point formats it looks great, but if you look just at FP32, it doesn't look great. It's rather mediocre. Do you not find it strange that our boy Jensen never once mentioned FP32 performance?

I never said I knew how to design it better; stop projecting made-up stuff onto me. I said it was obvious they failed to do what they originally set out to do, hence why a considerable portion of the chip is fused off. They've done it in the past too.
 
Last edited:
Joined
Oct 22, 2014
Messages
8,058 (3.93/day)
Location
Sunshine Coast
System Name Black Box
Processor AMD 3200G
Motherboard MSI X470 Gaming Plus
Cooling Stock
Memory Adata 8Gb 2133Mhz DDR4
Storage Kingston A2000 512Gb NVME
Display(s) AOC 22" Freesync 1m.s. 75Hz
Case Corsair 450D High Air Flow.
Audio Device(s) No need.
Power Supply FSP Aurum 650W
Mouse Yes
Keyboard Of course
Software W10 Pro 64 bit
By the way I've just noticed the power :), 400W, that's 150W over V100. Ouch, 7nm hasn't been kind, I was right that this is a power hungry monster.
Plot twist.
Jensen wasn't baking it in his oven, he used them to heat his oven.
 
Joined
Nov 23, 2010
Messages
149 (0.04/day)
I think this is exactly what data center customers want and have been asking for, these will sell like hot cakes to the big cloud operators.
 
Joined
Dec 18, 2015
Messages
93 (0.06/day)
System Name Avell old monster - Workstation T1 - HTPC
Processor i7-3630QM\i7-5960x\Ryzen 3 2200G
Cooling Stock.
Memory 2x4Gb @ 1600Mhz
Video Card(s) HD 7970M \ EVGA GTX 980\ Vega 8
Storage SSD Sandisk Ultra li - 480 GB + 1 TB 5400 RPM WD - 960gb SDD + 2TB HDD
I'm afraid you are wrong. The myth that larger die sizes are correlated with fixed-function hardware has already been debunked. I'm trying to find the source (it might be TPU, AnandTech, or YouTube), but it might take time, so I will link it here ASAP.

There is no real correlation between die size increase and fixed-function hardware, as the latter takes up relatively little die space. More likely than not, the larger die size in Turing is explained by the fact that it has more SMs.

This is further backed up by GA100, which has a larger die than GV100 (826 mm^2 vs 815 mm^2) but a significantly lower Tensor Core count (432 vs 640). So it is pretty obvious that fixed-function hardware is not responsible for the die size expansion!

This was exactly my point: the only tangible argument that justifies higher prices for Turing (other than the increased silicon size) is that the lack of competition allows them to do so.
How do these huge tensor cores not take up space and increase the die size? Maybe this will help to understand the relationship between die size, yields and GPU cost.

https://www.reddit.com/r/nvidia/comments/99r2x3
 
Joined
Mar 26, 2009
Messages
175 (0.04/day)
Very unimpressive FP32 and FP64 performance, I was way off in my estimations. Again, it's a case of optimizing for way too many things. So much silicon is dedicated to non traditional performance metrics that I wonder if it makes sense trying to shove everything in one package.
GA100 is 20X faster than V100 in AI workloads and 2.5X in FP64 workloads; that's a generational leap like no other. This is an AI-optimized chip: it has no RT cores, no encoders and no display connectors. Its focus is mainly on AI training and inference, for which it provides stellar performance that crushes any hope of competition in the near future. And you are comparing regular crap like FP32 and FP64?

Alright, A100 provides 156 TF FP32 compared to only 15 TF in V100. That alone is a 10X increase in FP32 compute power without the need to change any code. They can extend that lead to 20X, or 312 TF of FP32, through sparse network optimizations, again without code changes.

In FP16 the increase is also 2.5X in non-optimized code and 6X in optimized code, and the same goes for the INT8 and INT4 numbers, so A100 is really several orders of magnitude faster than V100 in any AI workload.
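
The 10X and 20X claims follow directly from the three throughput numbers in the post, as the sketch below shows. Two caveats: NVIDIA's own material labels the 156 TF figure as TF32 Tensor Core throughput rather than standard FP32, so it is not a like-for-like comparison with V100's FP32 rate, and a 20X ratio works out to a little over one order of magnitude. The variable names are hypothetical.

```python
import math

V100_FP32_TF = 15.0     # as quoted in the post
A100_DENSE_TF = 156.0   # as quoted; labeled TF32 Tensor Core throughput by NVIDIA
A100_SPARSE_TF = 312.0  # as quoted; with structured sparsity

dense_speedup = A100_DENSE_TF / V100_FP32_TF
sparse_speedup = A100_SPARSE_TF / V100_FP32_TF

# Express the larger ratio in orders of magnitude (powers of ten).
orders_of_magnitude = math.log10(sparse_speedup)

print(f"Dense:  {dense_speedup:.1f}x")
print(f"Sparse: {sparse_speedup:.1f}x ({orders_of_magnitude:.2f} orders of magnitude)")
```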

1589531819074.png


Also, the 400W of power consumption is nothing relative to the size of this monster: you have 40GB of HBM2, loads of NVLink connections, and loads of tensor cores that take up die area, heat and power. The chip is also cut down (which means lost power consumption). Also, the trend in data centers and AI is to open power consumption up to allow for more comfortable performance; V100 reached 350W in its second iteration and 450W in its third iteration.

You seem to lack any ounce of data center experience, so I suggest you just stick to the analysis of consumer GPUs. This isn't your area.
 
Last edited:
Joined
Jan 8, 2017
Messages
5,033 (4.05/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
You seem to lack any ounce of data center experience, so I suggest you just stick to the analysis of consumer GPUs.
I'll stick with whatever the hell I want, thanks. You copy-pasting boilerplate from Nvidia's website can be considered anything but an "analysis". What are you, a salesman? You're barking up the wrong tree, buddy.

GA100 is 20X faster than V100 in AI workloads and 2.5X in FP64 workloads
It turns out I overestimated your ability to copy-paste information; you can't even do that:

d.png


9.7 / 7.8 = 1.24X (FP64)

Or maybe Jensen did a good job deceiving the less tech literate with their fine print by mixing together FP64 with FP64 TF.

Nice paint skills by the way.
 
Last edited:
Joined
Oct 15, 2010
Messages
170 (0.05/day)
Figuring out how they get 40 GB from 6 HBM stacks is a little confusing.
They cheaped out: instead of offering 6 modules of 8 GB, for a total of 48 GB, they went for higher margins. They will offer an improved version with the full 48 GB of memory and 25 MHz more on core and memory for 5000 dollars more. Don't worry about it.

It's so typical of Nvidia.
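
On the 40 GB question itself, the capacity arithmetic is short. The six stack sites and 8 GB per stack come from this thread; the count of five active stacks is an assumption based on NVIDIA's public A100 spec (a 5120-bit memory bus at 1024 bits per stack), not something stated here.

```python
STACK_SITES = 6     # HBM2 stacks on the package, as discussed in the thread
ACTIVE_STACKS = 5   # assumption: five enabled on A100, per the public spec
GB_PER_STACK = 8    # as quoted in the thread
BUS_BITS_PER_STACK = 1024  # standard HBM2 interface width per stack

full_capacity_gb = STACK_SITES * GB_PER_STACK
shipping_capacity_gb = ACTIVE_STACKS * GB_PER_STACK
bus_width_bits = ACTIVE_STACKS * BUS_BITS_PER_STACK

print(f"All stacks enabled: {full_capacity_gb} GB")
print(f"A100 as shipped:    {shipping_capacity_gb} GB on a {bus_width_bits}-bit bus")
```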

 
Joined
Mar 26, 2009
Messages
175 (0.04/day)
9.7 / 7.8 = 1.24X (FP64)

Or maybe Jensen did a good job deceiving the less tech literate with their fine print.
Hey genius, I already provided you with a chart explaining all the metrics, good to know you can't read.

FP64 from Tensor cores is 19.5TF. Which is a 2.5X increase over V100. FP64 from CUDA cores is 9.7TF. If you can use both at the same time you will get about 30TF of FP64 for AI actually.

You, copy pasting boiler plate from Nvidia's website can be considered anything but an "analysis"
It's much more meaningful than the ignorant job you did, analysing regular FP32/FP64 in an AI GPU. Talk about an extreme case of stuff that is way over your head.
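
Both sides of this exchange are quoting real ratios; they are just dividing different A100 numbers by the same V100 figure. A minimal sketch using only the values posted in the thread (variable names are hypothetical):

```python
V100_FP64_TF = 7.8          # V100 FP64, as quoted in the thread
A100_FP64_CUDA_TF = 9.7     # A100 FP64 on the CUDA cores
A100_FP64_TENSOR_TF = 19.5  # A100 FP64 via Tensor Cores

cuda_ratio = A100_FP64_CUDA_TF / V100_FP64_TF
tensor_ratio = A100_FP64_TENSOR_TF / V100_FP64_TF

# The "about 30 TF" figure assumes both paths can run at full rate at once.
combined_tf = A100_FP64_CUDA_TF + A100_FP64_TENSOR_TF

print(f"CUDA-core FP64:   {cuda_ratio:.2f}x")
print(f"Tensor-core FP64: {tensor_ratio:.2f}x")
print(f"Combined peak:    {combined_tf:.1f} TF")
```

Whether the 1.24x or the 2.5x figure applies depends on whether a given FP64 workload can be expressed as the matrix operations the Tensor Cores accelerate.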
 
Joined
Jan 8, 2017
Messages
5,033 (4.05/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
Hey genius, I already provided you with a chart explaining all the metrics, good to know you can't read.

FP64 from Tensor cores is 19.5TF. Which is a 2.5X increase over V100. FP64 from CUDA cores is 9.7TF. If you can use both at the same time you will get about 30TF of FP64 for AI actually.
You're so cute when you try to explain your utter lack of understanding about these metrics.

You wrote "FP64 workloads", you genius. That's pure FP64 not tensor ops, you're clueless and stubborn.
 