Tesla K20 GPU Compute Processor Specifications Released

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
34,335 (9.22/day)
Likes
17,427
Location
Hyderabad, India
System Name Long shelf-life potato
Processor Intel Core i7-4770K
Motherboard ASUS Z97-A
Cooling Xigmatek Aegir CPU Cooler
Memory 16GB Kingston HyperX Beast DDR3-1866
Video Card(s) 2x GeForce GTX 970 SLI
Storage ADATA SU800 512GB
Display(s) Samsung U28D590D 28-inch 4K
Case Cooler Master CM690 Window
Audio Device(s) Creative Sound Blaster Recon3D PCIe
Power Supply Corsair HX850W
Mouse Razer Abyssus 2014
Keyboard Microsoft Sidewinder X4
Software Windows 10 Pro Creators Update
#1
Specifications of NVIDIA's Tesla K20 GPU compute processor, which was launched back in May, have finally been disclosed. We've known since then that the K20 is based on NVIDIA's large GK110 GPU, a chip that has not yet powered a GeForce graphics card. Apparently, NVIDIA is leaving some headroom on the silicon so that it can harvest dies more effectively. According to a specifications sheet compiled by Heise.de, Tesla K20 will feature 13 SMX units, compared to the 15 available on the GK110 silicon.

 
Joined
Sep 15, 2007
Messages
2,738 (0.73/day)
Likes
861
Location
Police/Nanny State of America
System Name More hardware than I use :|
Processor 4.7 8350 - 4.2 4560K - 4.4 4690K
Motherboard Sabertooth R2.0 - Gigabyte Z87X-UD4H-CF - AsRock Z97M KIller
Cooling Mugen 2 rev B push/pull - Hyper 212+ push/pull - Hyper 212+
Memory 16GB Gskill - 8GB Gskill - 16GB Ballistix 1.35v
Video Card(s) Xfire OCed 7950s - Powercolor 290x - Oced Zotac 980Ti AMP! (also have two 7870s)
Storage Crucial 250GB SSD, Kingston 3K 120GB, Sammy 1TB, various WDs, 13TB (actual capactity) NAS with WDs
Display(s) X-star 27" 1440 - Auria 27" 1440 - BenQ 24" 1080 - Acer 23" 1080
Case Lian Li open bench - Fractal Design ARC - Thermaltake Cube (still have HAF 932 and more ARCs)
Audio Device(s) Titanium HD - Onkyo HT-RC360 Receiver - BIC America custom 5.1 set up (and extra Klipsch sub)
Power Supply Corsair 850W V2 - EVGA 1000 G2 - Seasonic 500 and 600W units (dead 750W needs RMA lol)
Mouse Logitech G5 - Sentey Revolution Pro - Sentey Lumenata Pro - multiple wireless logitechs
Keyboard Logitech G11s - Thermaltake Challenger
Software I wish I could kill myself instead of using windows (OSX can suck it too).
#2
So, buy 5870s. Got it :p
 
Joined
Sep 7, 2011
Messages
2,785 (1.21/day)
Likes
1,672
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
#4
Seems like a repeat of GF100/110. Hardly surprising if the die is 500mm^2+

The first Fermi Teslas (M2050/M2070) out of the gate were basically GTX 470 spec. The M2090, released more recently, is pretty much a GTX 580.

Would be interesting to know whether these Teslas are the same SKUs that ORNL is taking delivery of, or whether they are higher spec, since Oak Ridge seemed to be the high-profile launch customer.
in other words it can almost match tahiti
Any comparison probably depends on actual performance efficiency rather than hypothetical. Unless you know what K20 brings to the table, a theoretical comparison is largely useless.

BTW: The original site no longer features any specifications.
 

Solaris17

Creator Solaris Utility DVD
Joined
Aug 16, 2005
Messages
19,273 (4.28/day)
Likes
6,081
Location
Florida
System Name Not named yet
Processor I5 7640x 5Ghz 24/7
Motherboard MSI x299 Tomahawk Arctic
Cooling Corsair H55
Memory 32GB Corsair DDR4 3000mhz
Video Card(s) Gigabyte 1080TI
Storage 2x Seagate 3TB Drives (RAID 0) 1x Seagate 256GB SSD 1x Adata 120GB SSD
Display(s) 3x AOC Q2577PWQ
Case Inwin 303 White (Thermaltake Ring 120mm Purple accent)
Audio Device(s) Onboard on Audio-Technica ATH-AG1
Power Supply Seasonic 1050W Snow
Mouse Roccat Tyon White
Keyboard Ducky Shine 6
Software Windows 10 x64 Pro
#5
those cores.....my god.
 
Joined
Nov 13, 2009
Messages
5,614 (1.90/day)
Likes
1,678
Location
San Diego, CA
System Name White Boy
Processor Core i7 3770k @4.6 Ghz
Motherboard ASUS P8Z77-I Deluxe
Cooling CORSAIR H100
Memory CORSAIR Vengeance 16GB @ 2177
Video Card(s) EVGA GTX 680 CLASSIEFIED @ 1250 Core
Storage 2 Samsung 830 256 GB (Raid 0) 1 Hitachi 4 TB
Display(s) 1 Dell 30U11 30"
Case BIT FENIX Prodigy
Audio Device(s) none
Power Supply SeaSonic X750 Gold 750W Modular
Software Windows Pro 7 64 bit || Ubuntu 64 Bit
Benchmark Scores 2017 Unigine Heaven :: P37239 3D Mark Vantage
#6
do want
 
Joined
Jan 15, 2012
Messages
679 (0.31/day)
Likes
58
Location
Slovenia
System Name PC.
Processor i7 2600K 5.0Gh,i7 3770K 5.00Gh. EK, Liqed Coooleng
Motherboard P67A-UD7-B3 Gigabyte T.,ASUS,P8Z77-V PREMIUM,MAXIMUS V EXTRIME..
Cooling Liqed Cooleng ,EK Suprime LTX Nickel,EK for Motherboard,Aqua computer (WGA), Thermaltake .... 0i,
Memory G.SKILL F3-17600CL7-2GBPISG. 16GBSkill Sniper F3-17000CL94GBSR on 2400Hz 10-12-11-29 1
Video Card(s) GTX590 ,SLI ,POV TGT best 691Hz ,LiqedCoold,GTX480.....GTX1080MSI SeaHawkEK SLI
Storage OCZ-REVODRIVE 3-240GB,2xCrucialMX100.512.R-0,1x LMT-32L3m,3x 1TB-WD,1x;1x2TbSEAGATE1x2Tb Seagate
Display(s) DELL-U2412Mb,Samsung Synkmaster245B,HP ENVY 34c
Case Thermaltake, NZXT SWITCH 810SE
Audio Device(s) CREATIVE BLASTER X-Fi Titanium HD , AUNE T1MK2 TUBE USB
Power Supply ENERMAX Platimax 1500W,Thermaltake 1500W
Mouse VIPER V560,FUNC MS-3, Prestigio, R.A.T.E.7 and 5,LogitechG502,RAZER,Inperator.,dead...a.s.o.
Keyboard Trust ....LogotechG410
Software Windows7 64....
Benchmark Scores 3DMark Fire Strike 21.385 (37.234,11.828,7.176)
#7
Estimated 20 PFLOP/s peak!!! :eek: :twitch: And 3.52 TFLOP/s normal (single precision), D.P. 1.17 TFLOP/s.
Nice peak.
I wish for 20 PFLOP/s on the next GPU. :D
 
Joined
Dec 16, 2010
Messages
1,484 (0.58/day)
Likes
544
System Name My Surround PC
Processor Intel Core i7 4770K @ 4.2 GHz (1.15 V)
Motherboard ASRock Z87 Extreme6
Cooling Swiftech MCP35X / XSPC Rasa CPU / Swiftech MCW82 / Koolance HX-1320 w/ 8 Scythe Fans
Memory 16GB (2 x 8 GB) Mushkin Blackline DDR3-2400 CL11-13-13-31
Video Card(s) MSI Nvidia GeForce GTX 980 Ti Armor 2X
Storage Samsung SSD 850 Pro 256GB, 2 x 4TB HGST NAS HDD in RAID 1
Display(s) 3 x Acer K272HUL 27" in Surround 7860x1440
Case NZXT Source 530
Audio Device(s) Integrated ALC1150 + Logitech Z-5500 5.1
Power Supply Seasonic X-1250 1.25kW
Mouse Gigabyte Aivia Krypton
Keyboard Logitech G15
Software Windows 8.1 Pro x64
#8
5GB of memory? That's not evenly divisible by the 384-bit memory bus it was rumored to have. Has it been reduced to 320-bit, which could produce an even 5GB?
 
Joined
Sep 7, 2011
Messages
2,785 (1.21/day)
Likes
1,672
Location
New Zealand
#9

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
34,335 (9.22/day)
Likes
17,427
Location
Hyderabad, India
#10
5GB of memory? That's not evenly divisible by the 384-bit memory bus it was rumored to have. Has it been reduced to 320-bit, which could produce an even 5GB?
Mix matching. Just like 2 GB is made possible on 192-bit.
 
Joined
Sep 15, 2011
Messages
4,358 (1.91/day)
Likes
1,074
Processor Intel Core i7 3770k @ 4.3GHz
Motherboard Asus P8Z77-V LK
Memory 16GB(2x8) DDR3@2133MHz 1.5v Patriot
Video Card(s) MSI GeForce GTX 1080 GAMING X 8G
Storage 59.63GB Samsung SSD 830 + 465.76 GB Samsung SSD 840 EVO + 2TB Hitachi + 300GB Velociraptor HDD
Display(s) Acer Predator X34 3440x1440@100Hz G-Sync
Case NZXT PHANTOM410-BK
Audio Device(s) Creative X-Fi Titanium PCIe
Power Supply Corsair 850W
Mouse Anker
Software Win 10 Pro - 64bit
Benchmark Scores 30FPS in NFS:Rivals
#11
LOL. 7 billion transistors! I remember that my old 3dfx Voodoo 3 had 7 million transistors and was the fastest when released. :))))
 
Joined
Dec 16, 2010
Messages
1,484 (0.58/day)
Likes
544
#12
Mix matching. Just like 2 GB is made possible on 192-bit.
True, that is possible. But would it really be done on a high-end compute card where consistent and predictable performance is important? It would be a headache for developers to have to track which addresses they write and determine which data should go in the more or less interleaved parts of the memory space.
 
Joined
Mar 6, 2008
Messages
2,700 (0.76/day)
Likes
1,364
Location
Minnesota
System Name I Dub Thee Infinity
Processor Intel Core I7-3930K
Motherboard EVGA X79 Classified
Cooling Corsair H80
Memory 16GB GSkill Trident X
Video Card(s) EVGA GTX 980 Ti SC+
Storage SanDisk Ultra Plus 256GB, OCZ V2 180GB, 2x Toshiba X300 5TB RAID 0
Display(s) Acer XB270HU
Case Cooler Master HAF X
Audio Device(s) Creative X-Fi Titanium + Sennheiser HD 598 + Klipsch ProMedia 2.1
Power Supply EVGA 850W G2
Mouse Razer Naga 2014
Keyboard Gigabyte Osmium Cherry MX Brown
Software Windows 10 Pro x64
#13
It's probably twenty 256MB chips on a 320-bit bus.
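For what it's worth, the arithmetic behind that guess is easy to check. This is only a minimal sketch, assuming the bus is built from 32-bit channels and treating 5 GB as 5120 MB; the function names are made up for illustration, and the real board layout is not confirmed anywhere in this thread:

```python
# Rough capacity-per-channel check for the bus widths discussed above.
# Assumptions (not confirmed by any spec sheet in this thread):
#   - the memory bus is built from 32-bit channels
#   - 5 GB means 5 * 1024 MB

CHANNEL_BITS = 32
TOTAL_MB = 5 * 1024


def channels(bus_bits: int) -> int:
    """Number of 32-bit channels on a bus of the given width."""
    return bus_bits // CHANNEL_BITS


def mb_per_channel(total_mb: int, bus_bits: int) -> float:
    """Capacity each channel must hold if memory is spread evenly."""
    return total_mb / channels(bus_bits)


def chips_per_channel(total_mb: int, bus_bits: int, chip_mb: int) -> float:
    """How many chips of a given size each channel would carry."""
    return mb_per_channel(total_mb, bus_bits) / chip_mb


for bus in (384, 320):
    per_ch = mb_per_channel(TOTAL_MB, bus)
    chips = chips_per_channel(TOTAL_MB, bus, 256)
    print(f"{bus}-bit: {channels(bus)} channels, "
          f"{per_ch:.1f} MB per channel, {chips:.2f} x 256 MB chips per channel")
```

On 320-bit this comes out to two 256 MB chips on each of ten channels, i.e. twenty chips. On 384-bit an even split does not land on a whole number of chips, which is why that case would need mixed densities.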
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
34,335 (9.22/day)
Likes
17,427
Location
Hyderabad, India
#14
True, that is possible. But would it really be done on a high-end compute card where consistent and predictable performance is important? It would be a headache for developers to have to track which addresses they write and determine which data should go in the more or less interleaved parts of the memory space.
Low-level video memory management is handled by API > CUDA > driver. Apps are oblivious to that. Apps are only told that there's 5 GB of memory, and to deal with it.
 
Joined
May 6, 2005
Messages
2,786 (0.60/day)
Likes
435
Location
Tre, Suomi Finland
Processor i7 4770K Haswell, watercooled
Motherboard Asus Z87-C2 Maximus VI Formula
Cooling Fuzion V1, MCW60/R2, DDC1/DDCT-01s top, PA120.3, EK200, 3× D12SL-12, liquid metal TIM
Memory 2× 8GB Crucial Ballistix Tactical LP DDR3-1600
Video Card(s) between GPUs
Storage Samsung 840Pro 256@178GB + 4× WD Red 2TB in RAID10 + LaCie Blade Runner 4TB
Display(s) HP ZR30w 30" 2560×1600 (WQXGA) H2-IPS
Case Lian Li PC-A16B
Audio Device(s) Onboard
Power Supply Corsair AX860i
Mouse Logitech PM MX / Contour RollerMouse Red+
Keyboard Logitech diNovo Edge / Logitech Elite Keyboard from 2006
Software W10 x64
Benchmark Scores yes
#15
That die shot definitely has 384 bits' worth of memory bus...
 

T4C Fantasy

CPU & GPU DB Maintainer
Joined
May 7, 2012
Messages
1,233 (0.60/day)
Likes
608
Location
Rhode Island
System Name Phantom 820 v3.1
Processor Intel Core i7-6700k @ 4.4GHz
Motherboard ASRock Z170 Formula OC (Bios: 7.40)
Cooling Corsair H115i
Memory Corsair Dominator Platinum 16GB DDR4 3000MHz
Video Card(s) ZOTAC GTX 1070 AMP! / EVGA GTX 1080 Ti SC2
Storage 512GB Crucial MX300 / 256GB OCZ Vertex 4 / 1TB Hitachi HDD
Display(s) 25" ASUS VX248 / 24'' LG DM2350D / 24" LG 24UD58-B 4K
Case NZXT Phantom 820 Ultra+ Tower
Audio Device(s) Logitech G933 Headset
Power Supply SeaSonic Platinum 1050W Snow
Mouse Logitech G900
Keyboard Logitech G910 Orion Spark
Software Windows 10 Pro Build 1703 64-bit
Benchmark Scores Folding PPD: 45,000~ / WCG PPD: 50,000~ (OLD) with HD 7970
Joined
Apr 30, 2012
Messages
2,419 (1.18/day)
Likes
1,333
#17
Any comparison probably depends on actual performance efficiency rather than hypothetical. Unless you know what K20 brings to the table, a theoretical comparison is largely useless.
In case you didn't know, Mark Harris points out that he works for Nvidia.

So you might want to check who runs the sites you're linking to if you want to link to unbiased information.

It'd be like linking to sites/blogs run by AMD employees to make a point or further a viewpoint on an AMD product.

Just silly.
 
Joined
Sep 7, 2011
Messages
2,785 (1.21/day)
Likes
1,672
Location
New Zealand
#18

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
16,546 (3.87/day)
Likes
10,908
Location
Parkland County, Alberta
System Name Gamer
Processor Intel i7-6700K (ES)
Motherboard MSI Aegis TI
Cooling Custom Dragon Cooler
Memory 16 GB Kingston HyperX 2133 MHz C13
Video Card(s) 2x MSI GAMING GTX 980
Storage 2x Intel 600P
Display(s) Dell 3008WFP
Case MSI Aegis Ti
Mouse MSI Interceptor DS B1
Keyboard MSI DS4200 GAMING Keyboard
Software Windows 10 Home
#19
woah, how'd i miss this. Thanks for bumping, Smoke!

:roll:
 
Joined
Apr 30, 2012
Messages
2,419 (1.18/day)
Likes
1,333
#20
The report is a scientific paper published by the University of Aizu. It has nothing to do with Nvidia. Take your useless trolling elsewhere
Talk about idiot fanboyism.

That site is run by Mark Harris, an Nvidia employee. Are you so naive as to think he's going to post unbiased research links on his site/blog?
Nvidia would find a way to fire him in a second if he posted links to research papers that put Nvidia in a bad light.

It only took me one mouse click to find out he was an Nvidia employee. Come on now. Who's trolling now?

At least show both sides, or attempt to, so you won't seem like an Nvidia cheerleader.

The performance of DGEMM in Fermi using this algorithm is
shown in Figure 3, along with the DGEMM performance from CUBLAS 3.1.
Note that the theoretical peak of the Fermi, in this case a C2050, is 515 GFlop/s
in double precision (448 cores × 1.15 GHz × 1 instruction per cycle). The kernel
described achieves up to 58% of that peak.
That's from a study by Oak Ridge National Laboratory along with the University of Tennessee and the University of Manchester in the UK.

58% is lower than 90% in DGEMM. Maybe Kepler GK100/110 has a 34% jump, who knows, but the chip on the GTX 280 was only at 34% in DGEMM.

What do I know, though. I would think Oak Ridge National Laboratory does, since they use the darn things. ;)
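The numbers in that quote can be reproduced in a few lines. A minimal sketch, using only the figures quoted from the study (448 cores, 1.15 GHz, 1 instruction per cycle, 58% of peak); the function and variable names are hypothetical:

```python
# Reproduce the theoretical double-precision peak quoted for the C2050
# and the sustained DGEMM rate implied by "58% of that peak".
# All inputs are the figures quoted above; nothing here is measured.

def peak_gflops(cores: int, clock_ghz: float, instr_per_cycle: float = 1.0) -> float:
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x instructions per cycle."""
    return cores * clock_ghz * instr_per_cycle


def sustained_gflops(peak: float, efficiency: float) -> float:
    """Sustained rate given a peak and an efficiency between 0 and 1."""
    return peak * efficiency


c2050_peak = peak_gflops(cores=448, clock_ghz=1.15)
c2050_dgemm = sustained_gflops(c2050_peak, 0.58)

print(f"C2050 theoretical DP peak: {c2050_peak:.1f} GFLOP/s")
print(f"C2050 DGEMM at 58%:        {c2050_dgemm:.1f} GFLOP/s")
```

So the quoted 58% works out to roughly 300 GFLOP/s of actual DGEMM throughput against a 515 GFLOP/s paper figure.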
 
Joined
Sep 7, 2011
Messages
2,785 (1.21/day)
Likes
1,672
Location
New Zealand
#21
Talk about idiot fanboyism.
Sure - I'll use your quotes (and mine since you obviously can't RTFP) as examples
That's from a study by Oak Ridge National Laboratory along with...
Yup. Which just goes to prove that real-world and theoretical numbers differ. Which is exactly as I noted. Likewise, I made no assumption based upon a part whose performance is unknown... or do you have access to Kepler information that everyone outside of Nvidia and HPC projects doesn't?
Unless you know what K20 brings to the table, a theoretical comparison is largely useless.
So what is the DGEMM efficiency of Kepler ?
All I see here is a brief synopsis of Fermi
And of course, at no point did I make an AMD vs Nvidia comparison- quite the opposite in fact
Any comparison probably depends on actual performance efficiency rather than hypothetical
Get back under your bridge Xzibitroll - I'm sick of having to explain simple compound sentences to you.
 

T4C Fantasy

CPU & GPU DB Maintainer
Joined
May 7, 2012
Messages
1,233 (0.60/day)
Likes
608
Location
Rhode Island
#22
Talk about idiot fanboyism.

That site is run by Mark Harris, an Nvidia employee. Are you so naive as to think he's going to post unbiased research links on his site/blog?
Nvidia would find a way to fire him in a second if he posted links to research papers that put Nvidia in a bad light.

It only took me one mouse click to find out he was an Nvidia employee. Come on now. Who's trolling now?

At least show both sides, or attempt to, so you won't seem like an Nvidia cheerleader.



That's from a study by Oak Ridge National Laboratory along with the University of Tennessee and the University of Manchester in the UK.

58% is lower than 90% in DGEMM. Maybe Kepler GK100/110 has a 34% jump, who knows, but the chip on the GTX 280 was only at 34% in DGEMM.

What do I know, though. I would think Oak Ridge National Laboratory does, since they use the darn things. ;)
http://www.techpowerup.com/gpudb/923/NVIDIA_Tesla_C2050.html

The previous-gen NVIDIA architecture calculates floating-point throughput from the shader clock, so the C2050 would be about 1 TFLOP/s in single precision.
 
Joined
Apr 30, 2012
Messages
2,419 (1.18/day)
Likes
1,333
#23
http://www.techpowerup.com/gpudb/923/NVIDIA_Tesla_C2050.html

The previous-gen NVIDIA architecture calculates floating-point throughput from the shader clock, so the C2050 would be about 1 TFLOP/s in single precision.
Those tests are done in double precision. For single precision it would be SGEMM.
The C2050 is 515 GFlop/s in double precision, so it only reaches 58% of what's advertised.

Kepler would have to make up a lot of ground in efficiency.

The point I was trying to make was...

Pointing to the 90% efficiency of Tahiti in DGEMM as if it's a bad thing, especially from the site/blog of an Nvidia employee.
Compared to what? Nvidia's Fermi at 58% efficiency in DGEMM? That Nvidia employee doesn't have a link to that on his site. Wonder why?
Even if Tahiti ran at 58%, it would still be about twice as fast in DGEMM as Fermi.

Given that the K20 is similar in spec to the W9000 and W8000, it would have to bring its efficiency up in such a comparison.
Maybe the K20 has better efficiency, but when someone says "hey look, AMD can only do 90%" and fails to mention that Nvidia only does 58%, that's kind of cheerleading to me.

We need to see Kepler's DGEMM efficiency to see what percentage of its advertised specs it actually reaches.

:toast:

Update:
Nvidia's marketing slides put the DGEMM efficiency of K20 at 80% and Fermi at 60-65%. So if Oak Ridge National Laboratory puts Fermi 2% shy of 60%, I would say the window would be 78-80% efficiency for K20. So we are more than likely going to see a draw between K20 and W9000 in DGEMM if the marketing slides' 80% efficiency is met.
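Putting rough numbers on that "draw": a minimal sketch that multiplies the peak double-precision figures mentioned in this thread (1.17 TFLOP/s for K20, 1 TFLOP/s for W9000, 515 GFLOP/s for the C2050) by the efficiencies claimed here. None of these are measured results, the K20 efficiency is a marketing-slide figure, and the names in the code are made up for illustration:

```python
# Estimated sustained DGEMM throughput = peak DP rate x claimed efficiency.
# Peaks and efficiencies are the figures quoted in this thread, not benchmarks.

CARDS = {
    # name: (peak DP in GFLOP/s, claimed DGEMM efficiency)
    "Tesla K20": (1170.0, 0.80),
    "FirePro W9000": (1000.0, 0.90),
    "Tesla C2050": (515.0, 0.58),
}


def sustained(peak_gflops: float, efficiency: float) -> float:
    """Estimated sustained DGEMM rate in GFLOP/s."""
    return peak_gflops * efficiency


estimates = {name: sustained(peak, eff) for name, (peak, eff) in CARDS.items()}

for name, gflops in estimates.items():
    print(f"{name}: ~{gflops:.0f} GFLOP/s")

# Ratio between the two cards being compared
k20_vs_w9000 = estimates["Tesla K20"] / estimates["FirePro W9000"]
print(f"K20 / W9000: {k20_vs_w9000:.2f}x")

# The "Tahiti at 58% would still be twice as fast as Fermi" claim
tahiti_at_58 = sustained(1000.0, 0.58)
print(f"W9000 at 58% / C2050 at 58%: {tahiti_at_58 / estimates['Tesla C2050']:.2f}x")
```

On these assumptions the K20 and W9000 land within a few percent of each other, and a Tahiti card held to Fermi's 58% would still come out a little under twice as fast as the C2050.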
 
Joined
Sep 7, 2011
Messages
2,785 (1.21/day)
Likes
1,672
Location
New Zealand
#24
Update:
Nvidia's marketing slides put the DGEMM efficiency of K20 at 80% and Fermi at 60-65%.
As per usual the troll can't even parse a sentence without altering the content to suit its needs:
Kepler GK110 will provide over 1 TFlop of double precision throughput with greater than 80% DGEMM efficiency
Nvidia whitepaper May 2012. (pdf)
Still, coming from someone who openly admits to lying, and up until recently didn't even know the difference between a 3D rendering card and a math co-processor, it's hardly surprising.
I lied i just wanted to
Keep up with the straw man AMD vs Nvidia bullshit and the hypothetical numbers game. I'll stand by my preference for real world testing*
Any comparison probably depends on actual performance efficiency rather than hypothetical. Unless you know what K20 brings to the table, a theoretical comparison is largely useless.
*By your reasoning the AMD FirePro W9000 (3.99 TF SP, 1 TF DP) should be four times faster than a Quadro 6000 (1 TF SP, 515 GF DP)...after all, numbers don't lie right?
No...
No...
No
 
Joined
Apr 30, 2012
Messages
2,419 (1.18/day)
Likes
1,333
#25