NVIDIA Tesla K10 GPU Hits New Performance Milestones For Scientific Simulation

Discussion in 'News' started by btarunr, Jun 18, 2012.

  1. btarunr

    btarunr Editor & Senior Moderator Staff Member

    Joined:
    Oct 9, 2007
    Messages:
    28,436 (11.28/day)
    Thanks Received:
    13,620
    Location:
    Hyderabad, India
    ISC'12 - NVIDIA Tesla K10 GPUs offer performance breakthroughs on popular high performance computing (HPC) applications -- ranging from seismic processing to life sciences to video processing -- according to new benchmarks NVIDIA released today.

    Based on the new NVIDIA Kepler computing architecture, the Tesla K10 GPU delivers the industry's highest single precision performance (4.58 teraflops) and highest memory bandwidth (320 GB/sec) in a single accelerator. This is 12 times higher single precision flops and 6.4 times higher memory bandwidth than the latest-generation Intel Sandy Bridge CPUs.

    [​IMG]

    The Tesla K10 GPU outperforms CPUs and previous-generation GPUs across the board on the most popular, compute-intensive applications for four key market segments, including:
    • Defense: video analytics, video stabilization, orthorectification, computer vision
    • Life and material sciences: molecular dynamics
    • Oil and gas: seismic processing, reverse time migration
    • Media and entertainment: video editing, video rendering/transcoding, ray tracing
    "A distinct advantage of the Tesla K10 GPUs is that it excels in two key areas that have a dramatic impact on overall application performance: floating point operation and memory bandwidth," said Sumit Gupta, senior director of Tesla business at NVIDIA. "Together, these enable the K10 GPU to deliver substantial out-of-the-box performance increases for the top science, engineering and commercial applications with little or no effort on the part of the developer."

    New Performance Records on AMBER and LAMMPS
    On AMBER, a leading biomolecular simulation software application, four Tesla K10 GPUs achieved world record performance, delivering results far superior to what was available on multiple racks of servers just a few years ago.

    The Tesla system simulated 76 nanoseconds per day for a 23,558-atom molecule, outstripping the previous record set with four Tesla M2090s last year. This provides supercomputing performance to thousands of individual researchers to fuel further innovation in such areas as new drug discovery and more effective materials.

    "In biomolecular science, adding a few more nanoseconds of simulation time can make a world of difference in the ability of researchers to study and better understand the behavior of complex biological systems," said Ross Walker, assistant research professor, San Diego Supercomputing Center. "It still blows my mind that a single Tesla K10 outperforms some of the largest CPU clusters. The benefit it offers researchers is tremendous, enabling them to accelerate the search for new and better treatments for a host of diseases and disorders."

    The Tesla K10 GPU also delivers the highest performance on LAMMPS, another application widely used by the life sciences research community. Running the LAMMPS Lennard Jones Liquid Benchmark, a single Tesla K10 GPU outperforms a Tesla M2090 GPU by 80 percent, delivering the equivalent performance of a cluster with 64 x86 CPUs.

    Accelerating the Search for Energy
    NVIDIA Tesla GPUs continue to deliver the highest performance on reverse time migration (RTM) applications for seismic processing in the oil and gas exploration industry, and for image processing in the computer vision industry. Petrobras, the national oil and gas company of Brazil, achieved a 1.8x speedup on its RTM application on the Tesla K10 GPU, as compared to a Tesla M2090 GPU within the same power envelope.

    NVIDIA Tesla K10 GPUs are available from leading OEMs, including Appro Supercomputer Solutions, Dell, HP, IBM, SGI and Supermicro, as well as through NVIDIA distribution partners. More information about the Tesla K10 is available on the NVIDIA Tesla website.
    Recus and BraveSoul say thanks.
  2. hardcore_gamer

    hardcore_gamer

    Joined:
    Jan 25, 2011
    Messages:
    380 (0.29/day)
    Thanks Received:
    170
    Location:
    Fabry Perot cavity,AlGaAs-GaAs Heterojunction
    I wonder why they didn't mention double precision performance.:rolleyes:
  3. renz496

    Joined:
    Mar 24, 2012
    Messages:
    86 (0.10/day)
    Thanks Received:
    7
    what for? anyone interested in this baby should have no use for DP anyway since the card is sold for its SP performance. but honestly i'm surprised nvidia is using the GK104 chip in their latest Tesla line up. why didn't they do so before with GF104/114 chips?
    1c3d0g says thanks.
  4. hardcore_gamer

    hardcore_gamer

    Joined:
    Jan 25, 2011
    Messages:
    380 (0.29/day)
    Thanks Received:
    170
    Location:
    Fabry Perot cavity,AlGaAs-GaAs Heterojunction
    Because DP performance is very important in HPC.


    They made a Tesla card out of GK104 (in fact two GK104s) even though it sucks at computing because GK110 (originally meant to be the GTX 680) won't enter production any time soon.

    They didn't make computing cards using GF104/GF114 because GF100 and GF110 were kick ass cards with exceptional computing performance.
  5. renz496

    Joined:
    Mar 24, 2012
    Messages:
    86 (0.10/day)
    Thanks Received:
    7
    i know DP is important in HPC space. but as far as i know this product is aimed at applications that only utilize SP. personally i think nvidia don't want the HPC crowd to get upset with GK110 being late so they threw GK104 into the tesla line up. it might be only good at SP and poor at DP but at least nvidia have to show something, don't they? :p

    and because of this they can charge more for GK110 parts since it will be amazing in both SP and DP :D
    1c3d0g says thanks.
  6. MrMilli

    MrMilli

    Joined:
    Mar 1, 2008
    Messages:
    216 (0.09/day)
    Thanks Received:
    35
    Location:
    Antwerp, Belgium
    I don't know if you could call it amazing.
    GK110's DP performance should be higher than AMD's HD7970 (according to rumors) and the new HD7970 GHz Edition will be pretty close.

    http://parallelis.com/k20-updated-kepler-architecture/
    http://www.brightsideofnews.com/new...ansistors2c-300w-tdp2c-384-bit-interface.aspx

    If LuxMark is any reference, then nVidia is in bad shape with Kepler.

    [​IMG]
  7. renz496

    Joined:
    Mar 24, 2012
    Messages:
    86 (0.10/day)
    Thanks Received:
    7
    lol. when i say 'amazing' i only mean how amazing GK110 will be compared to GK104. bad or not only time will tell.
  8. Steevo

    Steevo

    Joined:
    Nov 4, 2005
    Messages:
    8,233 (2.55/day)
    Thanks Received:
    1,152
    Bring deh AMMBER LAMPS.
    1freedude says thanks.
    10 Million points folded for TPU
  9. HillBeast

    HillBeast New Member

    Joined:
    Jan 16, 2010
    Messages:
    407 (0.24/day)
    Thanks Received:
    27
    Location:
    New Zealand
    I call bulls**t on the memory bandwidth being '6.4 times faster than Sandy Bridge'. Perhaps LGA1155, but not on quad-channel LGA2011 Sandy Bridge. Sure the numbers may stack up and say NVIDIA is faster, but in the planet known as Earth, these figures would be unattainable.
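For reference, here is a rough sketch of the arithmetic behind the bandwidth claim. The 320 GB/s figure and the 6.4x ratio come from the press release. The DDR3-1600 configurations are an assumption about which Sandy Bridge platform was used for the comparison, and all numbers are theoretical peaks, not measured throughput.

```python
# Figures quoted in the press release.
K10_BANDWIDTH_GBS = 320.0   # GB/s, total for the K10 board
CLAIMED_RATIO = 6.4         # "6.4 times higher memory bandwidth"


def ddr3_peak_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for 64-bit (8-byte) DDR3 channels."""
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


# CPU memory bandwidth implied by the claimed ratio.
implied_cpu_gbs = K10_BANDWIDTH_GBS / CLAIMED_RATIO

# Assumption: DDR3-1600, dual channel (LGA1155) vs. quad channel (LGA2011).
dual_channel = ddr3_peak_gbs(2, 1600)
quad_channel = ddr3_peak_gbs(4, 1600)

print(f"implied CPU baseline:        {implied_cpu_gbs:.1f} GB/s")
print(f"dual-channel DDR3-1600 peak: {dual_channel:.1f} GB/s")
print(f"quad-channel DDR3-1600 peak: {quad_channel:.1f} GB/s")
```

Under these assumptions the implied baseline sits close to the quad-channel peak, which suggests the ratio compares theoretical peaks. Whether either side sustains its peak in a real application is a separate question.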
  10. blanarahul

    blanarahul

    Joined:
    Dec 17, 2011
    Messages:
    116 (0.12/day)
    Thanks Received:
    7
    Seeing how NVIDIA just doubled everything from GTX 560 Ti to GTX 680, I thought GTX 680 was the successor to the GTX 560 Ti. But it looks like I am wrong. Since GK110 is 2x GK104 with extra features and better FP64 performance, it can't be a successor to GF110.

    And GK110 won't make it to consumer markets in any case.
    7.1 billion transistors should mean a die size of roughly 600 mm². This is extremely uneconomical for consumer markets.
    ................................................

    But Nvidia is in for a kick-ass competition.
    Intel Xeon Phi should be a Larrabee core, with over 1 teraflop of FP64 performance. It should kick K10's butt.
    Last edited: Jun 18, 2012
  11. Recus

    Recus

    Joined:
    Jul 10, 2011
    Messages:
    515 (0.45/day)
    Thanks Received:
    169
    [​IMG] TPU members. Begs Nvidia for computing, disappointed when gets it. :D
  12. theeldest

    theeldest

    Joined:
    Feb 7, 2006
    Messages:
    652 (0.21/day)
    Thanks Received:
    140
    Location:
    Austin, TX
    A little misleading as it's a dual GK104 solution.

    Edit: Also misleading to name it the K10 when it's based on a couple GK104 cores. This is NOT a GK110. IIRC, the GK110 Tesla device will be a K20.
  13. theoneandonlymrk

    theoneandonlymrk

    Joined:
    Mar 10, 2010
    Messages:
    3,377 (2.06/day)
    Thanks Received:
    563
    Location:
    Manchester uk
    that's generally what nvidia do, naming shenanigans abound with them, and i'm surprised AMD haven't countered with a dual 7870 or 7970 FirePro card or some such, as it would demolish this, as a single W600-9000 comes close
  14. eddman New Member

    Joined:
    Mar 9, 2012
    Messages:
    35 (0.04/day)
    Thanks Received:
    10
    If the rumors are true, then how is 1.5 TFLOPS of DP performance not amazing?!

    HD7970 GHz Edition's DP number will be at 1.12 TFLOPS, which is still not that close, if they clock the FireStream version the same of course, and not lower.

    Those numbers are for GK104, and as we all know its DP performance is only 1/24 of SP. That ratio for GK110 is 1/3, so those figures mean nothing for K20.
    Last edited: Jun 19, 2012
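A minimal sketch of what the ratios quoted in the post above would imply. The 1/24 and 1/3 DP:SP ratios and the 1.5 TFLOPS figure are taken from the thread (the GK110 values were rumors at the time). The 4.58 TFLOPS figure is the K10's SP number from the press release. The function name is illustrative only.

```python
from fractions import Fraction


def dp_tflops(sp_tflops, dp_to_sp_ratio):
    """Peak double-precision throughput implied by an SP figure and a DP:SP ratio."""
    return sp_tflops * float(dp_to_sp_ratio)


GK104_RATIO = Fraction(1, 24)  # DP:SP ratio quoted for GK104
GK110_RATIO = Fraction(1, 3)   # DP:SP ratio quoted for GK110 (rumored)

k10_sp = 4.58  # TFLOPS single precision, from the press release (two GK104s)
k10_dp = dp_tflops(k10_sp, GK104_RATIO)

# SP throughput GK110 would need to reach the rumored 1.5 TFLOPS DP at a 1/3 ratio.
gk110_sp_needed = 1.5 / float(GK110_RATIO)

print(f"K10 DP estimate: {k10_dp:.3f} TFLOPS")
print(f"GK110 SP needed for 1.5 TFLOPS DP: {gk110_sp_needed:.1f} TFLOPS")
```

This only restates peak-rate arithmetic; it says nothing about how either chip would behave in real DP workloads.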
  15. largon New Member

    Joined:
    May 6, 2005
    Messages:
    2,778 (0.82/day)
    Thanks Received:
    432
    Location:
    Tre, Suomi Finland
    :laugh:
    Around 500mm² is more like it.
    Same ballpark as GF100.
    Sure - for average consumers. High-end cards are not for average consumers.
  16. Xzibit

    Joined:
    Apr 30, 2012
    Messages:
    1,121 (1.31/day)
    Thanks Received:
    252
    GeForce chips

    GT200 = 576mm2 / GTX 260 & 280

    GF100 = 529mm2 / GTX 465, 470 & 480

    GF110 = 520mm2 / GTX 560 Ti OEM, 560 TI 448, 570, 580 & 590

    They seem pretty comfortable releasing chips close to 600mm2.

    The problem would be how much more power it would use? AMD's chip is 352mm2 and it's not handicapped in computation power, and it is running neck and neck in power usage with the 294mm2 GK104, which is extremely hindered in that area, by 2/3rds.
    Last edited: Jun 20, 2012
  17. MrMilli

    MrMilli

    Joined:
    Mar 1, 2008
    Messages:
    216 (0.09/day)
    Thanks Received:
    35
    Location:
    Antwerp, Belgium
    From what I've read on technical forums, a more realistic number seems to be 1.3 TFLOPS. But we'll see when the product eventually gets released. Why it's not amazing for me is that nVidia needs a 7 billion transistor chip to accomplish this feat. AMD could, with GCN, make a chip that's more powerful with a smaller size. I think that has always been the power of AMD, to make GPUs that perform close to nVidia's while being 50% or more smaller.
  18. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,196 (6.54/day)
    Thanks Received:
    2,031
    Location:
    Concord, NH
    Source? If the GK110 is still based on Kepler, I don't believe that the ratio will actually change, just the number of shaders and clocks will within the same architecture. I call shenanigans. :banghead:
  19. theeldest

    theeldest

    Joined:
    Feb 7, 2006
    Messages:
    652 (0.21/day)
    Thanks Received:
    140
    Location:
    Austin, TX
    Why?

    The ratio in Fermi was different. GF110/100 had a different ratio than GF114/104.
  20. blanarahul

    blanarahul

    Joined:
    Dec 17, 2011
    Messages:
    116 (0.12/day)
    Thanks Received:
    7
    GK104:- 3.54 billion transistors at 294 mm^2
    GK110:- 7.1 billion transistors

    3.54 billion * 2 = 7.08 billion
    294 mm^2 * 2 = 588 mm^2

    I hope you know enough math to understand a simple calculation.

    Btw there would be no use in releasing GK110 for consumers, because gaming-wise the GTX 690 should roughly equal GK110. And since GK110 is a much, much larger die, it would consume a hell of a lot of power. Another factor is gonna be yields. Will TSMC cope with the pressure to produce enough 7.1 billion transistor chips? Seems unlikely till next year.

    And we all know that Maxwell is coming next year(if all goes well).
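The scaling estimate in the post above can be sketched as follows. The transistor counts and the GK104 die area are the figures quoted in the thread. The constant-density assumption is exactly that, an assumption: cache, logic and I/O pack at different densities, so real die sizes can differ noticeably from this kind of estimate.

```python
# Figures quoted in the thread.
GK104_TRANSISTORS = 3.54e9
GK104_AREA_MM2 = 294.0
GK110_TRANSISTORS = 7.1e9


def scaled_area_mm2(transistors, ref_transistors, ref_area_mm2):
    """Estimate die area assuming the same transistor density as a reference chip."""
    density = ref_transistors / ref_area_mm2  # transistors per mm^2
    return transistors / density


estimate = scaled_area_mm2(GK110_TRANSISTORS, GK104_TRANSISTORS, GK104_AREA_MM2)
print(f"GK110 area at GK104 density: {estimate:.0f} mm^2")
```

Treat the result as an upper-level sanity check rather than a prediction.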
  21. blanarahul

    blanarahul

    Joined:
    Dec 17, 2011
    Messages:
    116 (0.12/day)
    Thanks Received:
    7
    Yields are gonna be another issue. Plus the GTX 690 should be enough to outperform GK110 in games. (1536*2 CUDA Cores vs. 2880 CUDA Cores)
  22. theoneandonlymrk

    theoneandonlymrk

    Joined:
    Mar 10, 2010
    Messages:
    3,377 (2.06/day)
    Thanks Received:
    563
    Location:
    Manchester uk
    the 7970 has these specs listed now, non Ghz edition

    3.79 TFLOPS Single Precision compute power
    947 GFLOPS Double Precision compute power

    the GK110 has a big ask ahead of it, if the GK104 is any indication. you can double the shaders and add more DP shaders all you like and that's still a big ask. Nvidia are using two GK104s for 4.5 TFLOPS SP compute power; 2x 7970 would have <7.58 TFLOPS SP compute power. in performance per watt a single 7970 must annihilate a K10 compute card.

    and by the look of things we won't see the GK110 till a year after AMD released the 7970.
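A quick sketch of the peak-throughput comparison in the post above. The HD 7970 numbers are the ones listed in the post and the K10 number is from the press release. Performance per watt is left out on purpose, since the thread gives no board power figures to compute it from.

```python
# Peak figures quoted in the thread, in TFLOPS.
HD7970_SP = 3.79
HD7970_DP = 0.947
K10_SP = 4.58  # two GK104s on one board

dual_7970_sp = 2 * HD7970_SP          # hypothetical dual-7970 card, ideal scaling
hd7970_dp_ratio = HD7970_DP / HD7970_SP
sp_advantage = dual_7970_sp / K10_SP  # dual 7970 peak SP relative to K10

print(f"dual 7970 SP peak: {dual_7970_sp:.2f} TFLOPS")
print(f"7970 DP:SP ratio:  {hd7970_dp_ratio:.3f}")
print(f"dual 7970 vs K10:  {sp_advantage:.2f}x peak SP")
```

Ideal 2x scaling is an upper bound, which matches the "<7.58" in the post; a real dual-GPU board would likely run at lower clocks.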
  23. eddman New Member

    Joined:
    Mar 9, 2012
    Messages:
    35 (0.04/day)
    Thanks Received:
    10
  24. largon New Member

    Joined:
    May 6, 2005
    Messages:
    2,778 (0.82/day)
    Thanks Received:
    432
    Location:
    Tre, Suomi Finland
    And I hope you would know estimating the die size is not that simple.
    [​IMG]
  25. MrMilli

    MrMilli

    Joined:
    Mar 1, 2008
    Messages:
    216 (0.09/day)
    Thanks Received:
    35
    Location:
    Antwerp, Belgium
