
NVIDIA GeForce Kepler Packs Radically Different Number Crunching Machinery

Discussion in 'News' started by btarunr, Feb 10, 2012.

  1. jamsbong New Member

    Joined:
    Mar 17, 2010
    Messages:
    83 (0.05/day)
    Thanks Received:
    7
    @Benetanegia I could continue this pointless argument with an NV fanboy, such as by pointing out all the mistakes you've made in your last post alone, but it is time to move on.

    If NV has created something fantastic (i.e. a card 50% faster than the GTX 580) and it is stable enough to work on non-TWIMTBP titles, I won't mind buying one for myself. If not, then Tahiti. A simple wait-and-see situation. Cheers.
  2. xenocide

    xenocide

    Joined:
    Mar 24, 2011
    Messages:
    2,124 (1.76/day)
    Thanks Received:
    458
    I actually explicitly said I wasn't counting cards like the GTX 285 and 8800 Ultra, because they technically came out after the initial lineup launched. They were usually just super-high-end offerings made to address performance deficits, or simply because they could. In the case of the GTX 580 3GB, it was because super-high-end users needed more VRAM; this only really affected people using 3-display setups, so it was an incredibly niche product.

    If we wanted to go crazy, there are all sorts of released products that are technically better; the HD5970 is to this day ridiculously powerful, and surprisingly cost-efficient. I also omitted the HD4890 because it launched months after the rest of the 4xxx series.

    My listings are still accurate. There are outliers, but for the most part all of those cards were the original high-end GPU of their corresponding series.
  3. crazyeyesreaper

    crazyeyesreaper Chief Broken Rig

    Joined:
    Mar 25, 2009
    Messages:
    8,126 (4.20/day)
    Thanks Received:
    2,742
    Location:
    04578
    Doesn't change the fact the 680 will be priced at $600+, most likely in the $650-675 range, with aftermarket-cooled cards hitting $700.

    But you're free to believe what you wish. :roll:
  4. Crap Daddy

    Crap Daddy

    Joined:
    Oct 29, 2010
    Messages:
    2,739 (2.03/day)
    Thanks Received:
    1,044
    What you call the "680" at $600+ will probably get another name. All we see now is the GK104, which will probably be faster by a hair than the 7970 (but enough to claim it's the fastest card), with some disadvantages (lower memory bandwidth, and probably already very highly clocked at stock to meet the target of beating the 7970), and some say this will be the 680. Now, this card will not cost $600, but neither $300 as was reported, so I would expect somewhere between $450-500. As reported, the same chip with some disabled units and probably lower clocks will make the 670 part, with performance between the 580/7950 and the 7970, for $350-400. The big boy will be out later, and there we can expect $600 plus.
  5. radrok

    radrok

    Joined:
    Oct 26, 2011
    Messages:
    2,980 (3.01/day)
    Thanks Received:
    798
    Location:
    Italy
    The EVGA Hydro Copper comes to mind :laugh:
  6. m1dg3t

    m1dg3t

    Joined:
    May 22, 2010
    Messages:
    2,246 (1.49/day)
    Thanks Received:
    513
    Location:
    Canada
    Hell, the 580 is still more expensive than the 7970! At most places near me, anyway :eek: Can't wait to see the new pricing.
  7. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.52/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Giving up in time is good practice when you are so wrong, so well played. lol
  8. user21

    user21 New Member

    Joined:
    Jun 10, 2011
    Messages:
    282 (0.25/day)
    Thanks Received:
    20
    Location:
    Peshawar
    Time to kick back :p
  9. ViperXTR

    ViperXTR

    Joined:
    Jan 31, 2011
    Messages:
    1,394 (1.11/day)
    Thanks Received:
    407
    FXAA is already included in the recent NVIDIA drivers:

    1. Download NVIDIA Inspector.
    2. Open the advanced driver settings.
    3. Look through the advanced configs (scroll down).
    4. Set FXAA to 1 (default 0/off).

    There are also some hidden settings there, like a frame cap/framerate limit, SLI and/or AA flags, etc.

    also, some moar rumour tablez

    http://forum.beyond3d.com/showthread.php?p=1619912
  10. xenocide

    xenocide

    Joined:
    Mar 24, 2011
    Messages:
    2,124 (1.76/day)
    Thanks Received:
    458
    Interesting chart. I wonder why the AA never gets put above 4x...
  11. crazyeyesreaper

    crazyeyesreaper Chief Broken Rig

    Joined:
    Mar 25, 2009
    Messages:
    8,126 (4.20/day)
    Thanks Received:
    2,742
    Location:
    04578
    So according to that chart... 3DMark 11 is a 7% difference :roll:


    The total average is a 12% difference across all those tests.
    Last edited: Feb 13, 2012
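    The "total average" in a comparison table like this is just the arithmetic mean of the per-test percentage leads. A minimal sketch; the per-benchmark deltas below are made-up placeholders, not the leaked figures:

```python
# Average percentage lead across a set of benchmark results.
# These per-test deltas are illustrative placeholders only.
deltas = [7, 15, 10, 16]  # % lead of card A over card B in each test

average = sum(deltas) / len(deltas)
print(f"Average lead: {average:.0f}%")  # prints "Average lead: 12%"
```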
  12. CrAsHnBuRnXp

    CrAsHnBuRnXp

    Joined:
    Oct 19, 2007
    Messages:
    5,451 (2.22/day)
    Thanks Received:
    635
    I just want benchmarks already so I know what to buy.
  13. Recus

    Recus

    Joined:
    Jul 10, 2011
    Messages:
    497 (0.45/day)
    Thanks Received:
    165

    Borderlands 2 or a new Brothers in Arms running on Kepler? :D
  14. Crap Daddy

    Crap Daddy

    Joined:
    Oct 29, 2010
    Messages:
    2,739 (2.03/day)
    Thanks Received:
    1,044
    Aliens: Colonial Marines? PhysX? GTX680?

    As for that suspicious table: based on the specs, which I think we can agree are more or less accurate, this table was done by somebody who has done his homework. 30%-plus on average above the GTX 580, which brings us to that 10% over the 7970. If you look carefully you'll see the clocks, 1050 and 1425, very high for a stock card and above the reported 950 for the GPU. It is also done at 1080p, where the memory bandwidth disadvantage is less pronounced.

    So what I'm saying is that if this is close to real, then NV will launch the GK104 under the name GTX 680, a slightly faster card than the 7970 with certain weak points, since the chip was initially designed for the performance segment but after AMD's launch can fulfill other expectations. Price? Neither $300 nor $550.
    Last edited: Feb 13, 2012
  15. sergionography

    Joined:
    Feb 13, 2012
    Messages:
    264 (0.30/day)
    Thanks Received:
    33
    I doubt these rumors are true. I heard about NVIDIA dropping their hot clocks, but changing the structure of the GPU this much doesn't seem possible in such a short amount of time; as far as I knew, Kepler is a Fermi die shrink with some tweaks.
    On another note, this article claims GK104 is a 340 mm² die, which is NVIDIA's mid-range; the HD 7970 has a die size of 375 mm². So much for the "we expected more from AMD" talk.
    Not to mention NVIDIA's high end is said to have a 550 mm² die. AMD could easily build a GPU that big and pack in more transistors, but that is usually a very bad business choice, and NVIDIA suffers for it almost every time.
  16. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.52/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    AMD/Nvidia do not start working on their chips only after releasing the previous one. They work for years on every chip. Sometimes as much as 5 years depending on how different it is. Nvidia is already working on Maxwell and whatever comes next. AMD is already working on their next 2 architectures too, Sea Islands and Canary Islands. The work on Kepler started many years ago, maybe even before GTX200 was released or shortly after.

    As far as Kepler goes, yes, it's a tweaked Fermi in 99% of cases; you can see it in the specs and schematics. The main difference is that they dropped the hot clocks, which makes the SPs substantially smaller, and doubled the number of them per SM to compensate.

    No one knows exactly how much smaller the SPs are, but just as an example of how much clocks can affect the size of some units: AMD Barts' memory controller is half as big as Cypress/Cayman's because it's designed to work at ~1000 MHz instead of >1200 MHz. Those extra 200 MHz make the memory controller in Cypress/Cayman twice as big. So in the case of Kepler, looking at the specs and the 340 mm² figure, we can assume the non-hot-clocked SPs are around half the size.
    xenocide says thanks.
  17. sergionography

    Joined:
    Feb 13, 2012
    Messages:
    264 (0.30/day)
    Thanks Received:
    33
    Yes, but Fermi was supposed to be NVIDIA's architecture for years to come; Kepler is a descendant, kind of like Piledriver will be for Bulldozer.
    But I guess that makes sense in order to scale at high clocks, kind of like CPUs having longer pipelines to scale at high frequency. Still, there is no way it makes that much of a difference (especially since the whole point of architectures that aim for high frequency is to make smaller chips with less hardware and lower IPC but more throughput; though that's CPUs, I'm not sure about GPUs). Maybe the 1536 refers to the bigger GTX 680/780, which would have a 550 mm² die (read that in previous leaks/rumors), because even considering the die size, which is much smaller than the 580's, it triples the core count. Even with 28 nm that's only 40% smaller, and it's near impossible to get perfect scaling.
  18. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.52/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Don't let the number of SPs blind you; they didn't really triple the number of cores. Like I said, dropping the hot clocks probably allows them to fit twice as many SPs as Fermi cores in the same space, but each is only half as fast. They are trading 2x shader clock for 2x the number of SPs.

    Based on die area, GK104 has to have around 3.6-4.0 billion transistors; that's twice as much as GF104/114, the chip it's based on. Would you have doubted so much if NVIDIA had made a 768 SP Fermi(ish) part with a 256-bit memory interface? Twice the SPs at twice the number of transistors, while keeping the 256-bit MC; that's 100% expected, don't you think? Now take this hypothetical 768 SP "GF124", and it's here that they drop the hot clocks, making the SPs much smaller and allowing them to fit twice as many: GK104 is born.

    Also remember that doubling SPs per SM is a lot more area/transistor efficient than doubling the number of SMs.

    And to finish, never compare by die size; look at transistor count. Scaling varies a lot from one node to another, and transistor density can change a lot as a node matures (e.g. look at Cypress vs. Cayman). GK104 has twice as many transistors as GF104, and that's all you should look at. It's pointless to even compare to GF100/110, because GF100 is a compute-oriented chip with far more GPGPU features than GF104/114 and GK104. GF104 is only 60% as big as GF100, yet it has 75% of its gaming performance.
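    The transistor-count point can be made concrete with a quick density comparison. The GF104 figures (about 1.95 billion transistors on a roughly 332 mm² die at 40 nm) are public; the GK104 numbers are this thread's rumored estimates, not confirmed specs:

```python
# Transistor density in millions of transistors per mm^2. Die size
# alone misleads across nodes because density roughly doubles going
# from 40 nm to 28 nm.
def density_mtr_per_mm2(transistors_bn, die_mm2):
    """Millions of transistors per square millimeter."""
    return transistors_bn * 1000 / die_mm2

gf104_density = density_mtr_per_mm2(1.95, 332)  # 40 nm part
gk104_density = density_mtr_per_mm2(3.8, 340)   # rumored 28 nm part

print(f"GF104:         {gf104_density:.1f} MTr/mm^2")  # ~5.9
print(f"GK104 (rumor): {gk104_density:.1f} MTr/mm^2")  # ~11.2
```

    At these figures the rumored GK104 packs roughly twice GF104's transistors into a similar area, which is about what a full node shrink allows.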
  19. sergionography

    Joined:
    Feb 13, 2012
    Messages:
    264 (0.30/day)
    Thanks Received:
    33
    Yes, I believe you, man; it was just pretty shocking, that's all. Now we might be able to compare AMD vs. NVIDIA a bit more closely based on shader count.
    As for Cypress and Cayman, it seems like that happened from the other extreme, doesn't it? As far as I remember, it was pretty much a matter of getting rid of the SPs that weren't being utilized and changing VLIW5 to VLIW4, ending up with smaller SIMD blocks that performed the same as their predecessors, which allowed them to fit more of them into the 6970. So even though the shader count was lower, it performed like 20% better.

    Though I still think there is more behind this. Having hot clocks has its benefits but also its limitations; I heard they don't scale well as frequency increases, while AMD could raise clocks and gain performance at a constant rate (I could be wrong, though; I don't know much about the nitty-gritty details of GPUs).
  20. theoneandonlymrk

    theoneandonlymrk

    Joined:
    Mar 10, 2010
    Messages:
    3,332 (2.10/day)
    Thanks Received:
    544
    Location:
    Manchester uk
    Yeah, that was sarcasm from me, so I agree with you, dude :toast:

    But in all honesty, I'm betting these will arrive cheap and be below a 7950 in performance.
  21. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.52/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Yes, that's correct and the reason that Nvidia stopped using hot-clocks with Kepler.

    The reason they used hot clocks before was apparently to get lower latencies and better single-threaded/lightly-threaded performance, so that compute apps would benefit. Remember that the first chips with hot-clocked shaders ran at 600 MHz core clocks and below, so the shaders ran at <1200 MHz. Now, even without hot clocks, they will be running at 1000 MHz, so that's probably enough. Latencies are further reduced with a shorter pipeline (due to lower clocks) and other means that are required for GPGPU anyway.

    Fermi shaders running at 2000 MHz would have been overkill for what's really needed, and would consume more power than two 1000 MHz shaders. A compute GPU needs first and foremost multi-threaded performance; as long as single-threaded performance is not terrible, it is only required up to the level where minor serial tasks don't become a bottleneck.
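    The trade described above, 2x shader clock for 2x the SP count, is neutral in peak throughput, which a two-line check makes obvious. The shader counts and clocks here are illustrative round numbers, not actual specs:

```python
# Peak FLOPS = shaders * ops-per-clock * clock, so N shaders at
# clock f match 2N shaders at f/2 on paper. Dropping hot clocks
# buys smaller SPs and lower power, not more peak throughput.
def peak_gflops(shaders, clock_mhz, ops_per_clock=2):
    return shaders * ops_per_clock * clock_mhz / 1000

hot_clocked   = peak_gflops(768, 2000)   # hypothetical hot-clocked part
no_hot_clocks = peak_gflops(1536, 1000)  # doubled SPs, halved clock

assert hot_clocked == no_hot_clocks      # both 3072 GFLOPS
```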
    sergionography says thanks.
  22. jamsbong New Member

    Joined:
    Mar 17, 2010
    Messages:
    83 (0.05/day)
    Thanks Received:
    7
    I'm not aware that I'm in any way wrong, nor am I giving up on anything. All I did was be rational and put things on hold. You're mistaken again...
    I guess it is always going to be difficult for me to have a logical debate with someone who is not logical.
  23. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.52/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Maybe you should start by explaining why, if it's only going to be almost as fast as the GTX 580, they put 96 SPs per SM (double) instead of, say, 64, or more importantly why they doubled the number of TMUs, when 64 TMUs were perfectly fine for the GTX 580 and GK104 will have 25% higher clocks (and thus 25% higher texture fillrate had it kept 64 TMUs instead of 128). I'm sorry, but you just don't increase die size like that if it's not coming with a substantial (read: justified) performance increase.

    You have produced ZERO proof (I didn't expect any, since nothing is fact yet), but you have also explained nothing (which I do expect) about why such a massive increase in computational power, one that didn't come for free and supposed a 100% increase in transistor count, is not going to produce any performance gain.

    You have not explained why a 2.9 TFLOPS card will not be able to beat the 1.5 TFLOPS card, or, if that were the case, why they didn't just create a 1.5 TFLOPS (768 SP) card in the first place. That would have been easy: same architecture, half the SPs, 48 per SM. If going with 96 SPs is going to make the block half as efficient as Fermi with 48 SPs, you just don't make it 96 SPs!!

    So start by explaining something, anything, and stop calling people fanboys as if that were any kind of argument in your favor, because it is not; it only makes you look like a 12-year-old kid. "It's going to be so because (you think) it's going to be so, and if you think differently you are a fanboy" is not an argument.

    More Logic:

    GK104 is 340 mm², so close to 4 billion transistors, twice as many as GF104 and 33% more than GF110. Logic dictates that NVIDIA did not suddenly create an architecture that is at least 33% less efficient than Fermi (70%, compared to GF104), 25% higher clocks notwithstanding, especially when they have been claiming better efficiency for almost two years now.
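    The 2.9 vs. 1.5 TFLOPS comparison and the texture-fillrate point both fall straight out of the raw specs. A sketch using this thread's numbers (1536 SPs at roughly 950 MHz and 128 TMUs for GK104, which are rumors; the GTX 580's published 512 SPs at a 1544 MHz hot clock, 64 TMUs, 772 MHz core):

```python
# Peak shader throughput and texture fillrate from raw specs.
def sp_tflops(shaders, shader_clock_mhz, ops_per_clock=2):
    """Peak single-precision TFLOPS (one FMA = 2 ops per clock)."""
    return shaders * ops_per_clock * shader_clock_mhz / 1e6

def gtexels_per_s(tmus, core_clock_mhz):
    """Peak texture fillrate in GTexels/s."""
    return tmus * core_clock_mhz / 1e3

gk104_tf  = sp_tflops(1536, 950)    # ~2.92 TFLOPS (rumored clocks)
gtx580_tf = sp_tflops(512, 1544)    # ~1.58 TFLOPS

print(f"GK104 (rumor): {gk104_tf:.2f} TFLOPS, "
      f"{gtexels_per_s(128, 950):.1f} GTexel/s")
print(f"GTX 580:       {gtx580_tf:.2f} TFLOPS, "
      f"{gtexels_per_s(64, 772):.1f} GTexel/s")
```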
    Last edited: Feb 17, 2012
    Crap Daddy and sergionography say thanks.
  24. Xaser04

    Xaser04

    Joined:
    May 15, 2007
    Messages:
    734 (0.28/day)
    Thanks Received:
    100
    I am struggling to see the point of your argument here. You keep stating that Benetanegia is a fanboy and "wrong" all the time, yet so far I have seen nothing but rational, well-thought-out posts from him. I may not agree with everything in his posts (actually, I do agree with most of it), but I am struggling to see the "fanboy" stance you keep going on about.

    No doubt I will get called a Nvidia fanboy now despite running a HD7970 and Eyefinity.... :wtf:

    One thing that does interest me about Kepler being a die-shrunk and "tweaked" Fermi is how much performance increase we can expect from future driver improvements. Driver improvements are a given with GCN, as the architecture is relatively immature, but what about Kepler? Could we end up with a case where Kepler comes out of the gate faster than Tahiti but ends up slower in the long run due to a lack of driver improvements?

    Obviously this is still conjecture, but it is an interesting avenue to investigate, as I have seen some pretty big boosts in BF3 (at 3560*1920) with the latest HD79xx RC driver (25/01/2012).
    Crap Daddy and sergionography say thanks.
  25. jamsbong New Member

    Joined:
    Mar 17, 2010
    Messages:
    83 (0.05/day)
    Thanks Received:
    7
    @Benetanegia: "but also explained nothing (which I do expect)". I've discussed this with you before: since there are no facts, whatever you build on is built on nothing. There is no point getting into explanation mode over speculative information.

    "GK104 is 340 mm², so close to 4 billion transistors": I am not aware of this information. Where did you get 4 billion transistors? Did you estimate it off the 340 mm²? In other words, building a case off speculative information?

    @Xaser04: no need to struggle. Just read what I've posted thoroughly and comprehend it before venting off more steam.
