Wednesday, October 17th 2012

NVIDIA Kepler Refresh GPU Family Detailed

A 3DCenter.org report sheds light on what NVIDIA's GPU lineup for 2013 could look like. According to the report, NVIDIA's next-generation GPUs could follow a path similar to the previous-generation "Fermi Refresh" (GF11x), which turned the performance-per-Watt equation back in NVIDIA's favor, even though this time the company's current GeForce Kepler family already has an established energy-efficiency lead. The "Kepler Refresh" family of GPUs (GK11x), according to the report, could see significant increases in cost-performance, with a bit of clever re-shuffling of the GPU lineup.

NVIDIA's GK104 GPU exceeded performance expectations, which allowed it to drive this generation's flagship single-GPU graphics card for NVIDIA, the GTX 680. That gave the company time to perfect the most upscaled chip of this generation, and gave its foundry partner time to refine its 28 nm manufacturing process. By the time Kepler Refresh is due, TSMC will have refined its process enough for mass-production of GK110, a 7.1 billion transistor chip on which NVIDIA's low-volume Tesla K20 GPU compute accelerator is currently based.

The GK110 will take back the reins of powering NVIDIA's flagship single-GPU product, the GeForce GTX 780. This product could offer a massive 40-55% performance increase over GeForce GTX 680, with a price ranging anywhere between US $499 and $599. The same chip could even power the second fastest single-GPU SKU, the GTX 770. The GK110 physically packs 2880 CUDA cores, and a 384-bit wide GDDR5 memory interface.
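
A rough way to read those numbers as cost-performance: the sketch below compares performance per dollar against the GTX 680 under the report's rumored ranges. The $499 GTX 680 price is an assumption (its launch price, not stated in the report), and `relative_value` is a hypothetical helper, so treat the output as a ballpark only.

```python
def relative_value(perf_gain, new_price, old_price):
    """Performance per dollar of the new card relative to the old one (1.0 = equal)."""
    return (1.0 + perf_gain) / (new_price / old_price)


GTX_680_PRICE = 499.0  # assumption: launch price, not given in the report

# The two extremes of the rumored GTX 780 range: +40% to +55%, $499 to $599.
scenarios = {
    "best case (+55% at $499)": relative_value(0.55, 499.0, GTX_680_PRICE),
    "worst case (+40% at $599)": relative_value(0.40, 599.0, GTX_680_PRICE),
}

for name, value in scenarios.items():
    print(f"{name}: {value:.2f}x performance per dollar")
```

Even the least favorable corner of the rumored range comes out ahead of the GTX 680 on this measure, provided the assumed reference price holds.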

Moving on, the real successor to the GK104, the GK114, could form the foundation for high-performance SKUs such as the GTX 760 Ti and GTX 760. The chip has the exact same specifications as the GK104, leaving NVIDIA to tinker with clock speeds to increase performance. The GK114 will be relegated from the high-end segment it currently powers to performance-segment SKUs, and so even with minimal increases in clock speed, the chip will achieve sizable performance gains over the current GTX 660 Ti and GTX 660.

Lastly, the GK106 could see a refresh to GK116 too, retaining specifications and leaving room for clock speed increases, much in the same way as GK114. It also gets a demotion, to GTX 750 Ti and GTX 750, so with minimal R&D the GTX 750 series achieves a sizable performance gain over the previous generation.

Source: 3DCenter.org

127 Comments on NVIDIA Kepler Refresh GPU Family Detailed

#1
crazyeyesreaper
Chief Broken Rig
i love how everyone is saying AMD will have a hard time competing :roll: did everyone forget that yawn that is the 7970 GHz edition still beat out the GTX 680 and this gen for the most part each company is equal at the typical price points.

8970 is expected to be 40% faster than the 7970

GTX 780 is expected to be 40-55% faster than the 680

add in overclocking on both and we end up with the exact same situation as this generation. So in reality it just plain doesn't matter lol, performance is all i care about and who gets product onto store shelves and from there into my hands. Doesn't matter who's fastest if it takes 6 months for stock to catch up.
#2
hoodlum
Low Power?

If you go back to the original linked article the performance gains for the GK114 and GK116 will only be 5-15%. That seems quite low considering the improvements to memory bandwidth, shaders, ROPs, etc. That would suggest nvidia may be focusing on even lower TDP than pure performance increases. And prices will be increasing too.

I think people may be disappointed by the time these are released. I suspect AMD will show similar improvements next year as well with more focus on TDP.
#3
BigMack70
by: crazyeyesreaper
i love how everyone is saying AMD will have a hard time competing :roll: did everyone forget that yawn that is the 7970 GHz edition still beat out the GTX 680 and this gen for the most part each company is equal at the typical price points.
I think, from reading pretty much every review of these cards, that the general impression this round is (wrongly) more favorable to Nvidia than AMD, and this carries over into forums/etc.

AMD did this to themselves because they released their 79xx cards fairly horridly underclocked (especially the 7950), and at price points that were too high. They didn't make a move on either front soon enough, and so when Kepler finally hit, reviewers were left looking at a situation where the 7970 was outperformed by a cheaper card. Then the 670 came in, trashed the 7950, and competed with AMD's previously $550 card at $150 less.

Those things defined the impressions most people have of this round. AMD then made the mistake of releasing their GHz edition as a reference card for reviewers, and most reviewers then dismissed it as too loud/etc.

You have to do a decent amount of homework before you start realizing that both companies at this point in time are pretty much dead even, and most people don't like to think that hard.

If AMD had released their 7970 clocked around 1050/1500 MHz for $500 at launch, and their 7950 at maybe 950/1400 for $400, I can guarantee you that the impressions would be different. Pretty much every single 7970/7950 will hit those clocks without messing with voltages, so I have no idea why they got so conservative. But they didn't make those moves, and so here we are.
#4
crazyeyesreaper
Chief Broken Rig
They were conservative in order to get better yields. Essentially, yes, most chips can do 1050, but not all can at the proper voltage or TDP level. They also have to harvest chips for the 7950. Lower clocks meant more usable chips, and more usable chips means greater volume to put on store shelves.

Regardless, the refresh will probably see Nvidia take the lead, but not by a whole lot. They have more room to play when it comes to TDP than AMD does right now.
#5
BigMack70
I understand they did it for better yields, but I haven't seen a 7970 that wouldn't do 1050 on stock volts. I'm sure they're out there, but they've gotta be a tiny minority. I think AMD just flat out screwed up figuring out how they needed to clock their cards for viable yields.
#6
crazyeyesreaper
Chief Broken Rig
Probably, but it doesn't matter much. Most overclocked 7970s on the market were already at 1000-1100 MHz before the GHz edition dropped lol. But I digress. Looking at the info available, if AMD limits themselves to 32 ROPs again but increases shader count, they will be beaten by NVIDIA. Should AMD wise up and increase the ROP count to 48, they stand a good chance of being within reach, in that pre-overclocked models should fare well against a stock 780. Time will tell of course.
#7
james888
by: crazyeyesreaper
Probably, but it doesn't matter much. Most overclocked 7970s on the market were already at 1000-1100 MHz before the GHz edition dropped lol. But I digress. Looking at the info available, if AMD limits themselves to 32 ROPs again but increases shader count, they will be beaten by NVIDIA. Should AMD wise up and increase the ROP count to 48, they stand a good chance of being within reach, in that pre-overclocked models should fare well against a stock 780. Time will tell of course.
Can you explain what a ROP is and why it is/might be bottlenecking the 7970?
#8
crazyeyesreaper
Chief Broken Rig
http://en.wikipedia.org/wiki/Render_Output_unit

Look back at the 5850 and 5870

Clock both to the same speed and the 5850, with fewer shaders but the same ROP count, was within 1-2% of the 5870, so the increased shader count didn't do a whole hell of a lot.

With GCN, shaders scale a bit better, yes, but notice:

the 7870 (1280 GCN stream processors, 32 ROPs) can take on the 7950 (1792 shaders, 32 ROPs).

looking at previous GPUs

7770 = 640 shaders 16 ROPs, 10 Compute Units, 40 TMUs - 3Dmark 11 P3500
7870 = 1280 shaders 32 ROPs, 20 Compute Units, 80 TMUs - 3Dmark 11 P6600
7970 = 2048 shaders 32 ROPs, 32 Compute Units, 128 TMUs - 3Dmark 11 P8000
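
To put the list above in ratio form, here is a minimal sketch using only the shader counts, ROP counts and 3DMark 11 scores quoted in the post. The `cards` table and `scaling` helper are illustrative names, and the scores are the poster's round numbers rather than measured results.

```python
# Figures copied from the post above; scores are approximate 3DMark 11 P-scores.
cards = {
    "7770": {"shaders": 640, "rops": 16, "score": 3500},
    "7870": {"shaders": 1280, "rops": 32, "score": 6600},
    "7970": {"shaders": 2048, "rops": 32, "score": 8000},
}


def scaling(base, target):
    """Ratio of each quantity going from the base card to the target card."""
    b, t = cards[base], cards[target]
    return {key: t[key] / b[key] for key in ("shaders", "rops", "score")}


for base, target in (("7770", "7870"), ("7870", "7970")):
    s = scaling(base, target)
    print(
        f"{base} -> {target}: shaders x{s['shaders']:.2f}, "
        f"ROPs x{s['rops']:.2f}, score x{s['score']:.2f}"
    )
```

On these numbers, the step where shaders and ROPs both double scales the score far better than the step where only the shader count grows, which is the pattern the post is pointing at. Clock speed and memory bandwidth differences are ignored here.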

What the 7970 would probably have looked like if it had followed AMD's previous design philosophy:
1920 shaders, 48 ROPs, 30 Compute Units, 120 TMUs, plus a higher GPU clock

For the 8970, being at the same 28nm, it's looking like AMD will push for 2500-2600 shaders. Many are saying 2560, but no one knows for sure yet.

That's a 25% increase in shaders. However, we can see from the 7870 to the 7950 that a 20-30% increase in shaders didn't do much for performance.

AMD needs more ROPs and higher clocks for GCN to scale well with a large number of stream processors.

So with just increasing shaders AMD won't get far. They will need to up the number of compute units as well as TMUs, and with that the ROP count needs to be bumped up to maintain a balanced GPU design. Tweaks in architecture will help, but a simple bump in shaders would mean that a heavily clocked 7970 could possibly catch the 8970. If the 40% figure is relative to the 925 MHz stock cards, then since we see the 7970 at full overclocks pulling as far as 20% ahead right now on average, a stock 8970 would be just 20% faster. So a better balance and a more optimized design is necessary.

NVIDIA already has their design finished; AMD, on the other hand, we can only hope didn't screw the pooch.
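
The overclocking argument above can be checked with ratios rather than by subtracting percentages. This is a minimal sketch that assumes the post's two figures (a stock 8970 at +40% and a fully overclocked 7970 at +20%, both relative to a stock 925 MHz 7970); `remaining_lead` is a hypothetical helper.

```python
def remaining_lead(new_gain, old_overclock_gain):
    """Lead of the new stock card over the old overclocked card.

    Both arguments are fractional gains over the same baseline (the old stock card).
    """
    return (1.0 + new_gain) / (1.0 + old_overclock_gain) - 1.0


# Figures from the post: 8970 rumored at +40%, overclocked 7970 at about +20%.
lead = remaining_lead(0.40, 0.20)
print(f"Stock 8970 vs overclocked 7970: {lead * 100:.1f}% faster")
```

Dividing the ratios puts the remaining lead a little under the post's 20%, which if anything strengthens the point being made.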
#9
Xzibit
by: Recus
http://www.bit-tech.net/news/hardware/2012/04/19/qualcomm-28nm-capacity/1
http://wccftech.com/amd-28nm-processors-delayed-2014/ (Global Foundries problems)
That deals with capacity, something that Nvidia complains very little about. For the past 3 quarters Nvidia has been complaining about wafer yields, since they moved to buying per wafer instead of per working die.
Look up any Nvidia transcript this year and 28nm yield issues along with margins will be the dominant fall-back.

Nvidia is currently in talks with Samsung to use its 28nm fabs, but Samsung is more expensive, and Nvidia only uses Samsung for the initial fab of designs and looks to Global Foundries and TSMC for production.
Samsung will have an open slot given their recent litigation with Apple, and companies like Qualcomm, Nvidia and others will be looking to fill that slot, and Samsung will charge a premium, I'm sure.
#10
GoldenTiger
by: Selene
This just proves every thing I and many others said early on, the GTX 680 was to be the GTX 660Ti but AMD flopped and left the door open for Nvidia to cash in on a mid range part at high end prices. I could not wait and went with a pair of GTX670 FTW's to feed my needs and will do me for a few years I think.
It proves nothing. In fact, if anything, it shows nVidia didn't have a great, available GK100. Now that GK110 came out well, they may be releasing it as the high-end. You really need to not be so hung-up on codenames.

by: BigMack70
It's still too early to really know what's going on... we have rumors saying GK110 won't be a GTX 7xx card, rumors saying it is, rumors saying that the performance increase will be 15-25% on both sides, and now this, which is probably relying on old rumors from last January of the GK110 being ~45% faster than the 7970.

The 7xxx and 6xx round was pretty much a tie and I expect that to continue next round with no major shakeups from either green or red.
Considering the Tesla card specs (K20 card) were recently outed by accident by a CAD vendor that put GK110-based cards up for order... and 3DCenter tends to be pretty knowledgeable... I would put my bet on this rumor being fairly accurate, pending good clock speeds at release for the GeForce variant.

Also, a useful post from OCN and my reply:

-----

by: Nowyn

Say the rumor is true.
We have 2496 CUDA Cores out of 2880 ie 2 SMX clusters disabled. That gives us 2496 - 1536 = 960 extra cores which is 62.5% more.
There are 12 more ROPs so there's 50% increase from 24 in GK104
Plus 384-bit bus which is 50% wider.
Sure, the core clock will be lower, so that will lower the theoretical ROP and core performance increase. At 700 MHz it would result in a 35% ROP and 43.75% core performance increase. The memory controller is probably also tweaked, providing more than a 50% bandwidth increase over GK104.
So 40%-50% ballpark is quite realistic depending on the final clocks.
Exactly... and we may see further optimizations ala the GF104 vs. GF114. I doubt it'll come in at "just" 700mhz, but if it does, it's still not outside the realm of possibility that it could be 50% faster out of the box.
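
The back-of-the-envelope math in the quoted post can be written out as a small sketch, assuming theoretical throughput scales with unit count times clock. The unit counts are the ones quoted in the post and the 256-bit GK104 bus; `theoretical_gain` is a hypothetical helper, and the clock ratio is a made-up illustrative value, since the post does not say which GTX 680 clock it compared against.

```python
def theoretical_gain(units_new, units_old, clock_ratio=1.0):
    """Fractional gain, assuming throughput scales with unit count times clock."""
    return (units_new / units_old) * clock_ratio - 1.0


# Unit counts as quoted in the post (clock held equal).
core_gain = theoretical_gain(2496, 1536)  # CUDA cores
rop_gain = theoretical_gain(36, 24)       # ROPs, per the post (24 + 12)
bus_gain = theoretical_gain(384, 256)     # memory bus width in bits

# Hypothetical new-clock / old-clock ratio, for illustration only.
CLOCK_RATIO = 0.9

print(f"Cores: +{core_gain * 100:.1f}% at equal clocks, "
      f"+{theoretical_gain(2496, 1536, CLOCK_RATIO) * 100:.1f}% at ratio {CLOCK_RATIO}")
print(f"ROPs:  +{rop_gain * 100:.1f}% at equal clocks, "
      f"+{theoretical_gain(36, 24, CLOCK_RATIO) * 100:.1f}% at ratio {CLOCK_RATIO}")
print(f"Bus:   +{bus_gain * 100:.1f}% wider")
```

The result is very sensitive to the clock ratio, which is why both posts tie the 40-50% estimate to final clocks.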
#11
BigMack70
There's more than enough evidence to substantiate that the GK104 was drawn up to be the 660ti and not the 680...
#12
GoldenTiger
by: BigMack70
There's more than enough evidence to substantiate that the GK104 was drawn up to be the 660ti and not the 680...
Oh, perhaps it was originally, but GK100 was certainly not "held back" so they could "put out a midrange card as high-end for mad profits!!!!" as some people like to proclaim.
#13
cadaveca
My name is Dave
by: GoldenTiger
Oh, perhaps it was originally, but GK100 was certainly not "held back" so they could "put out a midrange card as high-end for mad profits!!!!" as some people like to proclaim.
This is always what I thought. If nVidia could truly release a card twice as fast as what AMD has, using the same foundry, then they would, since that would ensure far more sales and profit than selling something that "saves on costs" instead.

In fact, had nVidia done this, it would, to a degree, amount to price fixing, which of course is illegal.

Of course, now that both cards are here, and we can see the physical size of each chip, we can easily tell that this is certainly NOT the case, at all, so whatever, it's all just marketing drivel.

In fact, it wouldn't really be any different than AMD talking about Steamroller. :p "Man, we got this chip coming...";)
#14
BigMack70
by: GoldenTiger
Oh, perhaps it was originally, but GK100 was certainly not "held back" so they could "put out a midrange card as high-end for mad profits!!!!" as some people like to proclaim.
You are correct... the idea that it was intentionally held back is nonsense. However, the chip did disappear among a ton of rumors about yield problems, so it seems best to reason that they were forced into holding it back due to poor yields. Fortunately for them, they were able to hit the performance target they needed (set by AMD) with GK104.

Wound up being a big win for them on the business side of things (because it IS a midrange card from a manufacturing point of view, with a high end price) and a loss for consumers (who lost out on potentially much greater performance).
#15
crazyeyesreaper
Chief Broken Rig
More likely it was held back because Nvidia needed to release something rather than face ongoing delays like they did with Fermi (aka GTX 480). GK104 offered plenty of performance and allowed them to keep GK110 in the wings for a refresh. It essentially gave them a performance boost for the next series without needing much more input, and instead gave them time to further tweak the chip.

It's better to release a product when it's truly ready than to release early with massive issues. My guess is that with Kepler, Nvidia learned from their mistakes with Fermi, and to great effect.
#16
Xzibit
by: BigMack70
There's more than enough evidence to substantiate that the GK104 was drawn up to be the 660ti and not the 680...
If the GK104 was a true mid-size chip, Nvidia would be making out like thieves with a very profitable mid-range chip. That's not what Nvidia has been saying in their quarterly reports and conference calls to investors. They have been voicing concerns about production, yields and margins since their first report this year.
That theory doesn't really reflect Nvidia's own stance, and it makes less sense given that for 2 quarters straight AMD has gained market share in the discrete graphics sector.

Think that's more of a forum myth driven by fanboyism.

Think about it. As a company you're losing market share, with sales down 1 million units from quarter to quarter. You'd think it'd be the opposite if you're selling a mid-range chip at great profit for the high-end market.

If for some weird reason that were true, then it's a horrible design and execution.
#17
renz496
by: GoldenTiger
Oh, perhaps it was originally, but GK100 was certainly not "held back" so they could "put out a midrange card as high-end for mad profits!!!!" as some people like to proclaim.
The way I heard it, GK100 was not held back; it was scrapped and redesigned into GK110. IMO if AMD had been able to get much better performance out of the 7970 from launch day, maybe Nvidia would have been forced to use that scrapped GK100 as their flagship. But luckily for Nvidia, AMD chose to be conservative with 7970 clocks, and somehow Nvidia was able to make GK104 match 7970 performance. lol I think originally Nvidia wanted GK104 to be clocked around 700 MHz and intended to market the card with an 'overclocker's dream' slogan, just like they did with the 460 and 560. :roll:
#18
atikkur
by: st.bone
Why Nvidia why? I ordered a GTX670 and its still stuck at the customs, and now this news. For the same price next year a probable GK110 for GTX780 & GTX770. Then that will mean a GTX760TI will preform equal to or better than GTX680, that will make GTX760 perform equal to or better then GTX670, and all with better prices than what people are paying for currently.... why Nvidia? I love performance increase but i already feel bad, i should think of selling the GTX670 sometime early next year if this information turns out to be true.
Only buy Nvidia at their revision stage, that is, the refresh that follows a major architecture change. GK110 looks sweet.
#19
BigMack70
If you assume that GK104 was drawn up originally to be the 680, as it eventually was, you have to come up with an explanation for:

-All the rumors and leaked info until late Jan/Feb of this year which had the GTX 680 being based on the GK110. That wasn't one or two isolated rumors... there was tons of info floating around indicating that to be the case. Almost NOTHING indicated GK104 to be the high end chip, not until GK110 completely disappeared and rumors of yield problems started cropping up all over.
-The limited memory bus (256 bit) on the GK104, which is typically reserved for mid level cards and not high-end
-The PCB design itself, most notably as it appears on the 670 (which is close to being a half-length PCB in the reference designs).

If you assume that GK110 was originally supposed to be the 680 and GK104 was to be the 660ti, as I do, it makes sense of the above information quite well. As for Nvidia not "making out like [a thief]", the explanation for that is readily apparent in their yield problems, which affected GK104 as well (remember - the GTX 680 was basically a paper launch for 2+ months). Also, aren't desktop GPUs a relatively low-profit/revenue area anyways from a business perspective?

We'll never know with 100% certainty, but I think that it makes better sense of the available data that the original GTX 6xx lineup was to include both Gk110 (680/670?) and GK104 (660ti/660).
#20
cadaveca
My name is Dave
by: BigMack70
you have to come up with an explanation for:
You do not have to explain anything.

Period.


Die sizes say GK100 or whatever was never possible.

HD 7970: [image]

GTX 680: [image]

Note how the AMD chip has nearly 33% more transistors, but is barely physically larger than GTX 680.

If nVidia could have fit more functionality into the same space, they would have.


They could have planned to release something different all they wanted, but if they had, that chip would have to have been quite a bit larger than HD 7970 is.

Since Nvidia is selling a chip that is much the same size as the 7970, they aren't getting that many more chips per wafer.


If Nvidia is selling a mid-range chip as high-end, they either have HUGE HUGE HUGE design issues,


OR AMD is doing the exact same thing.


:roll:


Fact is, GTX 680 ain't no mid-range chip, unless you believe that most of that there chip is deactivated.
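
The chips-per-wafer point can be illustrated with a common first-order dies-per-wafer estimate (wafer area over die area, minus an edge-loss term). This is only a sketch: the die areas below are assumptions based on commonly quoted approximate figures (about 294 mm² for GK104 and about 352 mm² for the HD 7970's chip), not numbers from this thread, and the formula ignores yield, scribe lines and defects.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """First-order estimate of whole dies per wafer, ignoring yield and scribe lines."""
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)


# Assumed die areas in mm^2 (approximate public figures, not from the thread).
GK104_AREA = 294.0
TAHITI_AREA = 352.0

gk104_dies = dies_per_wafer(GK104_AREA)
tahiti_dies = dies_per_wafer(TAHITI_AREA)

print(f"GK104:  ~{gk104_dies} candidate dies per 300 mm wafer")
print(f"HD 7970 chip: ~{tahiti_dies} candidate dies per 300 mm wafer")
print(f"Ratio: {gk104_dies / tahiti_dies:.2f}x")
```

Under these assumptions the smaller chip yields somewhat more candidate dies per wafer, but nothing like the multiple a genuinely mid-range die would give.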
#21
BigMack70
by: cadaveca
You do not have to explain anything.

Die sizes say GK100 or whatever was never possible.
This doesn't make much sense... why do we now have rumors of that same GK110 being released? Die size constraints will still be there... if the die size were the inherent problem here, GK110 would have been scrapped and we wouldn't be reading this article right now.
#22
cadaveca
My name is Dave
by: BigMack70
GK110 would have been scrapped and we wouldn't be reading this article right now.
it WAS scrapped.
#23
BigMack70
by: cadaveca
it WAS scrapped.
Did you read the article?
When it's time for Kepler Refresh to go to office, TSMC will have refined its process enough for mass-production of GK110, a 7.1 billion transistor chip on which NVIDIA's low-volume Tesla K20 GPU compute accelerator is currently based.
The GK110 will take back the reins of powering NVIDIA's flagship single-GPU product, the GeForce GTX 780. This product could offer a massive 40-55% performance increase over GeForce GTX 680, with a price ranging anywhere between US $499 and $599. The same chip could even power the second fastest single-GPU SKU, the GTX 770. The GK110 physically packs 2880 CUDA cores, and a 384-bit wide GDDR5 memory interface.
Doesn't sound like "scrapped" to me... unless you want to argue that this is just bogus, which it could be.
#24
cadaveca
My name is Dave
by: BigMack70
Did you read the article?



Doesn't sound like "scrapped" to me... unless you want to argue that this is just bogus.
I'm not arguing that it is bogus.

Not at all.

But the fact of the matter is that what nVidia can do with TSMC's 28nm, AMD can as well.

And AMD's already 33% more efficient in used die space.

If you believe the 7.1 billion transistor thing, then it must be twice as big as current GTX 680 silicon (3078 million transistors, BTW), or current GTX 680 really is a horrible horrible design, and it's a feat of wonder that nvidia managed to get it stable.

And how does a doubling of transistors only equal a 55% increase in performance?

Oh, I read it just fine. :p


Argue that it's bogus... :roll:
#25
MxPhenom 216
Corsair Fanboy
by: crazyeyesreaper
i love how everyone is saying AMD will have a hard time competing :roll: did everyone forget that yawn that is the 7970 GHz edition still beat out the GTX 680 and this gen for the most part each company is equal at the typical price points.

8970 is expected to be 40% faster than the 7970

GTX 780 is expected to be 40-55% faster than the 680

add in overclocking on both and we end up with the exact same situation as this generation. So in reality it just plain doesn't matter lol, performance is all i care about and who gets product onto store shelves and from there into my hands. Doesn't matter who's fastest if it takes 6 months for stock to catch up.
It doesn't matter. If you look at the most recent performance numbers, they trade blows. That's how it has been for the last 2-3 generations. The only reason the 680 truly looks like the better card is that it consumes a lot less power than the 7970 for the same performance range.