NVIDIA GeForce Kepler Packs Radically Different Number Crunching Machinery

btarunr

NVIDIA is set to kick off its graphics processor lineup competing with AMD's Southern Islands Radeon HD 7000 series with GeForce Kepler 104 (GK104). We are learning from reliable sources that NVIDIA will implement a radically different design (by NVIDIA's standards, anyway) for its CUDA core machinery, while retaining a basic component hierarchy similar to Fermi's. The new design is meant to ensure greater parallelism. The latest version of GK104's specifications looks like this:

SIMD Hierarchy
  • 4 Graphics Processing Clusters (GPC)
  • 4 Streaming Multiprocessors (SM) per GPC = 16 SM
  • 96 Stream Processors (SP) per SM = 1536 CUDA cores

TMU / Geometry Domain
  • 8 Texture Units (TMU) per SM = 128 TMUs
  • 32 Raster OPeration Units (ROPs)

Memory
  • 256-bit wide GDDR5 memory interface
  • 2048 MB (2 GB) memory amount standard

Clocks/Other
  • 950 MHz core/CUDA core (no hot-clocks)
  • 1250 MHz actual (5.00 GHz effective) memory, 160 GB/s memory bandwidth
  • 2.9 TFLOP/s single-precision floating point compute power
  • 486 GFLOP/s double-precision floating point compute power
  • Estimated die-area 340mm²
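
The derived figures in the list (total cores, TMUs, bandwidth, FLOP/s) follow from the base numbers, and a rough sketch of that arithmetic is below. It assumes 2 FLOPs per CUDA core per clock (one fused multiply-add) and 4 data transfers per GDDR5 memory clock. The 1/6 double-precision rate is not stated in the post; it is only the ratio that happens to reproduce the quoted 486 GFLOP/s, so treat it as a guess.

```python
# Rumored GK104 base figures, copied from the spec list above.
GPCS = 4
SMS_PER_GPC = 4
CORES_PER_SM = 96
TMUS_PER_SM = 8
CORE_CLOCK_MHZ = 950
MEM_CLOCK_MHZ = 1250
BUS_WIDTH_BITS = 256

# Assumptions, not stated in the post:
FLOPS_PER_CORE_PER_CLOCK = 2    # one fused multiply-add counted as 2 FLOPs
GDDR5_TRANSFERS_PER_CLOCK = 4   # GDDR5 moves 4 bits per pin per memory clock
DP_RATE = 1 / 6                 # hypothetical ratio that matches the quoted DP figure


def derive_specs():
    """Recompute the derived numbers in the spec list from the base figures."""
    sms = GPCS * SMS_PER_GPC
    cores = sms * CORES_PER_SM
    tmus = sms * TMUS_PER_SM
    effective_mem_mhz = MEM_CLOCK_MHZ * GDDR5_TRANSFERS_PER_CLOCK
    # transfers per second * bus width in bytes, expressed in GB/s
    bandwidth_gb_s = effective_mem_mhz * 1e6 * BUS_WIDTH_BITS / 8 / 1e9
    sp_gflops = cores * FLOPS_PER_CORE_PER_CLOCK * CORE_CLOCK_MHZ / 1000
    dp_gflops = sp_gflops * DP_RATE
    return {
        "sms": sms,
        "cores": cores,
        "tmus": tmus,
        "effective_mem_mhz": effective_mem_mhz,
        "bandwidth_gb_s": bandwidth_gb_s,
        "sp_gflops": sp_gflops,
        "dp_gflops": dp_gflops,
    }


if __name__ == "__main__":
    for name, value in derive_specs().items():
        print(f"{name}: {value}")
```

Under those assumptions the results line up with the list: 16 SMs, 1536 cores, 128 TMUs, 160 GB/s, and roughly 2.9 TFLOP/s single precision.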

 
wow... that is definitely different...
 
I bet your mommy always told you to eat your greens ;)
 
It's looking like an AMD specification now, hehe (wait, 32 ROPs? D: )
 
I just hope they are serious about that 2048 MB of memory; if not, it will be a shame.
 
These specs are definitely strange for an Nvidia chip. 1536 CUDA cores is triple that of the GTX 580, yet with only a 30% reduction in the size of the fabrication process, and GK104 is still smaller than GF110. This indicates one of a few things: a "nerf" of the CUDA core itself, or an architecture that is much more "cluster based". Very interesting; I'll be following this closely.
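
To put rough numbers on that, here is a small sketch comparing core density. It assumes figures that are not in the post: 512 cores on a roughly 520 mm² GF110 at 40 nm, and a 28 nm process for GK104 (which is where a 30% linear shrink would come from). It also assumes ideal area scaling with the square of the feature size, which real chips never quite reach.

```python
# Assumed public figures for the GTX 580's GF110; not given in the post.
GF110 = {"cores": 512, "die_mm2": 520, "node_nm": 40}
# Rumored GK104 figures; the 28 nm node is an assumption.
GK104 = {"cores": 1536, "die_mm2": 340, "node_nm": 28}


def ideal_area_scale(old_nm, new_nm):
    """Area of the same circuit after an ideal shrink (scales with length squared)."""
    return (new_nm / old_nm) ** 2


def cores_per_mm2(chip):
    """Cores per square millimetre of die, ignoring everything else on the chip."""
    return chip["cores"] / chip["die_mm2"]


linear_shrink = 1 - GK104["node_nm"] / GF110["node_nm"]            # about 0.30
area_scale = ideal_area_scale(GF110["node_nm"], GK104["node_nm"])  # about 0.49
density_ratio = cores_per_mm2(GK104) / cores_per_mm2(GF110)        # about 4.6
# Density gain left over after accounting for the shrink alone.
unexplained_factor = density_ratio * area_scale                    # about 2.2

if __name__ == "__main__":
    print(f"linear shrink: {linear_shrink:.0%}")
    print(f"ideal area scale: {area_scale:.2f}")
    print(f"core density ratio: {density_ratio:.2f}x")
    print(f"not explained by the shrink: {unexplained_factor:.2f}x")
```

If those inputs are right, the shrink alone accounts for about a 2x density gain, so the remaining 2x or so would have to come from smaller or simpler cores, or from other changes such as dropping the hot clock.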
 
It's a lot more shaders but they're running much slower too. Seems it'd even out on the heat front.
 
Just like what the HD 2000 and the present 7000 cards are doing: more shaders but lower clocks (or rather, shader clocks tied to the TMU/ROP clocks).
 
My massive loop is waiting for the heat :rockout:
 
Die size is very close to that of the 7970 (365 mm²). Interesting :cool:
 
Seems like Nvidia's pretty prepped up to wipe AMD off the slate! But what would be the TDP on these things? Preferably lower than the earlier 480 and 580 heaters. :)
 
Interested in how this is going to be 50% faster than a 7970; they seem similar in shader layout.
 
The end of NV's monolithic GPU era is at hand...was about to say...Bout freaken time! ATI was slower at first when they switched but I knew eventually NV would have to change too.

Very interested to see how well NV does at ATI's own game.
 
this isn't supposed to be NV's flagship anywayz.
 
GK104 so GTX560Ti replacement (ish).

Considering this is 1536 shaders it would be logical to assume that the full fat model would have 2048 shaders, after all the GTX560TI was - in simplistic terms - roughly 75% of a GTX580.

The shader count itself is very interesting.

The increase in shaders (384 to 1536, if we assume a GTX560TI replacement) would suggest that each Kepler shader is less complex than its Fermi counterpart.

If we also assume similar performance to the HD7950 (doesn't seem too unrealistic), then clock for clock GCN and Kepler could be quite evenly matched (the HD7950 has more shaders but a lower core clock).

Should be very interesting.
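
A quick sketch of the ratios behind this reasoning. The HD 7950 figures (1792 shaders at 800 MHz) are assumed reference specs, not something stated in the thread, and the 2048-shader "full fat" chip is pure speculation. Shaders times clock is only a crude proxy that ignores architecture.

```python
# Figures from the post and the rumored spec list.
GK104_SHADERS = 1536
GK104_CLOCK_MHZ = 950
GTX_560_TI_SHADERS = 384
SPECULATED_FULL_CHIP_SHADERS = 2048  # the poster's guess, not a confirmed part

# Assumption: reference HD 7950 figures, not given in the thread.
HD_7950_SHADERS = 1792
HD_7950_CLOCK_MHZ = 800


def shader_clock_product(shaders, clock_mhz):
    """Crude throughput proxy: shader count times clock, ignoring architecture."""
    return shaders * clock_mhz


fraction_of_full_chip = GK104_SHADERS / SPECULATED_FULL_CHIP_SHADERS
growth_over_560_ti = GK104_SHADERS / GTX_560_TI_SHADERS
gk104_vs_7950 = (
    shader_clock_product(GK104_SHADERS, GK104_CLOCK_MHZ)
    / shader_clock_product(HD_7950_SHADERS, HD_7950_CLOCK_MHZ)
)

if __name__ == "__main__":
    print(f"GK104 as a fraction of a 2048-shader chip: {fraction_of_full_chip:.0%}")
    print(f"Shader count vs GTX 560 Ti: {growth_over_560_ti:.0f}x")
    print(f"Shaders x clock vs HD 7950: {gk104_vs_7950:.3f}")
```

On these inputs the shader-times-clock products land within about 2% of each other, which fits the "evenly matched" guess only if each Kepler shader does about as much work per clock as a GCN one.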
 
This is not going to be 50% faster than the 7970. Judging by the specs, it should fall between the 7950 and 7970 at a rumored $300.
GK110 will probably be the Tahiti killer. At a price...
 
At this rate, I will feel compelled to replace my 580. GK110 will likely be 70-80% faster...
 
I assume these specs have been judged legit, since btarunr did post them, unlike most others.

Ah crap, they are too different; it's impossible to guesstimate the performance based on them (I don't know how other people are so sure). I'll try to make my analysis anyway.

At first glance it looks like they doubled GF104's shader domain (128 TMUs, 4 GPCs, etc.) and then doubled the number of shaders per SM, because abandoning hot clocks allows for that. Performance-wise the end result should be similar.

Based on die size, this chip must contain twice as many transistors as GF104 while retaining the 256-bit bus, so there's no compelling reason to assume the shaders are any less capable than they were in Fermi. They could have just as easily gone with 768 SPs and hot clocks within the same die size.

And finally, efficiency. That's the key to knowing the performance. We don't know how well they will be able to use all those SPs. I'd assume they are using 6x16 SP wide superscalar shader multiprocessors, but with how many schedulers? GF104 had 2. So now they have 4? Or, since the shaders run at half the speed, are the schedulers just issuing the same number of ops per cycle? (In reality, cycles per op.)

So many questions, but I had fun. Based on raw specs this chip has the potential to crush any other card on the market (think 2x GTX560 Ti), at least at 1080/1200p. But efficiency/scaling is the key factor, and that's completely unknown to us.

EDIT: As you can see, I changed my mind completely as I was writing this post. I first thought they were very different and came to realize that they are pretty much the same. If you think about the Fermi-based GF104/114 as a 768 SP chip with no hot clocks, they just doubled the number of GPCs.
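
The hot-clock equivalence argument can be sketched like this. The GF114 layout (2 GPCs, 8 SMs, 48 SPs per SM) and the GTX 560 Ti's 822 MHz core clock with a 2x shader hot clock are assumed reference figures, not taken from the thread, and the comparison counts raw SP-clocks only, with nothing about scheduler efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    gpcs: int
    sms: int
    cores_per_sm: int
    core_clock_mhz: float
    hot_clock_multiplier: int  # shader clock = core clock * multiplier

    @property
    def cores(self) -> int:
        return self.sms * self.cores_per_sm

    @property
    def equivalent_cores(self) -> int:
        """Core count if the same work ran at the core clock (no hot clock)."""
        return self.cores * self.hot_clock_multiplier

    @property
    def raw_throughput(self) -> float:
        """Cores times shader clock, a crude proxy that ignores efficiency."""
        return self.cores * self.core_clock_mhz * self.hot_clock_multiplier


# Assumed reference figures for the GTX 560 Ti (GF114).
GF114 = Chip("GF114 (GTX 560 Ti)", gpcs=2, sms=8, cores_per_sm=48,
             core_clock_mhz=822, hot_clock_multiplier=2)
# Rumored GK104 figures from the spec list; no hot clock.
GK104 = Chip("GK104 (rumored)", gpcs=4, sms=16, cores_per_sm=96,
             core_clock_mhz=950, hot_clock_multiplier=1)

raw_ratio = GK104.raw_throughput / GF114.raw_throughput

if __name__ == "__main__":
    print(f"{GF114.name}: {GF114.cores} SPs, {GF114.equivalent_cores} without hot clock")
    print(f"{GK104.name}: {GK104.cores} SPs")
    print(f"raw throughput ratio: {raw_ratio:.2f}x")
```

With these inputs, GF114 counts as 768 no-hot-clock SPs, GK104's 1536 is exactly double that, and the raw ratio comes out a bit above 2x because of the higher rumored core clock.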
 
NVIDIA seems to have come up with something very similar to AMD's GCN. But after all, it's NVIDIA and the successor to Fermi, so we'll have to wait and see performance numbers.
 
I wouldn't take them without a big grain of salt, but it's always fun to do some what-iffing.

The specs look similar to what AMD has now, so given the estimated die size and unit counts, I'd say it would reach 580/7950-level performance. I doubt they'll price it at $300 if the 7950 is at $470. More likely it's at best $50 cheaper; that's enough to get the ball rolling. It's not really difficult to undercut the 7900 series in price, so regardless of performance it shouldn't be hard for Nvidia to claim a perf/$ crown, simply because the 7900 is sold at a premium currently. Of course AMD should respond to that, and I think this is the scenario we all hope for.
 
A lot of people are holding out for Nvidia just to see prices level out. If they sell a card on par with the 7950 for $100 less, they'll make up the difference in volume. I guarantee they would sell twice as many cards as they would if they priced it around $450.
 
Confirmed Nvidia is doing an ATI!
The specs look so similar that if I rename them as, say...

HD7870:
256bit GDDR5 2GB memory
1536 CU, 128TMU, 32ROP, small 340mm^2 die size, no hot clocks.

It looks totally believable! Has Nvidia been hiring lots of ATI engineers? Or have they reverse engineered ATI's Cayman?

Jokes aside, some rational observations:
The specs themselves look like those of a mid-to-high-end card, which will be very competitive price-wise as it uses 256-bit memory and a small die. I won't be surprised if it is only 10-20% faster than Cayman. It will be on par with the GTX580 at best.
I believe Nvidia is working on a high end card which has yet to show itself.
 
Charlie seems to be very into Kepler these days. He says the ball is rolling:

"Reports coming in from the far east say that those high up in the priority list started getting Kepler cards in various guises early this week, possibly late last. The number of sightings from sources that SemiAccurate trusts has been going up almost exponentially over the past few days, and will probably keep doing so for a bit."

He concludes:

"If things go as normal, it takes 4-6 weeks from AIB sampling to cards on the shelves. This would mean late March or early April, just like we have been saying for weeks."
 
Seriously, 1536 shaders? That's 3x more than Fermi.
 