GeForce Kepler 104 (GK104) Packs 256-bit GDDR5 Memory Bus, 225W TDP

I believe it's good for us that they try to improve everything about these cards, but someone who wants the best performance and spends $400-500 to get it won't care about a 20 watt difference unless he plays games day and night without interruption (idle differences are much smaller). It all depends on the point of view. I only hope competition helps lower prices.

fixed...

i think your estimated range is too far apart...

this is from a germany based perspective, as energy prices are pretty high here...
 
Couldn't a 256-bit bus with GDDR5 memory be a bandwidth limit ???

Not necessarily, no. There's a lot of room in memory clocks. In the previous gen Nvidia used GDDR5 clocks of 1000 MHz or less, AMD is using 1375 MHz GDDR5. That's a potential 40% improvement right there, and the relation of performance to memory bandwidth is not linear. A 40% increase in BW could potentially suffice for an up to 80% performance increase before becoming too much of a bottleneck.
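For anyone who wants to check the numbers, here is a rough sketch of the bandwidth arithmetic. It assumes the clocks are quoted the way they are in this thread (1000 or 1375 MHz) and that GDDR5 moves 4 bits per pin per cycle of that clock; the function name is made up for illustration.

```python
def gddr5_bandwidth_gbs(bus_width_bits: int, mem_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s for a GDDR5 bus.

    Assumption: mem_clock_mhz is the clock as quoted in the thread
    (e.g. 1000 or 1375 MHz), and GDDR5 transfers 4 bits per pin per
    cycle of that clock.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = mem_clock_mhz * 1e6 * 4
    return bytes_per_transfer * transfers_per_second / 1e9


if __name__ == "__main__":
    slow = gddr5_bandwidth_gbs(256, 1000)
    fast = gddr5_bandwidth_gbs(256, 1375)
    print(f"256-bit @ 1000 MHz: {slow:.1f} GB/s")
    print(f"256-bit @ 1375 MHz: {fast:.1f} GB/s")
    print(f"gain from clocks alone: {100 * (fast / slow - 1):.1f}%")
```

On the same 256-bit bus the clock bump alone works out to 37.5%, which is where the "potential 40%" comes from.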

I seriously wonder what that picture has to do with any of this. If that's from a Kepler tech demo, I'm disappoint :/

It's the Stonegiant DX11 benchmark, released years (?) ago.
 
I think there's not much headroom for GDDR5 speed, since AMD's Tahiti uses the same memory clock as the previous gen but increases the bus width from 256 to 384 bit.

And since you mention it, nVidia's previous gen used 320 and 384 bit bus widths, not 256 bit like this. That means you need to increase the memory clock to somewhere around 1600 - 1800 MHz to compensate for the BW.

1600 - 1800 MHz GDDR5, i mean... WooooooW that must be a super special GDDR5 :eek:
 
Yes with same GDDR5 AMD went from 256 bit to 384 bits to obtain a 50% increase in memory bandwidth. Nvidia can get almost the same increase by just using the same memory that AMD has been using for 2 generations now. Simple.

Nvidia used 384 bits on their high-end chip, GK104 is NOT high-end. High-end nowadays means GPGPU and GPGPU requires more bandwidth, that's why GF100/110 had a 384 bit bus, and same for Tahiti. High-end==GPGPU also means you need to leave headroom, it means you cannot make compromises, it means going overkill sometimes. Mid-range means you can make compromises, you can cut corners.

Besides, the GTX560 Ti used a 256 bit bus and 1000 MHz memory, like I said. To match HD7970 performance they need 50% more performance than the GTX560 Ti. They don't need 1600-1800 MHz GDDR5, that's absurd. They don't even need the 40% that 1375 MHz GDDR5 would bring, because GPU perf is not linearly related to memory bandwidth.
 
i know that GPU performance is not linearly related to memory bandwidth.

But in many cases, insufficient bandwidth can cause a severe reduction in graphics performance. ( ex. HD5670 GDDR3 vs HD5670 GDDR5 )


so, u gonna tell me that 6970-level bandwidth is enough for 7970 performance.

where's the proof ???
 
well then, only time will tell..
 
There's no direct proof of that, obviously, but there are hundreds of pieces of evidence from other cards that demonstrate that memory bandwidth is not a heavy limiting factor.

First of all you have to understand that HD7970 did NOT require all the bandwidth that it has. It does need more than HD6970, especially for compute, but it does not strictly need as much as it has. AMD did not have any other option than going 384 bits, because GDDR5 speeds higher than 1400 MHz are not very doable and are very very expensive anyway. So their only option was a wider bus.

Now:

Evidence #1
192 bit GTX460 has 86 GB/s BW
256 bit 460 has 115 GB/s, that's 33% more BW but performance difference is not much bigger than 5%.

Another example, GTX 480 vs GTX 570, evidence #2

GTX 480 has 177 GB/s
GTX 570 has 152 GB/s - it is slightly faster, despite the 480 having 16% more memory bandwidth.

So is HD7970 kind of performance possible with HD6970 kind of bandwidth? Absolutely.

PS: The HD5670 example you posted, GDDR5 vs GDDR3, you are talking about half the bandwidth, which is not going to be the case with GK104 at all (if it really is 256 bit anyway). We would be talking about a 33% reduction in bus width (384 to 256 bit), but an increase of 40% in clocks, for a net bandwidth loss of about 10% compared to the GTX580, a card that is itself probably NOT limited by its memory bandwidth anyway.
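The arithmetic in the PS can be sketched as below. This is only a back-of-the-envelope check, assuming a GTX580-like starting point (384 bit, about 1000 MHz) and a rumored 256 bit card at 1375 MHz; the helper names are made up for illustration.

```python
def relative_bandwidth(old_bus_bits: int, new_bus_bits: int,
                       old_clock_mhz: float, new_clock_mhz: float) -> float:
    """Bandwidth of the new config as a fraction of the old one.

    Bandwidth is proportional to bus width times memory clock, so the
    ratio is simply the product of the two individual ratios.
    """
    return (new_bus_bits / old_bus_bits) * (new_clock_mhz / old_clock_mhz)


def break_even_clock_mhz(old_bus_bits: int, new_bus_bits: int,
                         old_clock_mhz: float) -> float:
    """Memory clock the narrower bus needs to match the old bandwidth."""
    return old_clock_mhz * old_bus_bits / new_bus_bits


if __name__ == "__main__":
    # Assumed inputs: 384 bit @ ~1000 MHz versus 256 bit @ 1375 MHz.
    ratio = relative_bandwidth(384, 256, 1000, 1375)
    print(f"new / old bandwidth: {ratio:.3f}")
    print(f"net change: {100 * (ratio - 1):.1f}%")
    print(f"break-even clock on 256 bit: {break_even_clock_mhz(384, 256, 1000):.0f} MHz")
```

Under these assumptions the net loss comes out a bit over 8%, in the same ballpark as the roughly 10% quoted, and full compensation on a 256 bit bus would take about 1500 MHz.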
 
Yes I know, that's why I was wondering why would they demo their new tech with that.

I think it's just Bta posting a random image because there's no picture of Kepler yet.
 
so many nvidia haters! if ati haters went on and on about how something can't be true we would get infractions for "flamebaiting" etc (i know, I've had it happen). just makes little sense to me, if you don't believe it, oh well, so what, who cares??? it's a damn graphics card not a political debate for christ's sake
 
It's like Microsoft haters, nobody cares about them.
 
You have NO proof but i have my proof.

3dm11 score of my GTX580@850 and stock BW

http://3dmark.com/3dm11/2588707

GTX580@850 and HD6970 BW ( 1835 mem clocks )

http://3dmark.com/3dm11/2588751

nuff said ??? ;)


ps. i know that in order to bring a GTX580 to HD7970 level in 3dm11 i would have to push my 580 to almost 1000 core clock, but 850 core is enough to prove the point. :)
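As a sanity check on the "HD6970 BW ( 1835 mem clocks )" setting, here is a small sketch. It assumes the 1835 figure is the clock as shown by overclocking tools, which is doubled once more to get the effective GDDR5 data rate, and it assumes a stock GTX580 reads 2004 MHz in the same tools; neither number comes from the posts above, and the function name is hypothetical.

```python
def bandwidth_from_tool_clock_gbs(bus_width_bits: int,
                                  tool_clock_mhz: float) -> float:
    """Peak bandwidth in GB/s from a tool-reported GDDR5 clock.

    Assumption: the tool-reported clock is half the effective data rate,
    so the effective rate is tool_clock_mhz * 2 mega-transfers per second.
    """
    effective_mtps = tool_clock_mhz * 2
    return bus_width_bits / 8 * effective_mtps * 1e6 / 1e9


if __name__ == "__main__":
    stock = bandwidth_from_tool_clock_gbs(384, 2004)        # assumed stock GTX580
    underclocked = bandwidth_from_tool_clock_gbs(384, 1835)  # setting from the post
    print(f"stock:        {stock:.1f} GB/s")
    print(f"underclocked: {underclocked:.1f} GB/s")
    print(f"deficit:      {100 * (1 - underclocked / stock):.1f}%")
```

With those assumptions the underclocked card lands at about 176 GB/s, matching a 256 bit bus at 1375 MHz, and the bandwidth deficit versus stock is roughly 8%.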
 
so many nvidia haters! if ati haters went on and on about how something can't be true we would get infractions for "flamebaiting" etc (i know, I've had it happen). just makes little sense to me, if you don't believe it, oh well, so what, who cares??? it's a damn graphics card not a political debate for christ's sake

lol, are you serious?

this is one of the most civil kept discussions about that topic i have seen in a long time...

people are actually discussing and speculating without any name calling or anything...

and yes, it's a damn graphics card, which is being discussed on a tech enthusiast website...what are we supposed to do? talk about donuts?

you sir, are the one who is trying to cause some stir...so either contribute, or get lost...
 
lol. That's no proof of anything, because you don't have Kepler. So an overclocked GTX580 (10% OC) with a 10% underclock on the memory scores 3% lower in 3DMark 11 than without the underclock. Wow!! That so totally proves your point, man... No.

Besides the fact that 3% is thin air, we are not talking about making a card like yours be as fast as HD7970 and what memory bandwidth it needs for that. Things don't work like that. AMD/Nvidia spend months designing and balancing out their architectures and chips to get the most out of them and tweaking internal latencies and such. You taking your card and absolutely destroying that balance with a 10% core overclock and 10% memory underclock means nothing. But please, by all means try again.

EDIT: At least you proved that AMD and Nvidia do their job and don't just randomly choose the specs of cards, but then again, looking at how the only difference is 3%, maybe you proved the opposite. I just can't decide what you proved yet. In general nothing, other than a GTX580 at 850 MHz...

And to finish. You artificially created a 20% deficit in memory bandwidth and the most you obtained was 3% less performance. Bravo, because like I said earlier Nvidia could create a card with only a 10% deficit, so 1.5% slower? Aww man, horrible bottleneck. AWWWWW!

/sarcasm
 
oh... c'mon, stop all the BS.


my GTX580 is not even close to HD7970, but it still has a bottleneck.

imagine Kepler or an HD7970 @ 6970 BW: it couldn't be any faster than mine, and that's not only 3% for sure.


at first, you told me that high end gpus have excessive BW, and that's for gpu computing purposes.

First of all you have to understand that HD7970 did NOT require all the bandwidth that it has. It does need more than HD6970, especially for compute, but it does not strictly need as much as it has. AMD did not have any other option than going 384 bits, because GDDR5 speeds higher than 1400 Mhz are not very doable and are very very expensive anyway. So their only option was a wider bus.

then you changed your argument and told me Kepler doesn't manage memory bandwidth in the same way as Fermi and SI.

Besides the fact that 3% is thin air, we are not talking about making a card like yours be as fast as HD7970 and what memory bandwidth it needs for that. Things don't work like that. AMD/Nvidia spend months designing and balancing out their architectures and chips to get the most out of them and tweaking internal latencies and such. You taking your card and absolutely destroying that balance with a 10% core overclock and 10% memory underclock means nothing. But please, by all means try again.

what kind of unreliable person are you ??? :confused:


Try proving something ( at least find me some reference that doesn't come from your mouth )

OR stop BSing around here !!!
 
bla bla bla

Bla, bla, bla, a 3% difference between your two scores, and I'm sure you even went as far as doing many runs and choosing the ones that showed the biggest difference. Don't worry, everyone does that when desperately trying to prove something. Too bad you didn't check what the real difference was. Lame.

And I don't have to prove anything, since I never actually claimed anything. I said that a bottleneck is not a given, that there's a high chance that a bottleneck won't occur, and provided REAL evidence of previous cards NOT being bottlenecked. The one who says there's going to be a bottleneck is you, and the only proof you could provide is a lameass comparison with a 3% difference that could come from the margin of error in the 3DMark scoring system or a cat farting down the street. You are not right. Get over it.

EDIT: bah, I decided to be nice and teach you one or two things. Here: http://realworldtech.com/page.cfm?ArticleID=RWT042611035931&p=2

In most of the cases we analyzed, 2X higher memory bandwidth yielded ~30% better 3DMark Vantage GPU performance. A good estimate is that performance scales with the cube root of memory bandwidth, as long the memory/computation balance is roughly intact.

The Radeon HD 3870 and 4670 were the pair we mentioned on the earlier page. The 3870 has 2.13X the memory bandwidth of the latter, which translates into the 36% better performance

In a similar vein, the Radeon 4870 and 4850 achieve 14% and 27% higher 3DMark scores over their bandwidth starved cousins

Note: both have 2x or 100% more bandwidth than their "starved cousins".

The last example pair is the 335M and 4200M, which show somewhat less benefit from bandwidth. The 335M has nearly triple the bandwidth of the 4200M, identical shader throughput, and about 40% higher performance.
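The cube-root rule of thumb quoted above is easy to play with. The sketch below just applies that estimate to the bandwidth ratios mentioned in the quotes plus a 10% bandwidth deficit; it is the article's approximation, not a measurement, and it only holds while the memory/computation balance stays roughly intact. The function name is made up.

```python
def predicted_speedup(bandwidth_ratio: float) -> float:
    """Estimated performance ratio for a given memory bandwidth ratio.

    Uses the quoted rule of thumb: performance scales roughly with the
    cube root of memory bandwidth.
    """
    return bandwidth_ratio ** (1.0 / 3.0)


if __name__ == "__main__":
    # Ratios taken from the quoted examples, plus a hypothetical 10% deficit.
    for label, ratio in [
        ("2x bandwidth", 2.0),
        ("HD 3870 vs 4670 (2.13x)", 2.13),
        ("335M vs 4200M (about 3x)", 3.0),
        ("10% less bandwidth", 0.9),
    ]:
        change = 100 * (predicted_speedup(ratio) - 1)
        print(f"{label}: {change:+.1f}% estimated performance")
```

By this estimate a doubling of bandwidth is worth somewhere around a quarter more performance, and a 10% deficit costs only a few percent.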
 
Off topic:
Looks like your proc is choking your 580 like crazy - my 570 at 800 MHz gets a higher P score and a graphics score within 2% of yours. o.O
 
Just some silly rumours with no evidence man.
 
My thoughts exactly
 
I hope that they're right. Bring on a price war.
 
i didn't see anything in the article that proves your argument.

maybe u should "try again" :laugh:


oh, and you said you didn't claim anything ???

what is this ??? :laugh:

So is HD7970 kind of performance posible with HD6970 kind of bandwidth? Absolutely.


If i had Kepler IN HAND and benched it right now, i'm sure u would make an excuse like "it's only an engineering sample" anyway. :laugh:
 