Thursday, March 8th 2012

GK104 Dynamic Clock Adjustment Detailed

With its GeForce Kepler family, at least the higher-end parts, NVIDIA will introduce what it calls Dynamic Clock Adjustment, a feature that adjusts GPU clock speeds both below and above the baseline depending on load. The approach is similar to what CPU vendors already do (Intel Turbo Boost and AMD Turbo Core). Clocking down under low load is nothing new for discrete GPUs; dynamically clocking above the baseline, however, is.

There has been quite some confusion over whether NVIDIA will continue to use "hot clocks" with GK104. Conflicting reports have reinforced theories both for and against the notion, but we now know that both camps were looking at the question from a binary viewpoint. The new Dynamic Clock Adjustment is similar and complementary to "hot clocks", but differs in that Kepler GPUs come with a large number of power plans (dozens), and operate taking into account load, temperature, and power consumption.

The baseline core clock of GK104's implementation will be similar to that of the GeForce GTX 480: 705 MHz. The chip clocks down to 300 MHz when the load is lowest, while the geometry domain (the de facto "core") clocks up to 950 MHz under high load. The CUDA core clock domain (the de facto "CUDA cores") will not maintain synchrony with the "core"; it will independently clock itself all the way up to 1411 MHz when the load is at 100%.

Source: VR-Zone
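As a rough illustration of the scheme described above, the following sketch picks a core clock from a table of states based on load, temperature, and board power. The state table, thresholds, and limits are hypothetical stand-ins, not NVIDIA's actual algorithm; only the 300/705/950 MHz figures come from the report.

```python
# Sketch of a multi-state dynamic clock governor: choose a clock state
# from load, then back off above the baseline while temperature or power
# is out of headroom. All intermediate states and limits are invented.
CLOCK_STATES_MHZ = [300, 405, 510, 607, 705, 790, 870, 950]
BASE_INDEX = CLOCK_STATES_MHZ.index(705)  # baseline clock per the report

def select_clock(load_pct, temp_c, power_w,
                 temp_limit_c=95, power_limit_w=195):
    """Return the highest clock state the current conditions allow."""
    # Start from a load-proportional target: low load -> low state.
    target = round(load_pct / 100 * (len(CLOCK_STATES_MHZ) - 1))
    # Boost above the baseline only while temperature and power have headroom.
    while target > BASE_INDEX and (temp_c >= temp_limit_c
                                   or power_w >= power_limit_w):
        target -= 1
    return CLOCK_STATES_MHZ[target]
```

At idle this lands on the 300 MHz floor, at full load with headroom it reaches the 950 MHz ceiling, and under thermal or power pressure it settles back toward the 705 MHz baseline.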

56 Comments on GK104 Dynamic Clock Adjustment Detailed

#1
Legacy-ZA
This is going to do more harm than good.
#2
radrok
by: [H]@RD5TUFF
Good idea IMO, as long as it's implemented in an intelligent manner, and performs as well as if not better than AMD; Nvidia has a winner on their hands.
I agree, it's a good step in thermal management too :)
#4
jamsbong
Dynamic clocking on GPUs is not new. Normally, the cards have a low 300 MHz 2D clock and 700+ MHz for 3D. The "new" feature in Kepler is that it is able to go beyond the spec speed of 700+ MHz if the thermal envelope permits it.

So if the workload is very straightforward computation and does not heat up the GPU at 700 MHz, it will overclock itself to, say, 1000 MHz so that the task can be done more quickly while staying within the thermal envelope.

Why not have it on previous GPUs? I believe this type of thermal management requires additional hardware sensors built in to monitor the GPU precisely and ensure that it does not get cooked. It can be done in software, but the profile can't be as aggressive.

This is how Intel CPUs work. I'm guessing that is how Kepler will work too.
#5
LAN_deRf_HA
So if we force the fan faster or put on a better cooler it will stay on the high end?
#6
NHKS
To me, the OC to 950 MHz at high load and downclock to 300 MHz at low load seems to be nVidia's version of AMD's 'PowerTune' and 'ZeroCore' (although ZeroCore can bring consumption down to about 3 W by turning off the fan too).
Bringing the speed down to 300 MHz might compel us to doubt the efficiency and reliability at higher speeds, but I believe it really is a power-saving feature rather than a way to mitigate any of Kepler's problems.
Again, let's hope for a competitive Kepler series rather than a below-par one! It's good for us consumers.
#7
jamsbong
Not sure if Kepler will work exactly the same as Intel CPUs. In the case of an Intel CPU, you'll get less throttling if you have better cooling (assuming you've freed up your CPU via the BIOS correctly).
#8
Mussels
Moderprator
by: m1dg3t
Don't ATi cards already have a "throttle" function? They run @ XXX clocks for desktop/2D/media, then when gaming/rendering or at 100% load they ramp up to full clocks?

Could you not merge this with the other thread about the same topic? Would be nice to have all the info in the same place :o
Yeah, but ATI have four power states total (one being 'off' for secondary cards in CrossFire, ULPS).


This adds dozens, so now there's more than just a 2D or 3D state. In a low-demand game with Vsync on or an FPS cap, the card simply won't use the same amount of power or produce the same heat and noise.
#9
NHKS
by: jamsbong
Not sure if Kepler will work exactly the same as Intel CPUs. In the case of an Intel CPU, you'll get less throttling if you have better cooling (assuming you've freed up your CPU via the BIOS correctly).
Logically, from a thermal envelope standpoint, there is bound to be more throttling of a GPU OC, because mid/high-end GPUs in general have a higher 'TDP' than CPUs, and it has to be kept within limits.
Hence you cannot exactly match the OC characteristics/capability of CPUs and GPUs, but only compare them to an extent on the basis of computational load.
#10
Aquinus
Resident Wat-man
by: Mussels
Yeah, but ATI have four power states total (one being 'off' for secondary cards in CrossFire, ULPS).


This adds dozens, so now there's more than just a 2D or 3D state. In a low-demand game with Vsync on or an FPS cap, the card simply won't use the same amount of power or produce the same heat and noise.
I just crossfired my 6870 and I noticed this. Pretty fancy feature. Like I said before, this can all be implemented at the driver level; it has nothing to do with changes to the architecture.
#11
Mussels
Moderprator
by: Aquinus
I just crossfired my 6870 and I noticed this. Pretty fancy feature. Like I said before, this can all be implemented at the driver level; it has nothing to do with changes to the architecture.
Well, part of it has to be hardware, to be able to support the states correctly in the first place.

The rest is in the firmware, I guess, because each card behaves differently, much like CPUs, so the profiles must be set in the firmware on each card (and could be modified by the end user later).
#12
Aquinus
Resident Wat-man
by: Mussels
Well, part of it has to be hardware, to be able to support the states correctly in the first place.

The rest is in the firmware, I guess, because each card behaves differently, much like CPUs, so the profiles must be set in the firmware on each card (and could be modified by the end user later).
A lot of video cards support altering voltage and clocks at the software level now. Yes, firmware changes would be required if the hardware manages this, but if the software does, the technology is already there via the I2C bus, at least on 6800 and 6900 series cards.
#13
Mussels
Moderprator
by: Aquinus
A lot of video cards support altering voltage and clocks at the software level now. Yes, firmware changes would be required if the hardware manages this, but if the software does, the technology is already there via the I2C bus, at least on 6800 and 6900 series cards.
But they can't do it dynamically with so many increments.


This isn't off/2D/3D/OC; it's going to have ten steps in between each of those.

Let's say you fire up Team Fortress 2 and leave Vsync on: the card might only clock up to 30% of its total power to do so, and it will sit there, saving you power, heat, and noise.


So instead of running at 100% clocks and 30% load, it could now run at 30% clocks at 100% load, if that makes sense to you.
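The arithmetic behind that trade-off can be sketched with the usual first-order model, where dynamic power scales roughly with V² × f and lower clocks permit lower voltage. The wattage and voltage figures below are hypothetical illustrations, not measured values:

```python
# First-order model of GPU dynamic power: P ~ V^2 * f. Lower clocks allow
# lower voltage, which is why "30% clocks at 100% load" can draw far less
# power than "100% clocks at 30% load". All figures are hypothetical.

def dynamic_power(clock_fraction, voltage, full_power_w=195.0, full_voltage=1.1):
    """Approximate dynamic power, normalized so full clock/voltage = full_power_w."""
    return full_power_w * (voltage / full_voltage) ** 2 * clock_fraction

full_speed = dynamic_power(1.0, 1.10)   # 100% clocks at full voltage: 195 W
downclocked = dynamic_power(0.3, 0.85)  # 30% clocks at a reduced voltage: ~35 W
```

The voltage reduction is what makes the savings better than linear: 30% of the clock at a lower voltage ends up well under 30% of the full-speed power in this model.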
#14
Aquinus
Resident Wat-man
by: Mussels
But they can't do it dynamically with so many increments.


This isn't off/2D/3D/OC; it's going to have ten steps in between each of those.

Let's say you fire up Team Fortress 2 and leave Vsync on: the card might only clock up to 30% of its total power to do so, and it will sit there, saving you power, heat, and noise.


So instead of running at 100% clocks and 30% load, it could now run at 30% clocks at 100% load, if that makes sense to you.
Oh, I do understand. I'm just saying that software has direct control of voltages, core clocks, memory clocks, and GPU load. This can be done dynamically in software; that is my point. You don't need power "states" to dynamically adjust clocks and voltages, as long as you have access to the I2C bus. It's a matter of how well and how quickly it can be done, because doing things in hardware will always be faster than in software. I'm not disagreeing with you, I'm just saying it isn't necessary and what is needed is practically already there.
#15
Steevo
How could Nvidia spin a high TDP that prevents their card from exceeding the competition?

Simple: lower the clocks and introduce the clocks-performance level they were aiming for as a new "feature".


This is NOT like turbo, as turbo slows unused cores down to maintain the same power envelope; it doesn't speed all cores up. What Intel and AMD would call an overall faster chip is a new product name, not a feature.
#16
EarthDog
by: cadaveca
I do not understand the purpose of this. The way it is presented suggests to me that nVidia had another Fermi on their hands, and the card cannot handle high clocks all the time without having issues. This seems the opposite of power saving to me, as lowering the clocks under lower load would lead to higher GPU utilization, which just doesn't make sense.


It's like if they let the card run high FPS, it can pull too much current? I mean, there's no point in running 300 FPS in Unreal or Quake 4, and in these apps, a slower GPU would still give reasonable framerates when downclocked. So they are saving power by limiting FPS?

I HAZ CONFUUZ!!!
Correct me if I am wrong here, but wouldn't 300 MHz @ 0.8 V use less power than 850 MHz @ 1.1 V? It still takes 1.1 V at 850 MHz regardless of % load, no? I do not know enough about this stuff.

I'm just as confused. However, per yet another news bit, the TDP is 195 W, substantially lower than the 7970's, and it's faster?
PHK got the first benchmark numbers in for 3DMark11 confirmed GTX 680 will be faster than Radeon HD 7970.

680-> X3200~3300
(7970)-> X2700~2800
670Ti -> X2500~2600
(7950) -> X2200~2300

PHK also mentioned that GTX 680 has max TDP 195W and fan will have low noise.
It was also mentioned/rumored that the turbo is only 7% over stock clocks. I mean, that nets you maybe a couple of FPS, usually not the difference between playable and not, ya know? I'm leaning toward pointless myself.
#17
cadaveca
My name is Dave
by: EarthDog
Correct me if I am wrong here, but wouldn't 300 MHz @ 0.8 V use less power than 850 MHz @ 1.1 V? It still takes 1.1 V at 850 MHz regardless of % load, no? I do not know enough about this stuff.
I was thinking that running 500 FPS in Quake 3 would potentially pull more power than the 120 FPS or so max that any monitor can display. So it makes sense to limit the GPU speed to lower power consumption in that instance.

But to increase speed while under full load? Why not just have that full power available when needed, and just worry about the lower-load scenarios?

Admittedly I know probably less than you do, as GPUs really aren't my thing, so I was serious in that I am confused about this, and need more info, or a different way to explain why they are doing it this way.

And if anything, this relates to TSMC having issues...AMD and Nv seemingly have just chosen to deal with it differently. We know that the current 7-series cards have HUGE OC potential...to me it doesn't make sense that they didn't release those cards @ 1000 MHz and let the OEMs have 1100-1125 for OC editions...

I'm not saying what nV is doing is wrong, but that it's weird, and curious, and I'd like to know more.
#18
Aquinus
Resident Wat-man
by: cadaveca
I was thinking that running 500 FPS in Quake 3 would potentially pull more power than the 120 FPS or so max that any monitor can display. So it makes sense to limit the GPU speed to lower power consumption in that instance.

But to increase speed while under full load? Why not just have that full power available when needed, and just worry about the lower-load scenarios?

Admittedly I know probably less than you do, as GPUs really aren't my thing, so I was serious in that I am confused about this, and need more info, or a different way to explain why they are doing it this way.

And if anything, this relates to TSMC having issues...AMD and Nv seemingly have just chosen to deal with it differently. We know that the current 7-series cards have HUGE OC potential...to me it doesn't make sense that they didn't release those cards @ 1000 MHz and let the OEMs have 1100-1125 for OC editions...

I'm not saying what nV is doing is wrong, but that it's weird, and curious, and I'd like to know more.
I think AMD is clocking their hardware just enough to keep it that much over nVidia's current line-up. It would make sense, even more so with the power consumption of the 7900-series GPUs at stock speeds.
#20
Steevo
by: cadaveca
I was thinking that running 500 FPS in Quake 3 would potentially pull more power than the 120 FPS or so max that any monitor can display. So it makes sense to limit the GPU speed to lower power consumption in that instance.

But to increase speed while under full load? Why not just have that full power available when needed, and just worry about the lower-load scenarios?

Admittedly I know probably less than you do, as GPUs really aren't my thing, so I was serious in that I am confused about this, and need more info, or a different way to explain why they are doing it this way.

And if anything, this relates to TSMC having issues...AMD and Nv seemingly have just chosen to deal with it differently. We know that the current 7-series cards have HUGE OC potential...to me it doesn't make sense that they didn't release those cards @ 1000 MHz and let the OEMs have 1100-1125 for OC editions...

I'm not saying what nV is doing is wrong, but that it's weird, and curious, and I'd like to know more.
More than likely, early adoption of the new process helped AMD as it did ATI in many cases (5770, anyone?), and the extras, like more power interconnects per layer making a larger chip overall, mean more stable power delivery and thus the ability to run a lower overall voltage and gain a larger yield from the chips produced.

Nvidia has historically failed to allow for much manufacturing error, meaning a substantially lower yield on new processes, and clock/heat issues resulting from it.


If you have a voltage drop of 0.2 V in the core and a targeted speed of 1 GHz at 1.2 Vcore, you then have to run 1.4 Vcore to achieve your target numbers, but at a hugely increased thermal load. I am guessing this is Nvidia's problem with this chip, but it is also why they have been able to beat AMD/ATI in performance per mm². How many threads about dead 8800s do we have due to heat issues? Lower the voltage and heat output, and your competitive advantage dies when your clocks fall and your yields suffer.
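The quick arithmetic behind that droop argument, assuming dynamic power scales with V² at a fixed clock, and using the post's own hypothetical figures:

```python
# If a 0.2 V supply droop forces Vcore from 1.2 V up to 1.4 V to hold the
# same 1 GHz target, dynamic power (~ V^2 at a fixed clock) rises by
# (1.4 / 1.2)^2, i.e. roughly 36% more heat for the same performance.

def droop_power_penalty(v_target=1.2, v_droop=0.2):
    """Relative dynamic power after compensating a supply droop."""
    v_compensated = v_target + v_droop
    return (v_compensated / v_target) ** 2

penalty = droop_power_penalty()  # ~1.36x
```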
#21
Aquinus
Resident Wat-man
by: Steevo
More than likely, early adoption of the new process helped AMD as it did ATI in many cases (5770, anyone?), and the extras, like more power interconnects per layer making a larger chip overall, mean more stable power delivery and thus the ability to run a lower overall voltage and gain a larger yield from the chips produced.

Nvidia has historically failed to allow for much manufacturing error, meaning a substantially lower yield on new processes, and clock/heat issues resulting from it.


If you have a voltage drop of 0.2 V in the core and a targeted speed of 1 GHz at 1.2 Vcore, you then have to run 1.4 Vcore to achieve your target numbers, but at a hugely increased thermal load. I am guessing this is Nvidia's problem with this chip, but it is also why they have been able to beat AMD/ATI in performance per mm². How many threads about dead 8800s do we have due to heat issues? Lower the voltage and heat output, and your competitive advantage dies when your clocks fall and your yields suffer.
Touché, friend. Your post has been one of the most sensible I've read in a while. nVidia cards do seem to have a lot of heat issues. I've never had an ATi/AMD video card fail on me, but I have lost a GeForce 7900 GT to the jaws of death (VRAM death, that is). For the power that AMD chips use, they're efficient and they run well, and even AMD's Llano chips are proof that you can keep power usage low enough and still get a reasonable amount of performance out of an APU. You don't cram more shaders onto your GPU by using more power; you make them more efficient and smaller and then cram more of them on, so when you do overclock, a little extra goes a long way.

With all this said, though, I think Kepler is going to be screaming fast, but how much higher does your electricity bill have to be to gain that performance, and is it worth using your computer as a mini space heater?
#22
xenocide
The rumors have said it will use substantially less power, and with the Dynamic Clocks it could be even less than that. I think there should be little doubt that a GTX 680 will be faster than an HD 7970, but the real issues are cost, power consumption, and noise. Nvidia has everything to gain by sticking to their guns and selling this as a mid-range GPU; their sales would skyrocket. But knowing how American economics work (see: crazy), it will probably be priced competitively and sold for a little more than the HD 7970.
#23
Steevo
Everything so far, except that in a heavily NV-leaning game it is only 10% faster while having the same power draw, a smaller die, just as large a cooler, and a new "feature" that has yet to be proven beneficial, is logs in the toilet.
If it were so much cooler, they would have used a single-slot cooler.
If it were so much faster, they would be shouting it from the rooftops.
If it were available........but it's not.

So here we are: speculation about a feature that may help fix a problem no one had or cared about, or just some media spin from marketing to generate some green fog in our brains.
#24
Aquinus
Resident Wat-man
by: Steevo
Everything so far, except that in a heavily NV-leaning game it is only 10% faster while having the same power draw, a smaller die, just as large a cooler, and a new "feature" that has yet to be proven beneficial, is logs in the toilet.
If it were so much cooler, they would have used a single-slot cooler.
If it were so much faster, they would be shouting it from the rooftops.
If it were available........but it's not.

So here we are: speculation about a feature that may help fix a problem no one had or cared about, or just some media spin from marketing to generate some green fog in our brains.
Hopefully we will find out later this month.
#25
xenocide
by: Steevo
Everything so far, except that in a heavily NV-leaning game it is only 10% faster while having the same power draw, a smaller die, just as large a cooler, and a new "feature" that has yet to be proven beneficial, is logs in the toilet.
If it were so much cooler, they would have used a single-slot cooler.
If it were so much faster, they would be shouting it from the rooftops.
If it were available........but it's not.

So here we are: speculation about a feature that may help fix a problem no one had or cared about, or just some media spin from marketing to generate some green fog in our brains.
Where did you hear that performance rating? I saw someone say it was 5-10% faster than the HD7970, which is definitely not a "heavily NV game".

Every time people bring up Nvidia cards, they feel obligated to describe them as monstrous, hot, power-plant-fueled alternatives to AMD's offerings. Clearly people care about power consumption, and heat is always an issue, so I have no idea what you're getting at there.

As for the possibility of a Single-Slot Cooler, who knows, but I'm well adjusted to not needing one. Usually only one or two AMD cards are specially designed by manufacturers to support single slot coolers, and those run pretty damn hot. It's just not cost-efficient these days. I don't see people complaining when AMD does it.

You're also forgetting the kicker: this was intended to be Nvidia's mid-range chip. They may not price it accordingly, but that was the original intent.