Dynamic Overclocking Introduction

NVIDIA's GeForce GTX 680 introduces the biggest overclocking change since the introduction of separate 2D/3D clocks.
Until now, every graphics card shipped with a fixed clock frequency that was active during typical 3D rendering. There were exceptions for when the card overheated or an application ran at extremely low load (like the Windows 7 Aero desktop), but otherwise 3D clocks were static and never changed.
Now NVIDIA is introducing a new concept called "Boost Clocks". These enable the GPU to run at increased frequency depending on a number of input factors that are used to determine whether it's a good idea to run at higher clocks or not. According to NVIDIA, their algorithm takes into account board power consumption, temperature, GPU load, memory load and more. With this information, it should be possible for the driver to make a decision whether a certain clock state is safe in terms of power usage and heat output. Obviously we don't want cards burning up due to automatic overclocking.
Instead of talking about GPU clock, memory clock, and shader clock (which is gone on Kepler), we will now talk about base clock, boost clock, and memory clock.
- Base clock is the minimum guaranteed clock speed the card will run at in all applications that are not stress tests (Furmark, OCCT). For the GeForce GTX 680 this is set to 1006 MHz.
- Boost clock is the average clock frequency the GPU will run at under load during typical gaming. This is set to 1058 MHz for the GTX 680. Please note that the actual clock frequency will often exceed the boost clock specification of 1058 MHz.
- Memory clock is unchanged and works exactly as before, representing the speed at which the memory chips run.
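The clock-selection idea described above can be sketched in a few lines. This is a minimal illustrative model only, not NVIDIA's actual algorithm: the function names, the feedback behavior, and the 13 MHz step granularity are assumptions for the sake of the example.

```python
# Hypothetical sketch of boost-clock selection: start at the highest
# boost bin and back off while power or temperature is out of bounds.
# All names, thresholds, and step sizes are illustrative assumptions.

BASE_CLOCK = 1006   # MHz, guaranteed minimum in non-stress applications
BOOST_STEP = 13     # MHz, assumed bin granularity

def pick_boost_clock(board_power_w, power_limit_w, gpu_temp_c,
                     max_clock=1110):
    """Return the highest clock this toy model considers safe."""
    clock = max_clock
    while clock > BASE_CLOCK and (board_power_w > power_limit_w
                                  or gpu_temp_c > 95):
        clock -= BOOST_STEP
        # In reality lowering the clock also lowers power draw;
        # we crudely model that here so the loop can converge.
        board_power_w *= 0.97
    return clock
```

The key property the sketch captures is that the base clock acts as a floor: no matter how unfavorable the inputs, the model never drops below 1006 MHz (thermal protection, covered later, is a separate mechanism).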
As many overclockers know, increasing voltage increases overclocking potential. NVIDIA uses this to ensure that the GTX 680 will be stable over the whole range of (stock) GPU frequencies, which also increases yields as even GPUs that wouldn't normally make the clock qualification can now be used - at increased voltage.
For the following graphs we used a run of our benchmarking suite at 1920x1200 (to reduce time). GPU-Z was used to log both the GPU frequency and GPU voltage.
The following graph shows the result of this test at NVIDIA stock settings of 1006 MHz base clock and 1058 MHz boost clock. I gave each data point some transparency, so rarely used combinations are less visible, which helps guide the eye.
As you can see there is a clear correlation between GPU clock and voltage. When a higher GPU clock is selected by the driver, it will also pick an increased voltage to ensure maximum stability.
We can also observe that the typical range of clock speeds is between 1006 MHz and 1110 MHz, with voltages ranging from 1.075 V to 1.175 V. The voltage range will vary from card to card, as Kepler uses a dynamic VID algorithm that selects a base voltage based on the manufacturing properties of the GPU. On Fermi cards this ASIC Quality value can be read using GPU-Z (not yet supported on Kepler).
Please note that any increase in voltage leads to increased power consumption and heat output, which is why NVIDIA's clock algorithm must monitor power and temperature on a constant basis and, if necessary, lower clock speeds again.
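To get a feel for why voltage matters so much here: dynamic CMOS power scales roughly with frequency times voltage squared. Using the clock/voltage endpoints from our logs above as inputs (the formula itself is a general rule of thumb, not an NVIDIA specification):

```python
# Rough first-order model: dynamic power ~ f * V^2.
# Baseline is the stock 1006 MHz / 1.075 V point from our logs.

def relative_power(freq_mhz, volt, base_freq=1006.0, base_volt=1.075):
    return (freq_mhz / base_freq) * (volt / base_volt) ** 2

# Top of the observed range: 1110 MHz at 1.175 V
print(f"{relative_power(1110, 1.175):.2f}x")  # prints 1.32x
```

So in this rough model, the roughly 10% clock increase at the top of the boost range costs about 32% more dynamic power, which makes it clear why the driver has to keep an eye on board power the whole time.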
In the following testing we will investigate how the system works and check out what additional manual overclocking will do for the card.
Power Limit

First introduced with NVIDIA's GeForce GTX 580, the power limiter is in full force again on the GTX 680. According to NVIDIA it cannot be disabled, and board partners must use it. To cater to overclockers, ASUS simply left the whole circuitry off their GTX 580 Matrix, for example; whether this will be possible again with the GTX 680 seems doubtful.
NVIDIA uses three INA219 power monitoring chips on the board, which provide the driver with power consumption numbers for the 12V PCI-Express slot and the two 6-pin PCI-E power connectors. The driver then decides whether the board is running below or above the power limit NVIDIA configured. Depending on the result of the measurement, the dynamic overclocking algorithm picks a new clock target roughly every 100 milliseconds (three frames at 30 FPS). AMD's power limiting system works slightly differently. It does not measure actual power consumption but relies on an estimate based on current GPU load (not just a percentage, but taking more factors into account). This makes the system completely deterministic, independent of any component tolerances, and cheaper to put on the board. AMD uses a hardware controller integrated into the GPU to update measurements and clocks many times per frame, at microsecond intervals, independent of any driver or software.
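NVIDIA's measured-power feedback loop can be sketched as follows. This is a simplified illustration of the behavior described above; the rail names, the adjustment step, and the fixed clock bounds are assumptions, not NVIDIA's implementation.

```python
# Toy model of the ~100 ms control loop: sum the three monitored
# power rails, compare against the configured limit, and nudge the
# clock target up or down one bin per tick.

import time

def control_loop(read_rail_watts, limit_w, clock_mhz, base_mhz=1006,
                 max_mhz=1110, step_mhz=13, interval_s=0.1, ticks=5):
    for _ in range(ticks):
        # The three rails the INA219 chips monitor (names assumed).
        total_w = sum(read_rail_watts(rail)
                      for rail in ("pcie_slot_12v", "pcie_6pin_1",
                                   "pcie_6pin_2"))
        if total_w > limit_w:
            clock_mhz = max(base_mhz, clock_mhz - step_mhz)
        else:
            clock_mhz = min(max_mhz, clock_mhz + step_mhz)
        time.sleep(interval_s)
    return clock_mhz
```

The contrast with AMD's scheme is visible in the structure: this loop runs in software on sensor readings every ~100 ms, whereas AMD's estimator lives in GPU hardware and reacts in microseconds.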
Each of these methods has its advantages and disadvantages. One thing that I like about NVIDIA's approach is that, with a bit of hardware modding, the sensors might be tricked into reporting lower power.
NVIDIA has set the adjustable range for the power limit to +32% up and -30% down. So far this is a hard limit; no software-based method to circumvent it has been found.
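In practice the slider just scales the board's default power target, hard-clamped to that -30%..+32% window. A quick sketch (the 195 W default value here is an assumption for illustration, not a confirmed board limit):

```python
# The exposed slider adjusts the default power target by a percentage,
# clamped by the driver to the -30%..+32% window described above.
# default_w=195 is an assumed value for illustration only.

def effective_power_limit(default_w=195, slider_pct=0):
    pct = max(-30, min(32, slider_pct))
    return default_w * (1 + pct / 100)

print(effective_power_limit(slider_pct=50))   # slider clamped to +32%
```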
I ran some tests with the power limiter configured at different levels to see how actual performance would change.
In the graph below, the red line shows the performance improvement (or reduction) of GTX 680 compared to the baseline value of 0% (black line). The green dotted line shows the average of the red line.
Adjusting the power limit up to its maximum yields 0.9% in real-life performance. Compare this to the -8.3% we see when setting the power limit as low as possible.
The conclusion of these tests is that NVIDIA picked a decent default power limit on their board which doesn't limit typical gaming by much. However, there is still a bit of headroom to maximize performance (again, without any manual overclocking, using just the default always-on dynamic overclocking).
Temperature

In the briefings, NVIDIA mentioned that GPU temperature is taken into account for dynamic overclocking. Wait, what? Will the card run slower when it's running hot? Yes, it does.
The following graph shows how changes in GPU temperature affect the selected clock. We tested this with a static scene that renders the same content each frame, resulting in constant GPU and memory load; varying load would otherwise affect this testing.
GPU clock is plotted on the left vertical axis using the blue MHz scale; temperature is plotted on the right vertical axis using the red °C scale. Time runs along the horizontal axis.
We see a clearly visible downward step pattern on the clock frequency curve as the temperature increases. This is not a gradual change; the steps happen at what look like predefined values of 70°C, 80°C, 85°C, and 95°C. Once temperature exceeds 98°C, thermal protection kicks in and the GPU clock drops like a rock to 536 MHz, and then even 176 MHz, trying to save the card from overheating.
Each step is 13.5 MHz in size, which results in a total clock difference of about 40 MHz between running below 70°C and running at 95°C - with the exact same rendering load, all handled transparently by the NVIDIA driver.
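The observed behavior amounts to a simple step function, which we can write down directly from the measurements in this section (the function itself is our reconstruction, not NVIDIA's code):

```python
# Temperature-dependent clock offset as observed in our testing:
# each threshold crossed costs one 13.5 MHz bin; beyond 98 C thermal
# protection takes over and drops the clock to 536 or 176 MHz.

THRESHOLDS_C = (70, 80, 85, 95)
STEP_MHZ = 13.5

def thermal_clock_offset(temp_c):
    if temp_c > 98:                      # thermal protection
        return "protect"                 # clock falls to 536 / 176 MHz
    steps = sum(1 for t in THRESHOLDS_C if temp_c >= t)
    return 0.0 - STEP_MHZ * steps

print(thermal_clock_offset(65))   # 0.0   -> full boost headroom
print(thermal_clock_offset(92))   # -40.5 -> about 40 MHz lost
```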
For end users this means that to maximize dynamic overclocking potential, they have to keep temperatures below 70°C; otherwise they will end up with up to 40 MHz less if their card runs above 90°C. Even users who don't care about manual overclocking have to consider this: the dynamic overclocking in the driver is always active and cannot be turned off.
Performance now being tied to temperature will pose an interesting challenge for system assemblers and case manufacturers, as they will have to focus even more on thermals while still trying to keep noise levels acceptable. How will reviewers test their cards? On an open bench? In a normal case? Or a worst case [sic]?
I ran some additional tests with the card's fan speed set to maximum, which results in much lower temperatures and directly increases performance (without any manual overclocking).
It looks like just setting the fan to 100% results in a 0.8% performance increase on average. Again, this is without any overclocking or other tweaking. 0.8% is not very significant, but it shows that there is now another variable to consider when trying to maximize performance. It also means that cases with really bad ventilation will suffer a (small) performance penalty when a GeForce GTX 680 is installed.