Overclocked HBM? It's true, and it's fast

Status
Not open for further replies.
You are just speculating on all of that. Think about it for a minute. We actually have absolutely no idea why it may be performing lower than it could. We don't even know for sure that it isn't performing as well as it can, besides some stutters in some games, which may or may not mean it's running slower overall. :)
I do not believe this is speculation at all. I, for one, enjoy overclocking, so I am happy it is possible. Still, I don't think real-world gains are possible.

HBM may be new, and we don't know much about it. What we do know is that it is faster than GDDR5. A 290X had the same extremely high 512 GB/s bandwidth. The 980 Ti has 337 GB/s of bandwidth. For years, VRAM overclocking has had minimal real-world gains on GDDR5. Why would it provide any on HBM? It serves the same function. Fury X HBM VRAM is just as fast as the 290X's GDDR5. We are fairly certain the memory speed is not an issue. VRAM speeds are really only helpful at 4K at this time.
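
For reference, the Fury X and 980 Ti bandwidth figures being compared can be reproduced from bus width and memory clock. This is only a rough sketch: the 4096-bit HBM bus, the 384-bit GDDR5 bus on the 980 Ti and its 3505 MHz memory clock are published specifications assumed here, not numbers given in this thread, and `bandwidth_gb_s` is a made-up helper name.

```python
def bandwidth_gb_s(bus_width_bits: int, clock_mhz: float, transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in GB/s: bus width (bits) * transfers per second / 8 bits per byte."""
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bus_width_bits * transfers_per_second / 8 / 1e9


# Assumption: published bus widths, not stated anywhere in the thread.
FURY_X_BUS_BITS = 4096
GTX_980_TI_BUS_BITS = 384

# Fury X HBM at stock (500 MHz) and at the overclocks discussed in the thread.
for clock_mhz in (500, 550, 600):
    print(f"Fury X HBM @ {clock_mhz} MHz: {bandwidth_gb_s(FURY_X_BUS_BITS, clock_mhz):.1f} GB/s")

# Assumption: 3505 MHz double-data-rate clock, i.e. about 7 Gbps per pin.
print(f"GTX 980 Ti GDDR5: {bandwidth_gb_s(GTX_980_TI_BUS_BITS, 3505):.1f} GB/s")
```

Under those assumptions the stock HBM figure comes out at 512 GB/s and the 980 Ti at roughly 336 GB/s, close to the 337 GB/s quoted above. Peak bandwidth scales linearly with memory clock, so a 10% HBM overclock adds 10% theoretical bandwidth and nothing more.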

If Fury can perform better, it will come from drivers. The 7970 and the 290X had super drivers that increased performance by about 10-15% across the board a few months after release. Those were the Omega drivers. I have a suspicion that we will see AMD do this again.

On a hardware level, an educated guess would say that ROP count is the bottleneck holding Fury back. It has ~45% more stream processors than the 290X but the same 64 ROPs, and only a ~20% increase in performance. That is not great scaling. It may have needed more ROPs to utilize those 4096 stream processors. This is only speculation.
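
To put a number on "not great scaling", here is a quick sketch comparing the shader increase with the performance increase. The 2816 stream processor count for the 290X is its published spec and an assumption here (the post only says ~45%), and the ~20% performance gain is the poster's own estimate, not a measured value.

```python
def relative_gain(new: float, old: float) -> float:
    """Fractional increase of `new` over `old` (0.25 means +25%)."""
    return new / old - 1.0


FURY_X_SHADERS = 4096   # from the post
R9_290X_SHADERS = 2816  # assumption: published spec, not stated in the post
PERF_GAIN = 0.20        # the poster's rough estimate

shader_gain = relative_gain(FURY_X_SHADERS, R9_290X_SHADERS)

# Fraction of the extra shader resources that shows up as extra performance.
scaling_efficiency = PERF_GAIN / shader_gain

print(f"Shader increase:      {shader_gain:.1%}")
print(f"Performance increase: {PERF_GAIN:.1%}")
print(f"Scaling efficiency:   {scaling_efficiency:.0%}")
```

On those inputs, well under half of the added shader resources translate into extra performance. That fits the idea that something other than shader count is the limit, though it does not by itself show that the ROPs are that limit.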


The memory overclock doesn't help - it's already insanely fast for memory. Fury X doesn't need moar HBM. It needs faster core clocks.
Do you really think it is clock speed holding back the Fury X? The 980 Ti has a default clock of 1000 MHz and a boost clock of 1076 MHz. Fury X is clocked at 1050 MHz. Clock for clock they are pretty similar in performance. If Fury were clocked at 1076 MHz, that might make up for the 2% difference shown in W1zzard's reviews.
 
Do you know that the reference 980 Ti runs at 1200 MHz most of the time?
 
I wonder how fast it is in FS Extreme. We know it's pretty fast, but when overclocked to high heaven.. :p

I think they will release more optimized drivers for the Fury cards. Someone needs to do a driver comparison between the Fury X driver and the Fury/Nano drivers; it would be interesting to see the performance difference.

I also believe the current drivers are unoptimized for gaming below 4K; just look at the Omega 290X performance difference.

ryse_1080.gif


ryse_2160.gif
 
TSMC makes the GPU cores for both Nvidia and AMD, so it's just a case of AMD's design not being able to run higher clocks. Once voltage gets unlocked I expect some cards to be able to hit 1250 MHz on water at 1.35 V.
I doubt that highly. Every indication is telling us that the silicon just doesn't tolerate being run out of spec.
I think what we are going to start seeing is people burning their GPUs up left and right once they start fiddling with the voltage.
And yes, as per usual, the drivers are trash...
The Fury X was about AMD being first to market with on-package memory. It's a PR stunt more than anything; GCN still sucks, and their drivers are still a clusterFK of multiple code trees and regressions AHOY.
AMD will continue to slowly sink so long as they continue to operate the way they do.
 
You don't know anything regarding AMD drivers with that statement. There have been many Nvidia crashes you must have forgotten: you have the 590 basically burning up with the wrong driver, and you also forgot that Nvidia was the vendor with the most crashes on Vista, and that count includes the huge number of Intel IGPs! etc.

AMD/ATI drivers were buggy during the Windows 9x/ME days, but those operating systems are ancient and everybody sucked on those regardless of vendor :p

And if an AMD GPU driver crashed, it was mostly due to overclocking.

FYI: everyone has multiple driver trees; that's how they add new features.. look it up..
 
^To the guy above
Do you even know what we are talking about in this thread?

I'll run some benchmarks. As for clocks I'm pretty sure maxwell has a longer pipeline than gcn
I'm looking forward to your numbers :)
 
Also, most of the top 100 3DMark scores are made under LN2, with tessellation disabled and a modded video card.
C'mon :)
Judging by Hardware.info's own testing, their GTX 980 Ti review yielded 19,359 on air with the exact same system as used for the Fury X's 16,963 (their 980 Ti score is under the Fury X's screenshot, albeit with a broken link; I have provided the correct one). One of my local computer sales outlets (PB Tech) racked up 19,858 using a 5960X (water) + Gigabyte GTX 980 Ti G1 (air), basically the same system as Hardware.info used.

gigabyte-geforce-gtx-980-ti-g1-gaming-6gb.jpg


Nice that the HBM overclocks, but I'm still wondering why AMD discouraged reviewers from doing so.
 
@OP Those drivers aren't right, dude. I think you need to update them.

Aren't they the modded ones?



Try these:

http://support.amd.com/en-us/kb-articles/Pages/AMD-Radeon-300-Series.aspx



Did you get those from Priministor @ Guru? They have Shader 5.0 as the version number; it should be 5.1.

http://forums.guru3d.com/showthread.php?t=400078&page=7


If they are from Guru, you got shafted.

That Quemikry guy is right, PM doesn't have the correct driver. It should be cfxx32, which afaik is the D3D driver. That wouldn't surprise me at all.
I have absolutely no idea what you are trying to do here; maybe you are posting in the wrong thread?
 
500mhz
YqnkCuL.png

550mhz
AklOPAk.png


GTA5 using 4GB vmem
550mhz
Frames Per Second (Higher is better) Min, Max, Avg
Pass 0, 18.911589, 135.649765, 67.071381
Pass 1, 39.104492, 136.511185, 67.168938
Pass 2, 50.401340, 104.464287, 73.244118
Pass 3, 45.552242, 133.467422, 86.338333
Pass 4, 30.762289, 146.618347, 67.937256

500mhz
Frames Per Second (Higher is better) Min, Max, Avg
Pass 0, 19.770178, 134.201065, 67.623108
Pass 1, 32.177280, 81.928307, 66.564148
Pass 2, 39.716557, 104.432373, 70.212379
Pass 3, 51.638721, 141.080902, 88.367096
Pass 4, 25.761564, 156.650940, 67.926483

I don't really see any gain with the memory overclock; however, it does appear it could be increasing the minimum framerate.
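
A quick way to check that impression is to average the passes per memory clock. This is a minimal sketch using only the numbers posted above; the helper names are made up, and five passes per clock is a small sample, so treat the result as a hint rather than a measurement.

```python
from statistics import mean

# GTA5 benchmark output copied from the post: label, min, max, avg FPS per pass.
RESULTS = {
    550: """Pass 0, 18.911589, 135.649765, 67.071381
Pass 1, 39.104492, 136.511185, 67.168938
Pass 2, 50.401340, 104.464287, 73.244118
Pass 3, 45.552242, 133.467422, 86.338333
Pass 4, 30.762289, 146.618347, 67.937256""",
    500: """Pass 0, 19.770178, 134.201065, 67.623108
Pass 1, 32.177280, 81.928307, 66.564148
Pass 2, 39.716557, 104.432373, 70.212379
Pass 3, 51.638721, 141.080902, 88.367096
Pass 4, 25.761564, 156.650940, 67.926483""",
}


def parse(block: str) -> list[tuple[float, float, float]]:
    """Turn the comma-separated pass lines into (min, max, avg) tuples."""
    rows = []
    for line in block.strip().splitlines():
        _label, fps_min, fps_max, fps_avg = (field.strip() for field in line.split(","))
        rows.append((float(fps_min), float(fps_max), float(fps_avg)))
    return rows


def summarize(block: str) -> dict[str, float]:
    """Mean of the per-pass minimum and average FPS."""
    rows = parse(block)
    return {"min": mean(r[0] for r in rows), "avg": mean(r[2] for r in rows)}


summary = {clock: summarize(block) for clock, block in RESULTS.items()}

for metric in ("min", "avg"):
    stock, oc = summary[500][metric], summary[550][metric]
    print(f"mean {metric} FPS: 500 MHz {stock:.2f}, 550 MHz {oc:.2f} ({oc / stock - 1:+.1%})")
```

On these five passes the average FPS is within about 0.3% between the two clocks, while the mean of the per-pass minimums is about 9% higher at 550 MHz. The per-pass minimums move in both directions, so that is weak evidence at best.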
 
Maybe the card doesn't have memory timings set up for 550 MHz, and therefore it doesn't scale. Did you try to overclock the memory only, or both core and memory? For Fury X they sit on the same interposer, btw.

I hope there will be a 600 MHz bench to validate Hardware.info's results.
 
This is great news for people who play benchmarks. Which is nobody.

Show me a 20% increase in FPS in a modern non-Mantle game, and then I'll be impressed.
 
You missed the 555 MHz on HBM, didn't you? That is what we have been talking about in this thread, not the core.
No, I just don't think it gives enough improvement to warrant increasing the clocks of the HBM, when higher core clocks will improve benchmark scores more, at least for now until new drivers come out.
 
The fact is Hardware.info seems to have the best graphics score in FS for a single Fury X so far. Other sites' results didn't even come close to their numbers, mostly around 20% behind. The important info here is that their stock score is around the same as that of sites that used the same CPU.
 
It's not about FuryX but... Impressed enough? 30% FPS boost in Project Cars
[VIDEO]
Nvidia will also get an increase, so what's your point again?

Those scores don't reflect real world performance :rolleyes:
 
Well, on my HD 7950, the GPU-Z readings for GPU and memory sometimes jump to insane clocks, like 100000 MHz, and then I also get like 4000 GB/s memory bandwidth XD

For me it tells me I have an HD 7970! :P
 
HBM is a new technology/architecture, and I think newer games, or perhaps older ones too, are not HBM-aware. Did anyone test the mining performance of Fury X? Maybe I'm wrong in posing this question: is the AMD Fury X bottlenecking even Intel CPUs because of its monstrous bandwidth?
 
This is what I expect from a memory overclock. Anyone who has done quite a bit of overclocking will know that memory overclocking gives you small gains; it's all about the core. (ahh so many "overclocks") ;)

HBM-aware, lol, man does that sound familiar, remember Bulldozer? Remember how people were saying Windows wasn't aware of the new architecture and they patched it? Remember how it then beat Intel? Nope. Sorry.

HBM-aware is not a thing. If you say current games have no need for the massive increase in memory bandwidth and in turn are not receiving significant gains from HBM, then I would agree that it's a definite possibility.

AMD needs to work on their core even more.
 
Overclocking memory isn't like the core, where you see a performance gain with any difference in clock. Most BIOSes have specific memory timing profiles, each tied to specific memory clocks. If you clock too far from those clocks you get bad delays, which nullify the performance gain. Memory overclocking is all about finding the highest sweet spot possible.

So in this case 550 MHz isn't the sweet spot for the HBM on @v12dock's card. Unfortunately he hasn't found a stable number higher than that. However, 600 MHz may still be a sweet spot with a significant boost in performance.
 
This is simply not true. Tweaking timings will give you a performance boost by reducing latency, but the boost is even smaller than the one from the memory overclock. There may be a sweet spot like you say, but it's not going to give you any miracle performance numbers; the gains will be minute and insignificant. You realize we're talking nanoseconds? Delay, really? I am not sure where you get your information, but this delay is not a big issue anymore and certainly isn't for extremely low latency HBM stacks.. I would like to know how much of this you have experienced by doing overclocking yourself..

I want you to realize that when you overclock the core you get a direct increase on each and every one of the shader cores (stream processors), because the core clock is linked 1:1 with the clock of the shaders. It is the same on Maxwell and Kepler, with Fermi having a 1:2 core/shader clock. So on the Fury X, with its 4096 cores, you will get a combined 409.6 GHz increase of calculation power if you can manage a 100 MHz overclock on the core.
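
The 409.6 GHz figure above is just stream processors times the clock bump. Here is a small sketch of that arithmetic, plus the same overclock expressed as a relative gain and as theoretical FP32 throughput. The 2 FLOPs per stream processor per clock (one fused multiply-add) is an assumption about how peak throughput is usually quoted, and the 1050 MHz stock clock is the one mentioned earlier in the thread.

```python
STREAM_PROCESSORS = 4096
STOCK_CLOCK_MHZ = 1050      # Fury X core clock quoted earlier in the thread
OVERCLOCK_MHZ = 100         # the hypothetical core overclock from the post
FLOPS_PER_SP_PER_CLOCK = 2  # assumption: one fused multiply-add per clock


def peak_tflops(clock_mhz: float) -> float:
    """Theoretical FP32 throughput in TFLOPS at a given core clock."""
    return STREAM_PROCESSORS * FLOPS_PER_SP_PER_CLOCK * clock_mhz * 1e6 / 1e12


# "Combined" clock increase summed over every stream processor, in GHz.
combined_ghz = STREAM_PROCESSORS * OVERCLOCK_MHZ / 1000

# The same overclock as a fraction of the stock clock.
relative_gain = OVERCLOCK_MHZ / STOCK_CLOCK_MHZ

print(f"Combined clock increase: {combined_ghz:.1f} GHz")
print(f"Relative core clock gain: {relative_gain:.1%}")
print(f"Peak FP32: {peak_tflops(STOCK_CLOCK_MHZ):.2f} -> "
      f"{peak_tflops(STOCK_CLOCK_MHZ + OVERCLOCK_MHZ):.2f} TFLOPS")
```

The summed GHz number looks big, but the best-case gain in shader throughput is the same as the relative clock gain, a bit under 10% here.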

Like I said it's all about the core. With HBM giving the Fury X massive VRAM bandwidth an overclock/timing tweak will give you very little performance, not enough to justify overclocking a brand new technology that may not respond well to it. The VRAM is not the bottleneck.

To give a car analogy: if you have a V10 in a Civic and you want to go faster, you don't swap in a W16 engine or add twin turbos to the V10. You improve the area that's already at its max: transmission, tires, drivetrain, etc..
 
I agree with your car analogy. However, are you sure you know how the Fiji GPU uses HBM, which lies on the same interposer? This architecture is unprecedented, and I doubt that anyone in this thread fully knows how it works.

Could you please try to explain that 19321 graphics score in FS, when the overclock was 1145/600? FYI, the graphics score for 1145/500 is only around 16k.
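
For what it's worth, the two scores quoted above can be compared against the size of the memory overclock. This is a rough sketch that treats the "around 16k" figure as exactly 16000, which is an assumption, so the percentages are approximate.

```python
SCORE_MEM_500 = 16000  # assumption: "around 16k" at 1145/500 taken as exactly 16000
SCORE_MEM_600 = 19321  # Hardware.info graphics score at 1145/600, as quoted

MEM_CLOCK_STOCK_MHZ = 500
MEM_CLOCK_OC_MHZ = 600

# Fractional gains in score and in memory clock (core clock is 1145 MHz in both runs).
score_gain = SCORE_MEM_600 / SCORE_MEM_500 - 1
mem_clock_gain = MEM_CLOCK_OC_MHZ / MEM_CLOCK_STOCK_MHZ - 1

# 1.0 would mean the score scales perfectly linearly with memory clock.
scaling_ratio = score_gain / mem_clock_gain

print(f"Score gain:        {score_gain:.1%}")
print(f"Memory clock gain: {mem_clock_gain:.1%}")
print(f"Score gain per unit of memory clock gain: {scaling_ratio:.2f}")
```

On those numbers the graphics score rises by roughly as much as the memory clock (about 21% for a 20% overclock), which is the result the thread is asking to have explained or independently reproduced.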
 