Difficulty finding stable undervolt/overclock of Vega 56

NovaProspekt · Jul 16, 2018

Hello everyone,

As the title states, I have been having a very hard time getting my card to run stable. I have the Gigabyte reference model Vega 56, which I have flashed to the Vega 64 BIOS. I have been scouring forums for months looking at Vega undervolting/overclocking guides, and pretty much everyone seems to be having great success.

My games crash to desktop even at the "stock" (Vega 64 BIOS default) Wattman settings. It seems like no matter how I go about trying to find a stable voltage/frequency combination I cannot get the card to run stable. I have been using GPU-Z to monitor the card and the core temp never exceeds 70C (though the "hot spot" temp often reaches 105C).

Here is the latest screenshot of what my Wattman settings look like. I have the P6 and P7 states set to the same values because I am trying the most recent guide advice I found, which is to set the P6 and P7 states to the P6 default values, then gradually decrease voltages by the same amount on both states together to find the optimum P6 undervolt, then increase P7 to the default values and start lowering P7 voltage. However, even with P6 and P7 set to the P6 defaults, Destiny 2 crashes to desktop within minutes. Have I just gotten really unlucky and ended up with a chip that barely passed AMD quality control?
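
For reference, the step-down part of that guide can be sketched roughly as below. This is only an illustration of the search logic: `find_lowest_stable_mv` and `is_stable` are hypothetical names, `is_stable` stands in for "play or stress test for a while and see whether it crashes", and nothing here talks to Wattman or the driver.

```python
from typing import Callable


def find_lowest_stable_mv(default_mv: int, floor_mv: int, step_mv: int,
                          is_stable: Callable[[int], bool]) -> int:
    """Step the voltage down from the default and return the last stable value.

    is_stable is a hypothetical callback: in practice it means applying the
    voltage by hand and testing with a real workload.
    """
    if not is_stable(default_mv):
        # If the default voltage already crashes, undervolting is not the cause.
        raise ValueError("unstable at the default voltage")
    best = default_mv
    mv = default_mv - step_mv
    while mv >= floor_mv and is_stable(mv):
        best = mv
        mv -= step_mv
    return best


if __name__ == "__main__":
    # Pretend the chip is stable down to 1040 mV (a made-up figure).
    fake_check = lambda mv: mv >= 1040
    print(find_lowest_stable_mv(1150, 900, 25, fake_check))
```

The early `ValueError` matches my situation: if the check fails at the default voltage, the search never starts, which suggests the instability comes from somewhere other than the undervolt.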

I am starting to wonder if the problem might be my CPU overclock (i5 3570K @ 4.4 GHz), but that's been running solid since 2012 so I'm pretty confident in it. I tried running some Prime95 tests just to reassure myself, but it's summer and there's no AC in my office so Prime95 quickly puts the CPU over 90 degrees C (temps monitored via CoreTemp), which I don't want to sustain. Under a real-world gaming load, the temps never get much over 60 degrees C.

Any advice would be very welcome.

Thanks

Probably the answer is to flash back to the stock Vega 56 BIOS and see if the stability problems go away, right?

HD64G · Jul 16, 2018

I agree that if you go back to the Vega 56 BIOS the card came with, it might get much more stable. The Vega 64 differs mainly in where its HBM2 comes from, so you might have trouble getting that to work easily. Some around here might know better though.

NovaProspekt · Jul 16, 2018

Well, I flashed back to the stock BIOS that came with the card. I've been playing for about 2 hours, and suddenly got a CTD. I had Wattman set to "custom", but I had not adjusted any settings. So, taking this into account would you say the most likely scenario is that my CPU overclock is to blame?

I guess the next step is to start backing off my CPU multiplier

INSTG8R · Jul 16, 2018

NovaProspekt said:
Well, I flashed back to the stock BIOS that came with the card. I've been playing for about 2 hours, and suddenly got a CTD. I had Wattman set to "custom", but I had not adjusted any settings. So, taking this into account would you say the most likely scenario is that my CPU overclock is to blame?

I guess the next step is to start backing off my CPU multiplier

One thing at a time. You’ll never know the actual issue if you're changing multiple variables.

NovaProspekt · Jul 16, 2018

Is it possible the card could be thermal throttling and shutting down at the stock voltage? Or is it a safe assumption that the card should be guaranteed to not crash at stock settings?

I guess my thought process is, if games are crashing with the core and memory clocks at stock settings, is it a safe assumption that the problem lies elsewhere?

mtcn77 · Jul 16, 2018

Don't think you will avoid damage by undervolting. That is just what I did. Less voltage = more current demand. You are only making it more resistive to a core temperature load, but it will eventually pull the same watt at a higher resistive state from the vrm. A higher voltage will make the core more runny and lessen the load on vrms since they won't have to generate more amperage.

INSTG8R · Jul 16, 2018

NovaProspekt said:
Is it possible the card could be thermal throttling and shutting down at the stock voltage? Or is it a safe assumption that the card should be guaranteed to not crash at stock settings?

I guess my thought process is, if games are crashing with the core and memory clocks at stock settings, is it a safe assumption that the problem lies elsewhere?

It should most definitely be fine at stock; if not, it’s definitely an issue. You should be doing some form of monitoring while using it to determine if it’s throttling. AMD at least made it easy. Ctrl+Shift+O will bring up the built-in overlay.

RatusNatus · Jul 16, 2018

First thing: Wattman is bad.
I think you should go back to the 64 BIOS, since it's the very same thing on a reference Vega card, and raise the minimum fan speed.

Without the 64 BIOS you can't OC the HBM, and you do have the very same Samsung HBM chip as the 64.
You need a higher fan speed because the HBM can't keep up above 70. And the memory runs hotter than the core, so set the max temp to 65 and see how it goes.

NovaProspekt · Jul 16, 2018

I was monitoring with GPU-Z the whole time. Core temp never exceeded 72 degrees C, but the core clock would fall down into the low 1400s when it heated up. It is my understanding that that is why people undervolt Vega, not to save power but to prevent the core from heating up and throttling the clock speed down. And I did have the max fan speed set to 3500 rpm when it crashed, which is pushing the limits of what I would consider tolerable noise-wise.

Ratus, I would rather be running the 64 BIOS, but I think I should stick with the stock BIOS until I figure out what is causing the instabilities.

RatusNatus · Jul 16, 2018

I'm a miner. I have 6 Vega 64s running at 100% 24/7, but with a lower core speed (1408) and an HBM OC, both undervolted. I'm running all 6 with the core at 1408 MHz, 1100 mV and the HBM at 1100 MHz, 900 mV.

72 is a no-go temp for the core. At this speed my mining setup goes down. My fan is at 3000 rpm minimum, not maximum. I think you should try MSI Afterburner and set the fan there. Forget about Wattman. It's crap software.
Any core temp above 70 will drop HBM HBCC... I don't know what this is, but it makes the software hang and sometimes Windows BSODs.

The 56 BIOS will limit your HBM frequency to around 950. Like I said, it's the same chip and will do 1100 fine with the 64 BIOS WITH an undervolt.

NovaProspekt · Jul 16, 2018

So you think my issues are caused by overheating and I just need to push the fan speed higher? I was hoping not to need to do that, 3500rpm is already pretty intrusive.

Maybe I’ll try maxing the fan speed and see if that improves stability

HD64G · Jul 16, 2018

Test your cpu-ram combo stability for now. 1 hour with prime95 is enough imho.

londiste · Jul 16, 2018

Remember that HBM temps are usually in the range of core temp +15C. And HBM throttles at 85C. When running HBM at higher clocks (say, Vega 64 clocks), even more so.
At least that used to be the way this worked around the launch.
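
As a rough sketch of what that rule of thumb implies (the +15C offset and the 85C throttle point are just the figures above, not measured values, and the function names are made up for illustration):

```python
HBM_OFFSET_C = 15     # rule of thumb: HBM runs about 15 C above the core
HBM_THROTTLE_C = 85   # HBM throttle point quoted above


def estimated_hbm_temp_c(core_temp_c: float) -> float:
    """Estimate HBM temperature from the reported core temperature."""
    return core_temp_c + HBM_OFFSET_C


def hbm_headroom_c(core_temp_c: float) -> float:
    """Degrees left before the estimated HBM temperature hits the throttle point."""
    return HBM_THROTTLE_C - estimated_hbm_temp_c(core_temp_c)


if __name__ == "__main__":
    for core in (60, 65, 70, 72):
        print(f"core {core} C -> HBM ~{estimated_hbm_temp_c(core)} C, "
              f"headroom {hbm_headroom_c(core)} C")
```

Under those assumptions the headroom reaches zero at a core temp of 70C, so a core reading of 72C would already put the HBM past its throttle point.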

NovaProspekt · Jul 16, 2018

Shoot so maybe I rolled back from the v64 BIOS for nothing....

INSTG8R · Jul 17, 2018

NovaProspekt said:
Shoot so maybe I rolled back from the v64 BIOS for nothing....

Again, sort one issue at a time. Once you know your CPU overclock is stable, then you can sort your GPU. Also, are you reinstalling drivers between all these changes? I know my PC freaked out just flipping the BIOS switch on my Vega.

Vario · Jul 17, 2018

Set your CPU stock until you figure out the card.

MrGenius · Jul 17, 2018

mtcn77 said:
Don't think you will avoid damage by undervolting...Less voltage = more current demand. You are only making it more resistive to a core temperature load, but it will eventually pull the same watt at a higher resistive state from the vrm. A higher voltage will make the core more runny and lessen the load on vrms since they won't have to generate more amperage.

All of that you're saying there = wrong. That is not how it works. Nice job making it sound like you know what you're talking about though.

Totally · Jul 17, 2018

@OP If you are undervolting there is absolutely no reason for the 64 bios. The point of the bios is for the higher limits, with undervolting you're going in the opposite direction.

MrGenius said:
All of that you're saying there = wrong. That is not how it works. Nice job making it sound like you know what you're talking about though.

It's like saying if you hit the brakes your car is going to accelerate, he's beyond just wrong. If voltage is decreasing, current is also proportionately decreasing. Been awhile since I've seen someone so confused about simple IVR. I guess, if resistance magically decreases with an increasing temp and it isn't a short, current can go up.
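
To put numbers on the simple V = IR case, here is a minimal sketch. The 0.005 ohm resistance is a made-up figure for illustration, and a real GPU is not a fixed resistor, so treat this as the textbook case only.

```python
def current_a(voltage_v: float, resistance_ohm: float) -> float:
    """Ohm's law: I = V / R."""
    return voltage_v / resistance_ohm


def power_w(voltage_v: float, resistance_ohm: float) -> float:
    """Power into a fixed resistance: P = V * I = V^2 / R."""
    return voltage_v * current_a(voltage_v, resistance_ohm)


R_OHM = 0.005  # hypothetical fixed load resistance, for illustration only

if __name__ == "__main__":
    for v in (1.2, 1.0):
        print(f"{v:.1f} V -> {current_a(v, R_OHM):.0f} A, {power_w(v, R_OHM):.0f} W")
```

With the resistance held fixed, dropping the voltage lowers the current in proportion and the power by the square of the voltage ratio.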

NovaProspekt · Jul 17, 2018

Thanks for all the suggestions, looking forward to giving it a try tomorrow

eidairaman1 · Jul 17, 2018

INSTG8R said:
It should most definitely be fine at stock; if not, it’s definitely an issue. You should be doing some form of monitoring while using it to determine if it’s throttling. AMD at least made it easy. Ctrl+Shift+O will bring up the built-in overlay.

Put everything back to stock for now, I say; if there are no crashes, then it was an OC he had.

mtcn77 · Jul 17, 2018

MrGenius said:
All of that you're saying there = wrong. That is not how it works. Nice job making it sound like you know what you're talking about though.

Why are they running the fans at 100%? Vega Undervolting.

NovaProspekt · Jul 17, 2018

Here is what I've done since yesterday:

The first thing I tried this morning was picking the default "Turbo" profile in Wattman just to see how the card would run at completely stock settings. It ran at a stable core clock in the low 1300s, which I found to be a bit disappointing since I see most people in forums running these at 1500+ for gaming. However, it was nice and quiet as the fan speed never exceeded 2500 rpm.

Next, I went back to the "Custom" profile. I left all the clock and voltage settings alone, but increased the power limit slider to +50%, set the target temp slider to 60 degrees C (I left the max at the default 85), and increased the max fan speed to 4900 rpm (the max Wattman will allow). I played some Destiny 2 at this point with GPU-Z open on my second monitor and saw that the core was maintaining about 1500-1520 with occasional dips into the upper 1400s. The fan was running at max. Core temp and HBM temps were right around 65-67 degrees C. But then I noticed that the value GPU-Z calls "core temp (hot spot)" was sometimes getting up to 108 degrees C. That seemed alarmingly high. I started searching and came to the Tom's Hardware article about undervolting Vega 64. They reached out to the developer of GPU-Z for clarification on how this "hot spot" value is measured and came to the conclusion that they are confident it is an accurate value. They also state that the capacitors used in the reference design Vega boards have a max operating temperature of 105 degrees C, and theorize that it is the "hot spot" temperature that triggers the card to throttle back core clock.

At this point I started manually decreasing core voltage on the P6 and P7 states in an attempt to get the hot spot temp down. I now have both states set at 1000 mV (down from the default P6 and P7 values of 1150 mV and 1200 mV, respectively). At this voltage, the card sustains a core clock of about 1450 MHz while gaming. Core temperature and HBM temperature stay between 55-60 degrees C, and the hot spot temperature now tops out in the mid to upper 80s. Power limit slider is still at +50%, fan speed at maximum. I have not touched any memory settings.
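
As a back-of-the-envelope check on why the temps dropped so much, here is a rough sketch using the common approximation that dynamic power scales with voltage squared times frequency. The assumptions are mine: that the card was actually sitting at the 1200 mV P7 voltage before, that 1510 MHz (the middle of the 1500-1520 range I saw) is a fair "before" clock, and that leakage power can be ignored. The function name is just for illustration.

```python
def relative_dynamic_power(v_new_mv: float, v_old_mv: float,
                           f_new_mhz: float, f_old_mhz: float) -> float:
    """Ratio P_new / P_old under the approximation P ~ C * V^2 * f.

    The switched capacitance C is the same chip in both cases, so it cancels.
    """
    return (v_new_mv / v_old_mv) ** 2 * (f_new_mhz / f_old_mhz)


if __name__ == "__main__":
    ratio = relative_dynamic_power(1000, 1200, 1450, 1510)
    print(f"estimated dynamic power ratio: {ratio:.2f}")
```

Under those assumptions the core would be drawing roughly two thirds of the dynamic power it did before, in exchange for about 60 MHz of clock speed.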

I have not experienced a crash since increasing max allowable fan speed to 4900 rpm, so maybe that was part of my problem. Also, I remember seeing hot spot temps above 100 degrees C in GPU-Z when I was running the Vega 64 BIOS too. I feel much better about 80s.

I think I will see how stability is with these settings for a little while, and then consider trying to push the frequencies a little higher while maintaining the 1000 mV voltage on the core.

One last little aside, I am now noticing that the card's memory frequency is staying at 800 MHz all the time, even after closing all programs and just idling at the desktop. The core clock comes down, and is currently 30 MHz as I type this, but the memory frequency is still 800 MHz as reported by both GPU-Z and Wattman. This persists even after a restart. With the Vega 64 BIOS, memory clock used to drop down into an idle state as well. I don't know if it's anything worth worrying about. Hopefully the BIOS didn't get corrupted during the flashing process.

TheoneandonlyMrK · Jul 17, 2018

First thing I noticed in post one is the target temp set to 55; the card will throttle to run at that if you're telling it to.
You have some say over noise by turning up the fan.
You have some say on what temp to run it at.
You have some say what speed it runs at.

But you can't lower all of them and push clocks up.
The card will try to run at the noise and temperature level you set above all else, including clocks.

Secondly, the card is fine up to 85, and 95 is its real top end before the hardware PROCHOT thermal throttle, so why are you trying for lower anyway?

Third, the hot spot is on the chip; any caps aren't likely to be at that temp or on the chip, or they will be solid caps and fine with it.

To me, the card's heatsink is possibly not seated well or, more likely, your case has a restrictive airflow issue and you're not giving the card enough air at ambient.

NovaProspekt · Jul 17, 2018

I don't think my case airflow should be a problem. I have 2 front intake fans, 2 side intake fans, 1 bottom intake fan, 2 top exhaust fans, 1 rear exhaust fan all running 100% all the time.

eidairaman1 · Jul 17, 2018

Take a picture of the inside of your case, and take a picture of your home thermostat as it reads right now.

Post them here...

This is starting to go around in circles, do you want help or not?
