
Hard resets playing games

flashfrenzy

New Member
Joined
Mar 22, 2021
Messages
19 (0.01/day)
I finished building a new system this past week, and twice now I've had hard resets while playing Shadow of the Tomb Raider. The first came after an hour or two, the second within ~15 min. The second time I had both hwinfo and gpu-z logging running. Everything seems fine in the logs (links below), but maybe someone can see something I don't.

I don't think the restarts are thermal related. The CPU was at around 50C at the time of the restart. In games it sits around 50-60C, and it idles in the low 30s.

Since the restarts I completely cleared the CMOS and went 100% stock (no XMP). I haven't seen any restarts since then, but I've been running prime95 tests and always run into errors within 2 hours. The CPU does get very hot in those tests (up to 90C), but I'm not sure whether that is typical for prime95, since my normal load temps seem fine. I've also run memtest86 with XMP enabled and FCLK at 1800: 3 passes (3 and a half hours) with no errors, so I don't suspect the RAM at this point. For my last P95 runs I had the case open with a giant fan blowing in. It didn't really seem to affect anything: GPU temps were way down (irrelevant, I know), but the CPU was unaffected.

I'm kind of at a loss as to what to look for or do at this point. Any suggestions/advice would be greatly appreciated.

Component rundown: 5800x, ASUS b550-f mobo, EVGA RTX3060, Arctic Freezer 280 AIO, Super Flower Leadex III 850W, 32 GB Crucial Ballistix CL16 3600

BIOS and drivers are 100% up to date.

hwinfo logs: https://docs.google.com/spreadsheets/d/1ggHi0jCAs5BACIKK_BJT7cCELAP9mVAvYHBBip8FtSk/edit?usp=sharing
gpu-z logs: https://docs.google.com/spreadsheets/d/1Fc2VbCaYCh_FgW8JDO1Q9o771ZlAXU5dhhazVmOR5_Y/edit?usp=sharing
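
For anyone who wants to go through logs like these offline, here is a minimal Python sketch that summarizes an exported sensor log in CSV form: the peak value and the last few readings before the log stops. The column names ("CPU Temp [C]", "+12V [V]") and the sample rows are made up for illustration; real HWiNFO and GPU-Z exports use different, system-specific column names.

```python
import csv
import io


def summarize_log(csv_text, columns, tail=5):
    """Return, per column, the max value and the last `tail` readings."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in columns:
        values = []
        for row in rows:
            try:
                values.append(float(row[name]))
            except (KeyError, TypeError, ValueError):
                # Skip missing columns, blank cells and non-numeric rows.
                continue
        if values:
            summary[name] = {"max": max(values), "last": values[-tail:]}
    return summary


# Hypothetical sample data, not taken from the linked logs.
SAMPLE = """Time,CPU Temp [C],+12V [V]
12:00:01,51.2,12.096
12:00:03,55.8,12.096
12:00:05,53.1,12.000
"""

if __name__ == "__main__":
    for name, stats in summarize_log(SAMPLE, ["CPU Temp [C]", "+12V [V]"]).items():
        print(name, stats)
```

The readings right before the log ends are the interesting part: a rail sagging or a temperature spiking in the last few samples would point in a direction, while flat values (as described above) do not rule anything out.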
 
Hard restarts may indicate a power issue.
Check Windows error logs. They may contain clues on what caused the shutdown.
Errors in prime95 mean that the CPU is making mistakes when processing instructions. This should be fixed, or the system probably won't work reliably. It could be caused by an incorrectly configured CPU/RAM/mainboard or by a CPU/RAM/mainboard fault.
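
To see which worker failed and when, the Prime95 output can be scanned for error lines. Below is a rough Python sketch. The error phrases and the sample lines are assumptions based on typical torture-test output, so check them against your own results file, since the wording can differ between versions.

```python
import re

# Phrases assumed to mark a failed torture-test worker (not an official list).
ERROR_PATTERNS = [
    re.compile(r"FATAL ERROR", re.IGNORECASE),
    re.compile(r"Hardware failure detected", re.IGNORECASE),
]


def find_errors(lines):
    """Return (line_number, text) for every line matching an error pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in ERROR_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample lines for illustration only.
SAMPLE = [
    "Self-test 448K passed!",
    "FATAL ERROR: Rounding was 0.5, expected less than 0.4",
    "Hardware failure detected, consult stress.txt file.",
]

if __name__ == "__main__":
    for number, text in find_errors(SAMPLE):
        print(f"line {number}: {text}")
```

To use it on a real run, pass the lines of the results file instead of SAMPLE, for example `find_errors(open(path).read().splitlines())`, where `path` is wherever your Prime95 install writes its results.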
 
Thanks, yeah there's absolutely nothing in the windows event viewer, unfortunately. I guess what other options are there aside from just RMA'ing components until it stops? I'm just not sure where to start. I guess RAM is the easiest but seems like the least likely source at this point.
 
Nothing in event viewer means that it can be a power issue, or something else. So this information did not really help to pinpoint the issue.
I would try other CPU stability testing software to see if you get errors in other software or just in prime95. I'm not sure which ones are good now. I guess OCCT may still be good.
You can also check if the CPU voltage and all other data points in an app like hwinfo64 look good.
I'm not at all familiar with the Ryzen platform so the above is just generic advice really. Someone who knows more about Ryzen can probably assist better.
 
I've been running prime95 tests and always run into errors within 2 hours.
That's a classic symptom of a manual CPU core OC with not enough Vcore.
 
32 GB Crucial Ballistix CL16 3600

Is that 2 sticks or one stick? Give us the exact part number. If they are 2x16 GB sticks they can be single or dual rank. Use Zen Timings: https://zentimings.protonrom.com/ and post a screenshot.

I've had hard resets with nothing in event log when I was running an unstable ram OC on my 3700x. Memtest86 wouldn't show a thing but running https://www.techpowerup.com/download/techpowerup-memtest64/ pretty much guaranteed a hard reset.
 
Here's the timings (with DOCP enabled). I unfortunately don't know enough about Ryzen memory overclocking yet to know what is normal and what isn't. These unfortunately are single rank, not dual rank as Crucial recently changed that for the 2x16GB sticks. Thanks for the tip on memtest64, I'll give that a shot.

ZenTimings_Screenshot_26941888.5416662.png
 
Your MCLK, FCLK and UCLK values should all be the same. I doubt that's what's causing it, but you should rectify that. It means that the board failed to boot with unified fabric & memory clocks. That will introduce unnecessary amounts of latency during gaming. Either that or it didn't attempt to do it in the first place (which I doubt as I have the same board without WiFi).

Try setting FCLK yourself manually to 1800 in the BIOS.
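
As a quick sanity check of the numbers: DDR memory transfers data twice per clock, so a 3600 MT/s kit runs at an 1800 MHz memory clock, and that is the value FCLK and UCLK should match. A small Python sketch of that arithmetic follows; the function names and the 1 MHz tolerance are my own choices, and the mismatched reading is a made-up example.

```python
def expected_mclk(transfer_rate_mts):
    """DDR transfers twice per clock, so MCLK is half the MT/s rating."""
    return transfer_rate_mts / 2


def clocks_unified(mclk, fclk, uclk, tolerance_mhz=1.0):
    """True when FCLK and UCLK both match MCLK within a small tolerance."""
    return abs(mclk - fclk) <= tolerance_mhz and abs(mclk - uclk) <= tolerance_mhz


if __name__ == "__main__":
    target = expected_mclk(3600)
    print("Target clock for DDR4-3600:", target, "MHz")
    # Hypothetical ZenTimings readings (MCLK, FCLK, UCLK) in MHz.
    print("1:1 mode:", clocks_unified(1800, 1800, 1800))
    print("Decoupled:", clocks_unified(1800, 1800, 900))
```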

I would also suggest setting:
- VDDG CCD to 1.050v
- VDDG IOD to 0.950v
- CLDO VDDP to 0.900v
- DRAM voltage to 1.35v if it isn't already set that way
(all of these are under Ai Tweaker)

... and disabling Global C-State control under Advanced --> AMD CBS (this may help with the resets, it helped with my random idle reboots but I'm not sure)

No worries, you won't break anything by doing this. It'll either fix the affected 3 clocks mentioned above or not fix anything at all, in which case you can easily reset them back.

For example, this is what mine looks like:

HCQE8Ow.png
 
Yes, I actually just noticed that myself. It was from me messing around in the BIOS yesterday and not properly setting everything back to default. Here's what it has been in the past when experiencing resets.
ZenTimings_Screenshot_26941910.7752197.png


And thanks for the other tips, I will give those a shot as well.
 
Yes, I actually just noticed that myself. It was from me messing around in the BIOS yesterday and not properly setting everything back to default. Here's what it has been in the past when experiencing resets.
We can rule that out then.

Still, looking at your SoC voltage, I would recommend you adjust the voltages as I posted above. The voltage on the VDDG CCD is too low, while it's too high on the VDDG IOD. VDDP is fine.

Also consider disabling C-States.
 
@flashfrenzy What BIOS version are you running on that board?
 
Yes, I actually just noticed that myself. It was from me messing around in the BIOS yesterday and not properly setting everything back to default. Here's what it has been in the past when experiencing resets.

And thanks for the other tips, I will give those a shot as well.

That VDDG IOD voltage looks way too high, could be causing issues. With RAM at 3600 you should be able to lower those voltages quite a bit. Something like:
CLDO VDDP: 0.800V
VDDG IOD: 0.850V
VDDG CCD: 0.900V (could run this higher if needed)
VSOC looks ok for those speeds.

@Alexa's values in the earlier post should work as well. You should also try to find ProcODT etc. settings for your Micron B-die sticks, which should help with stability.
 
I would not go down in the 0.800v range since that caused me audio stuttering issues. The voltages I inputted should be fine for 3600 MT/s RAM.
 
... and disabling Global C-State control under Advanced --> AMD CBS (this may help with the resets, it helped with my random idle reboots but I'm not sure)

No worries, you won't break anything by doing this. It'll either fix the affected 3 clocks mentioned above or not fix anything at all, in which case you can easily reset them back.
You shouldn't ever have to disable c-states or any of the power saving features. If one has issues with them enabled it's not the feature at fault obviously and the real issue lies elsewhere.
 
Welcome to the club of broken AMD CPUs.

RMA it. I went through this a few times now with a bunch of CPUs (my 5900X was not stable at stock speeds).

Do you have any cache hierarchy related errors in the Event Viewer after the crashes?
 
I've had hard resets with nothing in event log when I was running an unstable ram OC on my 3700x.
So you would get power-loss-like symptoms, which would usually mean a bad VRM IC, MOSFET(s) or bad cap(s). That's spooky! It reminds me of the day an Acer LCD monitor I had kept power-cycling, until I realized that the power had gone out for a second multiple times! (There were power problems at the time, and I wasn't sure whether the utility company had noticed.)

More likely, I would suspect Windows crashing and rebooting, but unable to even keep a record of a bugcheck in the event log. If Windows failed to log a hardware-related bugcheck, then it will look like a flaky power source!
 
Errors that only show up that deep into a p95 run usually indicate memory problems.
 
welcome to the club of broken AMD CPUs.

RMA it. i went through this a few times now with a bunch of CPUs. (my 5900X was not stable at stock speeds.)

Do you have any cache hierarchy related errors in the Event Viewer after the crashes?
When you posted that, it reminds me of my close buddy's FX9590! When he was just Skyping and the like, it kept going down! I think at that point, I was familiar with the dreaded "There is a problem with this call" message (or similar) from Skype and the familiar sound of losing a call! (2014, when he had to return that FX9590) I suspected that there were a lot of faulty FX 9590s, because I thought I saw multiple posts of FX 9590s being flaky at stock!
 
Do you have any cache hierarchy related errors in the Event Viewer after the crashes?
I get nothing in the event viewer.

Btw, these are the timings I had when I experienced the most recent restart. 100% default BIOS settings.
 

ZenTimings_Screenshot_26942147.7058677.png
OK, so I got a new set of memory and have the same thing happening but now I'm seeing WHEA errors to go along with it.

Screenshot 2021-03-24 122640.png
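
For pulling WHEA entries out of the System log without clicking through Event Viewer, here is a minimal Python sketch that wraps the built-in Windows `wevtutil` tool. It lists the newest events from the WHEA-Logger and Kernel-Power providers. The helper name and the event count are my own choices, and the script only actually runs the query on Windows.

```python
import subprocess
import sys


def build_query(provider, max_events=20):
    """Build a wevtutil command listing the newest System events from one provider."""
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # human-readable output
    ]


if __name__ == "__main__":
    if sys.platform == "win32":
        for provider in ("Microsoft-Windows-WHEA-Logger",
                         "Microsoft-Windows-Kernel-Power"):
            result = subprocess.run(build_query(provider),
                                    capture_output=True, text=True)
            print(result.stdout or result.stderr)
    else:
        print("wevtutil is only available on Windows")
```

Posting the text output (error source, error type, processor APIC ID) is usually more useful than a screenshot, since it shows whether the errors keep landing on the same core.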
 
OK, so I got a new set of memory and have the same thing happening but now I'm seeing WHEA errors to go along with it.

View attachment 193733

Was this at stock as well, with no DOCP enabled? It's starting to smell like there is an issue with the CPU itself.

Did you try manually setting the CLDO VDDG etc. voltages like @Alexa suggested? Although at stock it should run on the auto settings without issues.

 

It may not be the RAM at all. It may be PBO. Disable PBO in the mobo BIOS. Some Ryzen chips simply aren't stable with PBO on. (It is usually on by default or set to auto in the mobo BIOS.) So manually make sure it says disabled under PBO.
 
Was this at stock as well, with no DOCP enabled? It's starting to smell like there is an issue with the CPU itself.

Did you try manually setting the CLDO VDDG etc voltages like @Alexa suggested? Although at stock it should run with auto setting without issues.
Yes, I forgot to update, but I tried the above voltages. Same issue. I've also had the issue occur with both sets of memory at non-DOCP. New set is GSkill.
It may not be the RAM at all. It may be PBO. Disable PBO in the mobo BIOS. Some Ryzen chips simply aren't stable with PBO on. (It is usually on by default or set to auto in the mobo BIOS.) So manually make sure it says disabled under PBO.
Yes, I don't think it's memory related at this point. I also tried disabling PBO and it still occurred.

I guess the question is what I should try next. The mobo would be easiest to try since it's still within Amazon's return window. The CPU I got direct from AMD, so I have no idea how long that process will take. But their shipping was SLOW, so probably a while. I have an older PSU (a 6 year old EVGA G2 750W that has worked flawlessly) if anyone thinks that could make any difference whatsoever.
 