Random reboots and BSOD on a full AMD build

FrozenOK
New Member, joined Jan 24, 2023
Hello everyone, how are you? First of all, sorry if my English is a bit rough. I'll describe what's happening in the hope that you can help or guide me.

About 2 weeks ago I built my full AMD PC, since I could finally afford to replace the CPU, GPU, MOBO and RAM all at once. For about 5-6 years I had been running a 4th-generation i7 (the 4790K) with 32GB of DDR3 RAM and a GTX 970. Since the day I built the new system, though, I have run into some rather unpleasant problems when playing certain titles.

The system is stable during basic tasks, that is, navigating Windows, YouTube, Discord and so on, and the PC has no problems booting with or without XMP enabled. The problems only come when gaming. The first one was with METRO EXODUS, which caused my PC to restart randomly without showing a BSOD; it just turned off and on. After investigating and trying some things, I deduced that it could be a CPU or RAM problem, leaning more towards the latter. That led me to suspect the XMP profile, which was active. I disabled it and set absolutely everything to default, and even so it kept restarting after playing for a while.

Then I manually configured some XMP, SoC and PBO parameters, reaching a "stability" (I ran many tests in different areas before arriving at this) that lasted approximately 5 days, until I tried MONSTER HUNTER WORLD. It began showing symptoms similar to those I had with METRO EXODUS, except that now there were BSODs and it crashed in less time.

I will describe the problems presented with the games so far:


METRO EXODUS (Fixed(?)):


  • Random reboot without BSOD
  • Event Viewer shows "Event ID 41" errors
  • Problems appear after roughly 15 - 90 min of play

PUBG (Fixed(?)):

  • Frozen image only; I could still hear the game and Discord in the background. I had to restart with the reset button. It only happened once; I searched for a solution, and disabling all background hardware acceleration may have helped (still checking).

MONSTER HUNTER WORLD:

  • BSOD with a "WHEA" error; Event Viewer shows "Event ID: 46"
  • Freezing without BSOD, only solved by rebooting
  • Problems appear after roughly 5 - 60 min of play

Here are the MINIDUMPS from the BSODs I got playing MONSTER HUNTER WORLD, along with the manual I found to guide me a bit:


From what I saw in another thread, something should appear in the MCA_STATUS section of the Processor Programming Reference (PPR) for AMD Family 19h Model 01h, Revision B1 Processors, Volume 2.

It would help me a lot if you could get some clue about the cause from this, since I have little knowledge of it.
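
For anyone reading a raw MCA_STATUS value out of a dump, here is a minimal Python sketch of pulling out the top flag bits. It assumes the usual x86 machine-check layout (Val at bit 63 down to PCC at bit 57, error code in bits 15:0); the exact meaning of every field should be checked against the PPR, and the example value below is made up, not taken from these minidumps.

```python
# Upper flag bits of a 64-bit MCA_STATUS value.
# Assumption: standard x86 machine-check layout; confirm against the PPR.
FLAGS = {
    63: "Val",       # register holds a valid error
    62: "Overflow",  # an earlier error was overwritten
    61: "UC",        # uncorrected error
    60: "En",        # reporting was enabled for this error
    59: "MiscV",     # MCA_MISC contains valid data
    58: "AddrV",     # MCA_ADDR contains valid data
    57: "PCC",       # processor context corrupt
}


def decode_mca_status(status: int) -> dict:
    """Return the set flag names (highest bit first) and the low 16-bit error code."""
    set_flags = [
        name
        for bit, name in sorted(FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {"flags": set_flags, "error_code": status & 0xFFFF}


# Hypothetical value, for illustration only.
example = decode_mca_status(0xB200000000000135)
print(example["flags"], hex(example["error_code"]))
```

A value with UC and PCC set generally means the error could not be corrected and the system had to go down, which would fit a BSOD or a hard reset.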

My System:

  • CPU: AMD Ryzen 5 5600x with Hyper T4 (Cooler)
  • MOBO: ASRock B550m Steel Legend (v2.20)
  • GPU: XFX Speedster SWFT309 Radeon RX 6700 (non XT)
  • RAM: Corsair Vengeance RGB Pro 16GB (2x8) DDR4 3200MHz Cl16
  • PSU: Cooler Master v1000 (80+ Gold)
  • SSD: Crucial MX500
  • O.S: Windows 11 PRO 22H2
  1. I rule out problems with the GPU and PSU as they work perfectly on my old Intel system.
  2. My video card is connected to 2x PCI-E 8pin cables from the PSU.
  3. The RAM sticks are in slots A2 and B2.

RAM PROFILE:


Temps ºC :

  • CPU: 40º - 50º idle / 55º - 65º in game
  • GPU: 30º - 35º idle / 60º - 75º in game
  • RAM: 30º - 35º idle / 40º - 45º in game

What I have tried so far:

  • Clear CMOS
  • Clean each MOBO slot
  • Different BIOS versions from 2021 onwards
  • Use BIOS v2.20 due to possible problems with AGESA in later versions
  • Install drivers downloaded from both AMD and the motherboard manufacturer (ASRock)
  • Install windows 10 and windows 11 clean with all its updates
  • Use DDU (in case it was the video card)
  • Use the "Balanced" and "High Performance" power plan
  • Use a "Balanced" power plan with CPU at 5% min and 95% max
  • Use BIOS settings completely by default
  • Disable SVM and Core Isolation* (testing)
  • Try PBO set to Enabled, Disabled and Auto
  • Different configurations for the XMP profile, auto and manual (different frequencies, timings, voltage settings and Infinity Fabric)
  • Configure SoC voltages manually (Auto - 1v - 1.1v - 1.15v)
  • Disable fast boot in windows and BIOS
  • Disable MPO
  • Use CMD with "sfc /scannow" and "dism.exe /online /cleanup-image /restorehealth"
  • Check the status of the SSD (O.S) and HDD (Game Library)
  • Check PSU and GPU
  • Memory tests such as MemTest86, OCCT (between 1 and 3 hours) and TM5 (12 cycles with the Extreme1@anta777 configuration); everything passed without errors.

I feel like I've left something out, but I think most of the "possible solutions" are listed. Even after all that, the problem persists.


Drivers versions used:

  • Chipset: 3.10.22.706 (ASRock) - 4.11.15.342 (AMD)
  • GPU: Adrenalin 22.10.1 to Adrenalin 22.11.2

The configuration that lasted me 5 days was this:


Drivers:


  • Chipset: 4.11.15.342 (AMD)
  • GPU: Adrenalin 22.11.2 (AMD)

BIOS v2.50 ASRock B550m:

  • CSM Disabled
  • Enable Secure Boot in Standard profile
  • Check if AMD fTPM is active
  • AMD OC > PBO > Advanced - Motherboard
  • PCI > 4G Decoding: Enable - ReSize BAR: Enable
  • XMP Profile 1 (manual adjustment):

SoC: 1.15v
VDDP: 1.15v
VDDG CCD: 1.05v
VDDG IOD: 1.05v

DRAM: 3200MHz
DRAM: 1.48v or 1.38v (testing - it works both ways)
Infinity: 1600MHz

tCL: 16
tRCDWR: 18
tRCDRD: 18
tRP: 18
tRAS: 36
tRC: 56 or auto (75)

(While playing with SoC voltages, frequencies and latencies, I have seen that how long a game runs before crashing changes depending on these values.)
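
As a quick sanity check on a manual profile like the one above, here is a small Python sketch. It assumes two common rules of thumb that are not stated in this thread: the memory clock is half the DDR transfer rate (so DDR4-3200 pairs 1:1 with a 1600MHz Infinity Fabric clock), and tRC should be at least tRAS + tRP. The function name is made up for illustration.

```python
def check_profile(ddr_rate_mts, fclk_mhz, tras, trp, trc):
    """Return the memory clock in MHz and a list of warnings for a manual RAM profile."""
    # DDR moves data on both clock edges, so the clock is half the transfer rate.
    mclk_mhz = ddr_rate_mts / 2
    notes = []
    if fclk_mhz != mclk_mhz:
        notes.append(f"FCLK {fclk_mhz} MHz is not 1:1 with MCLK {mclk_mhz} MHz")
    # Rule of thumb: a full row cycle is activate time plus precharge time.
    if trc < tras + trp:
        notes.append(f"tRC {trc} is below tRAS + tRP = {tras + trp}")
    return mclk_mhz, notes


# Values from the profile listed above (tRC set manually to 56).
print(check_profile(3200, 1600, tras=36, trp=18, trc=56))
```

With tRAS 36 and tRP 18 the minimum works out to 54, so a manual tRC of 56 passes this check; raising tRAS without also raising tRC (or leaving it on Auto) would not.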


On Windows:

  • Clean installation of Windows 10 or 11
  • Use "CMD - sfc /scannow" and verify file integrity (dism.exe /online /cleanup-image /restorehealth)
  • Install the latest GPU and Chipset drivers
  • Check for Windows updates
  • Go to Device Manager > System Devices > AMD GPIO Controller > Properties > Power Management > Uncheck all (possible conflict with GPU)
  • Disable MPO (possible conflict with GPU)
  • Check "Core Isolation"

Optional:

  • Enable "High Performance" power plan
  • Disable Telemetry
  • Remove Windows 11 context menu
  • Component services > Computers > My Computer Properties > Disable DCOM (enabled) (Possible conflict with GPU?)

All without internet connection.


In the background it runs:


  • Vanguard (Riot)
  • MSI Afterburner (for GPU fans with "disable ULPS" marked)
  • Lightshot
  • ModernFlyouts
  • Wallpaper Engine
  • Windows Defender
  • VibranceGUI

At this point I think it could be the RAM or the IMC of my CPU, I don't rule out problems with the MOBO either.

Everything was "stable" until I started playing MONSTER HUNTER WORLD.

To clarify, some of these steps are "solutions" that I found over these days for my "possible" problems.

At this point I don't know what to do anymore, I thought I already had it solved, but no. My head hurts just thinking about looking for solutions and that they don't work for me, I'm already pretty frustrated with this.

RMA is not possible on CPU and RAM.

Sorry if it's a lot of text but I need help.

Thank you very much in advance, I'll be reading your replies.
 
That's a lot of testing! Welcome to TPU

SOC voltage is honestly high. 3200 (1600MHz Fabric) really shouldn't need more than 1.1V at the very most. Is 1.15V default or did you add volts?

Do you have a screenshot of the WHEA logs? Just what Event Viewer shows, no need for a full dump.

Funnily enough, the B550M Steel Legend is the only AM4 board I returned after less than 1 day of use, due to random reboots I couldn't solve.

I would run some ycruncher stress testing, because the subtests in there cover a range of both in-place core-heavy tests and uncore-heavy memory controller tests.
 
Prior testing with that exact ram kit showed I couldn't run any XMP profiles!

Shut off XMP and try it.

Also, I think there are issues with MSI Afterburner lately.
 
Thanks for answering :)

  1. As for the SoC voltages, I tried: Auto - 1 - 1.1 - 1.15, the latter being "stable" at all times until MHW
  2. Right now I'm not at home to take screenshots, but later I can upload them. Any other visual info needed?
  3. Is there a way to rule out problems with the MOBO? I don't have any friends with AMD to test CPU on another MOBO.
  4. I'm going to try ycruncher; I knew about it but hadn't tried it. How long should the test run?

Thanks for answering :)

I already tried running the RAM without the XMP profile, but after a while I still had a reboot without BSOD.

As for MSI Afterburner, I ran tests before installing it and there was no change.
 
Hmm, as you asked, I have to wonder about the motherboard.

Since you have run Memtest, I am thinking either the ASRock motherboard or the Cooler Master power supply.

I didn't notice if you had a known good power supply that you could try?
 
VDDP: 1.15v
Why is VDDP set to 1.15v? It should be 0.900v. Only raise VDDP if you're having boot issues due to too high an FCLK. Otherwise keep it at default (900mV).
 
  1. You can fiddle with SOC, just remember that more != better, and sometimes higher than necessary VSOC works against you
  2. I think you've covered pretty much everything, aside from information on the WHEAs
  3. Not really unless you have another AM4 board to swap in
If you just download ycruncher from its actual website, it will look like this when you open it up, then under Component Stress Test there's a whole host of different tests you can run. By default it only runs a few minutes per test, but might be worth running through all of them and just seeing how the system responds.

i.e. if it crashes on core-heavy tests or UMC-heavy tests, you'll have some idea of what's going on

ycruncher home.png
ycruncher stress tests.png
 
Thanks for answering.

I understand, I'm going to try leaving the VDDP voltage on Auto.
Do you recommend that I leave the VDDG CCD and IOD voltages on Auto as well?

I'm fairly new to RAM tuning; what I tried was based on what I was reading and testing.

Here is what the Event Viewer shows me for the WHEA errors.

I have just made some of the recommended changes in the BIOS.

I only set the SoC and DRAM voltages manually and left the rest on Auto, plus tRAS 36 > 40 and tRC 56 > Auto.

I tried playing MHW for 50 min and there was no restart; I have to continue testing.

I have yet to try ycruncher.

That Cooler Master power supply is the only one I have, and it gave me no problems in my previous system, where I used it for years. I also checked it with a PSU tester before making the change, but I can get hold of another one just to try.
 

Attachments

  • WHEA 1.png
  • WHEA 2.png
  • WHEA 3.png
  • WHEA 4.png
  • WHEA.png
The VDDGs are good to know but generally if they're not obscenely wrong or causing you Bus/Interconnect WHEAs, you shouldn't need to touch them.

Should probably send a Zentimings screenshot too, if just to see everything in one place: https://zentimings.protonrom.com/

Are those the only WHEA entries in there? No yellow Warning WHEAs with more info? The Machine Check Exception literally just tells you that your computer rebooted, nothing else. Usually it has extra information on whether it's Cache Hierarchy or Bus/Interconnect:

whea errors - Copy.png
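
If screenshots are awkward, the same entries can also be dumped as text. A rough Python sketch, assuming a Windows machine where the built-in `wevtutil` tool is available and that the events are logged in the System log under the provider name `Microsoft-Windows-WHEA-Logger`; the function names are made up for illustration.

```python
import subprocess


def build_whea_query(max_events=20):
    """Build a wevtutil command that lists the newest WHEA-Logger events as text."""
    xpath = "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",       # filter on the WHEA provider
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # plain text instead of XML
    ]


def read_whea_events(max_events=20):
    """Run the query (Windows only) and return the text output."""
    result = subprocess.run(
        build_whea_query(max_events), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `print(read_whea_events())` from an elevated prompt should show whether any entries carry more detail than the bare Machine Check Exception.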
 
I see a WHEA error that's pointing to the RAM!
 
Yes, those are the only WHEA entries I have in the event viewer.

When starting ZenTimings, this error appears, but afterwards it shows the same info. The same thing had happened to me before.
 

Attachments

  • ZenTimings A2.png
  • ZenTimings B2.png
  • ZenTimminggs.png
Make sure you run as administrator. I suspect Windows blocked loading.
 
The I/O driver message happens sometimes but generally I've only seen it on extremely unstable setup, bad sticks (I have a dying/dead A0 B-die kit lying around), old BIOS, or bad BIOS.

Can you reflash 2.20 or try 2.30? Generally AGESA 1206 is still fine, it's 1207 that takes a dump on everything's performance. Especially if you've not tried any other BIOSes yet, sometimes AGESA is okay but the vendor just fucks up that one BIOS in particular, seen that on a couple of boards.

Bottom line, on Vermeer on a stable BIOS, none of the readouts in Zentimings should be unavailable (except VDIMM depending on board). Only Matisse CPUs omit some CLDOs, and Renoir/Cezanne APUs omit VDDGs due to their design.

Maybe I'm biased with the B550M Steel Legend, but I'm still leaning towards bad board/bad BIOS.


When I was on Vermeer (5900X) I ran an AGESA 1203 BIOS but SMU was 56.53.0. Not sure why your 1203 BIOS is stuck at 56.52.0. Last I checked, iirc, vendors can't just pick and choose when it comes to AGESA.
 
Corsair Vengeance and Ryzen, it just never works out.

If you can, return that RAM. Get some Crucial Ballistix or G.skill Neo if you need RGB.
 
Drop the RAM and IF/UCLK dividers all the way back to JEDEC.
If it still crashes at JEDEC, I would ditch the board and the RAM kit. I have seen nothing but problems with that board and Corsair RAM on AMD.
 
I ran it as administrator; after that first error the message no longer appeared, and it showed the same values.

I could try again with v2.30 to see how it goes and check whether the SMU changes; I hadn't really paid attention to it.

Unfortunately I can't return the RAM, but I'll try to borrow some HyperX from a friend.

I will try running them at JEDEC a couple more times. The last time I did that it crashed in the same way; I hope they work well now.
 
Re-seat the CPU and re-mount the cooler as well.
It's not usually a problem with ZIF sockets, but I have seen stranger things this week ...
 
I will keep it in mind to test it in the next few days, in case nothing else works. :(
 
@FrozenOK I do see a very high ProcODT in your ZenTimings screenshot for just two single-rank DIMMs. You don't need a value as high as 60 Ω.
Also, the tWRRD value isn't the same on both channels.

My suggestion is to use the following:
ProcODT: 36.9 (or 40)
RttNom: 34 (RZQ/7)
RttWr: Off
RttPark: 48 (RZQ/5)

ClkDrvStr: 24
AddrCmdDrvStr: 20
CsOdtDrvStr: 24
CkeDrvStr: 24

tRDWR 8
tWRRD 4
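
For reference, the RZQ/n notation in the Rtt values above is just a division of the DDR4 reference resistance. A tiny Python sketch, assuming the standard 240 Ω RZQ; the helper name is made up for illustration.

```python
RZQ_OHMS = 240  # DDR4 reference resistance (assumed standard value)


def rzq_div(n):
    """Termination resistance in ohms for an RZQ/n setting."""
    return RZQ_OHMS / n


# The two settings suggested above.
print(f"RttNom  RZQ/7 = {rzq_div(7):.1f} ohm")
print(f"RttPark RZQ/5 = {rzq_div(5):.1f} ohm")
```

That is why RZQ/7 shows up as 34 and RZQ/5 as 48 in the BIOS and in ZenTimings.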
 
I'm going to try adding these values to the current settings and test overnight to see if it makes any changes. Thanks for the recommendation.
 
My advice that you are not going to like.
Toss away Win11 and buy a Real retail license of Win 10 Pro.

All the Win11 hardware driver packs are immature software.
 
Thanks for your answer, but before using Windows 11 Pro, I was using Windows 10 Home with its respective license, although I have no problem going back to Windows 10 if it's for testing.
 
complete and utter non-sense
please do not comment again
 
Are you using PS2 ports? If yes try usb
also try older 2021 Radeon drivers
 
Corsair vengeance...
 