WHEA-Logger Event 19 errors on 3900X

Joined
Aug 9, 2020
Messages
135 (0.08/day)
  • CPU: Ryzen 9 3900X
  • CPU Cooler: Noctua NH-D15 Chromax Black
  • Thermal Compound: Noctua NT-H2
  • GPU: Zotac RTX 2070 Mini
  • RAM: 2x 16 GB G.Skill Ripjaws V 3600 MHz CL16
  • Mobo: ASUS ROG Strix B550-F, BIOS Ver. 1202, AGESA 1.1.0.0 Patch B
  • PSU: Corsair CX 650 2017 80+ Bronze Non-Modular
  • Storage: 1x 500 GB Samsung 980 PRO PCIe 4.0 NVMe + 1x SanDisk X400 512 GB SSD + 2x Crucial MX500 1 TB SSD
  • Case: be quiet! Pure Base 500DX Black
  • Airflow: 3x be quiet! Pure Wings 2 High-Speed 140mm PWM, 2x front intake, 1x rear exhaust
  • Displays: 1x 1080p144, 1x 1080p75
  • OS: Windows 10 Pro 20H2 (19042.630)
  • Mouse: Razer Mamba Elite
  • Keyboard: ROCCAT Horde AIMO
Most of you already know my story here but I'll explain it again in detail.

I have been getting hard reboots at idle on this Ryzen system when C-states are enabled. It doesn't happen with them disabled.

At first I thought it was my PSU, but it wasn't: it still happens with another PSU.

Before getting a ROG Strix board (because my Gigabyte board died - long story), I had a B550 Aorus Pro as my first Ryzen system and Corsair Vengeance LPX 3200 CL16 RAM. The CPU, PSU and GPU remained unchanged. GPU is not causing this -- tested with my old HP GTX 1060, same problem.

These shutdowns were happening on that system as well, and they continue on this system.

To fix this, I tried bumping my DRAM voltage from 1.35 V to 1.37 V, thinking this was a DRAM issue and that my memory controller was just weak.

That seemed to fix the issue, except it barely did. Sure, it no longer shuts down at full idle with all non-essential processes closed, but instead it does this:

[Attached screenshot: mmc_W6vdEpMcgi.png]

All of these errors are the exact same. They happened overnight while I left the PC to idle.

When I saw these errors, I thought maybe my SoC needed more voltage, so I raised it from 1.1 V to 1.125 V. HWiNFO then reported 1.106-1.119 V SoC voltage in Windows, up from the previous 1.081-1.087 V.

This didn't fix anything; in fact it made things worse, as my PC was now rebooting again. I set SoC back to Auto, and from what I've seen it doesn't reboot anymore. I have yet to see WHEA errors again, but I'm sure they'll reappear overnight.

These errors don't happen all the time, only when the PC is left idle overnight, when they show up about 7 times per night.
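
For anyone who wants to count these instead of scrolling through Event Viewer, here is a minimal sketch. It assumes Windows' built-in `wevtutil` tool and that the events land in the System log under the provider name `Microsoft-Windows-WHEA-Logger`. The helper names `build_whea_query` and `run_whea_query` are made up for illustration. The first one only builds the command, so it can be inspected before anything is run.

```python
import subprocess


def build_whea_query(event_id=19, max_events=50):
    """Build a wevtutil command that lists recent WHEA-Logger events, newest first."""
    # XPath filter on provider name and event ID.
    # Assumption: the provider is registered as 'Microsoft-Windows-WHEA-Logger'.
    xpath = (
        "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger'] "
        f"and (EventID={event_id})]]"
    )
    return [
        "wevtutil", "qe", "System",  # query-events on the System log
        f"/q:{xpath}",               # filter expression
        "/f:text",                   # plain-text output
        f"/c:{max_events}",          # cap on the number of events returned
        "/rd:true",                  # reverse direction: newest first
    ]


def run_whea_query(event_id=19, max_events=50):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_whea_query(event_id, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `run_whea_query()` after an overnight idle should show whether the count per night changes between BIOS settings, which is easier to compare than screenshots.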

I have no idea what to do to fix this anymore. There is a beta BIOS for my board which includes AGESA 1.1.0.0 Patch C, but I'm afraid to update to that. I'll wait until it gets out of Beta.

Does anybody else have these issues or is it just me?
 
I have exactly the same issue on an Asus Strix X570-E Gaming board with the previous 2812 beta BIOS (and if I remember correctly, I already had reboot issues with the 2802 BIOS, which is AGESA Patch B).
The reboots are completely random, but according to Event Viewer I also had several in a row 1-2 hours ago while I was away from my desk and the PC was idle (display turned off).

Problem is, I also overclocked the RAM and raised SOC, VDD... voltages to hopefully hold 1800 FCLK _and_ updated to Windows 10 build 2004 at almost the same time, so I'm not sure what the exact problem is.
Things I still have to try:

- Lower FCLK to something more sensible (I've heard that maybe not all Zen 2 chips will reach 1800 FCLK), put voltages back on Auto, and/or keep DRAM on default settings
- Update Windows to 20H2 (not really looking forward to that right now ...)
- Update to the latest beta BIOS 2816, although it seems to be the same AGESA patch. I'm not sure what else changed in this BIOS; I guess the latest changes are for Zen 3 anyway, which doesn't concern me.

What I've heard is that the latest BIOS/AGESA versions are much more aggressive on boost clocks and maybe not as stable as the previous ones (since V2 PI 1.1.0.0 Patch B? not sure there).
Maybe downgrading BIOS would be an option for me!?

It's just painful to test all that, because entering the DRAM settings again each time takes a while (after a BIOS update every profile is gone ... hate that), and the reboots are rather random and don't happen very often. At first I thought it was a PSU problem, because the reboots are sudden and most of the time there are no real error messages in Event Viewer, but it is not; I have switched PSUs since then. What I always see is the kernel processor power entries in the System event log posting "Nominal Frequency (MHz): 3793" (basically max boost) for all cores just before the reboot. I'm not sure whether there is a bug somewhere or it's just a hardware error because of the OC.
 
There are beta BIOSes for that motherboard that claim to improve stability. Might want to check them out / wait until they come out of beta and then update.
 
Hm, don't think that I want my CPU to be on max power all the time.
Maybe there is an instability problem with the latest BIOS versions on Zen2? ...
 
it's not gonna be on max power
you're just gonna lose out on very low idle temps

i personally see no difference between cstates on and off. same boost behavior. and the biggest perk of no c-states... no random reboots. no WHEA errors.

no BIOS update so far has ever fixed this and i doubt it'll ever be fixed.
 
I got the same issues. Maybe I should try it.
 
There's also another aspect to try when running into WHEA errors; there are two domains on Ryzen that have C-states.

The first is for the CPU. It's easy to find under whatever Advanced CPU Settings menu you have, and disabling it can help if you have issues at idle.

The second is the data fabric C-states; disabling them is useful if you push the memory controller or Infinity Fabric hard. I only know that on Gigabyte the option is found under CBS > DF Common Options, but it may be under a different name on other brands. Disabling it should prevent the IF from idling, and IF idling is obviously very unhelpful on desktop.

Logically speaking, if you have trouble pushing IF without WHEA errors/instability/USB errors/PCIe problems, you can try both disabling data fabric C-states, and enabling Uncore/SOC OC Mode (both ways to stop Infinity Fabric from downclocking or going into a lower power state).

EDIT: on Asus the C-states setting in the CPU menu is Global, so it controls both CPU and DF C-states.
 
@tabascosauz -- This helped quite a bit on the ASUS Crosshair VIII w/ 3003 BIOS, going from random reboots to just sporadic, recoverable WHEA-Logger 19s.

My most recent stable BIOS was 2402 Beta, which didn't require any of the tweaks mentioned.

---

Currently no Random Reboots w/ Latest BIOS 3003 using the above recommendations:

"A corrected hardware error has occurred.

Reported by component: Processor Core
Error Source: Unknown Error Source
Error Type: No Error
Processor APIC ID: 0

The details view of this entry contains further information."

[Attached screenshot: 1607817259276.png]
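
If you end up with a pile of these entries, it helps to split the message text into fields so they can be grouped by component or APIC ID. Below is a rough sketch that only assumes the message keeps the "Key: Value" layout quoted above; the function name `parse_whea_message` and the `SAMPLE` constant are hypothetical, not part of any Windows API.

```python
def parse_whea_message(text):
    """Split a WHEA-Logger event message into a dict of 'Key: Value' fields."""
    fields = {}
    for line in text.splitlines():
        # Drop surrounding whitespace and any quote marks left over from copy/paste.
        line = line.strip().strip('"')
        if ":" not in line:
            continue  # free-text lines such as the first sentence carry no field
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# The message text quoted in this post.
SAMPLE = """A corrected hardware error has occurred.

Reported by component: Processor Core
Error Source: Unknown Error Source
Error Type: No Error
Processor APIC ID: 0

The details view of this entry contains further information."""

if __name__ == "__main__":
    for key, value in parse_whea_message(SAMPLE).items():
        print(f"{key} = {value}")
```

Grouping on the "Reported by component" field is one way to tell these Processor Core entries apart from other WHEA entries that report a different component.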
 
Unstable memory (and/or IF?) triggers WHEA errors and reboots at idle here with a 5800X, so work on RAM stability.
 
Does the IF instability hinge on the mobo and/or the CPU?
 
All these Event 19s are from just running 3600MT/s?

I get these all the time on 3733, one of the reasons why I went back to 3600 16-19-19 on my main desktop. The profile is extremely stable, but the CPU is struggling for daily use above 3600.

3733 for me still accumulates disk corruption over time. The memory itself is rock solid, but the occasional abrupt reboot every few weeks from the IF is still enough to slowly build disk errors.

If you're on the ragged edge of what your CPU's IF can do, there's not much that you can do except go down to a speed that you know will work error-free. Pump more VSOC and/or tweak the CLDOs, and it might throw more Event 19s. Reduce VSOC, and it might throw unidentifiable Event 19s. Disable DF/CPU C-states, but it can only get you so far and sometimes won't do anything.

Does the IF instability hinge on the mobo and/or the CPU?

It's mostly CPU. If your IF stops at about 1800MHz, you can try a dozen different boards and it'll all be about the same. But for example, say if you're having trouble hitting a 4600 profile on a set of B-die that you fully know can do it, then that would be a board problem, because the Ryzen 3000/4000/5000 memory controller itself is almost always fully capable of 5000+.

The new AGESA/BIOS release notes always have something to say about "improved DRAM compatibility" or "better FCLK scaling" but there's not a whole lot that can be done firmware-side.
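
To put numbers on the 3600 vs 3733 comparison, here is a back-of-the-envelope sketch. It assumes the fabric clock is run 1:1 with the memory clock (the usual setup when people talk about "1800 FCLK" with 3600 memory) and uses the roughly 1800 MHz IF ceiling from this post purely as an example limit. The function names are made up for illustration.

```python
def memclk_mhz(ddr_rate_mts):
    """DDR moves data twice per clock, so the memory clock is half the MT/s rating."""
    return ddr_rate_mts / 2


def fclk_needed_for_1to1(ddr_rate_mts):
    """FCLK required to keep the fabric in 1:1 with the memory clock (assumed mode)."""
    return memclk_mhz(ddr_rate_mts)


def within_fclk_limit(ddr_rate_mts, fclk_limit_mhz=1800):
    """True if this memory speed fits under the CPU's FCLK ceiling (example value)."""
    return fclk_needed_for_1to1(ddr_rate_mts) <= fclk_limit_mhz


if __name__ == "__main__":
    for rate in (3200, 3600, 3733):
        print(rate, fclk_needed_for_1to1(rate), within_fclk_limit(rate))
```

So 3600 needs 1800 MHz on the fabric and 3733 needs about 1866 MHz, which is why a CPU whose IF tops out around 1800 can be fine at 3600 and throw errors at 3733 with the very same memory.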
 
The latest MSI AGESA BIOS fixed WHEA errors for me. Weird.
 
yep XMP Profile 3600

these errors mostly go away with the older 2402 Beta BIOS, and probably much older BIOSes...
 
Does the IF instability hinge on the mobo and/or the CPU?

Both. BIOS updates, voltages, timings, BIOS settings, physical slots used, number of slots used, number of memory ranks used...

It's general RAM instability causing it, so the regular reasons apply.
 
sporadic WHEA-Logger 19 = risking corruption?
 
any instability risks corruption

Received a 5950X + 3003 BIOS = 0 WHEA-Logger 19 events ever since... :(

----

Also, this was just released, but I'll wait for a stable version first:

Version 3101 Beta Version
2020/12/25 20.38 MBytes
ROG CROSSHAIR VIII HERO (WI-FI) BIOS 3101
Improved system compatibility
Updated AGESA code to ComboV2PI 1190
Updated graphical firmware
Improved RAID function
Improved system performance
 
The person I sold the 3950X to isn't reporting any issues...

I'm blaming shoddy ASUS BIOSes for now.. ? (specifically the X570 CH8 Formula WI-FI)
 
i'm totally leaning towards AGESA issues and/or RAM issues for 99% of WHEA errors, since we can trigger them/stop them by raising/lowering IF clocks
 
My 3900X gets WHEA errors only when C-states are on and the system is idling. Otherwise it works perfectly.
 
that sounds related to the fix they had for USB dropouts, as that also happened at idle with C-states being the fix

I wonder if the next update will solve that for you at the same time
 
Never mind, I guess. I checked Event Viewer and there was a bus/interconnect WHEA error yesterday.

The AGESA 1.2.0.1 BIOS is still in beta, I'm gonna wait before upgrading.
 
1.2.0.2 should have the fix, that's the one with C-state changes (for the USB issues, but they may help you)
 