
WHEA Logger Fatal Hardware error

5/31/2022 0:29:18 - Machine Check Exception
Fatal (Previous Error)
------------------------------------------------------------
0 - Memory Error Section (Primary)
1 - Processor Generic Error Section
Processor type: x86/x64
Instruction set: x64
Error type: Cache
Operation: Data Read
Flags:
Level: 1
CPU Version: 0x0000000000a20f12
Processor ID: 0x2
2 - XPF MCA Section
CPU Vendor: AMD
Processor Number: 0x2
MCG_STATUS: 0x0000000000000000 ()
Instruction Ptr: 0x0000000000000000
MCA Bank: 0x0
MCi_STATUS: 0xbaa00000000c0135 (VAL UC EN MISCV PCC)
Other info: 0x0200000
Model error: 0x000c
MCA error: 0000 0001 0011 0101 (binary)
Misc: 0xd01a0ffe00000000
3 - {c34832a1-02c3-4c52-a9f1-9f1d5d7723fc}
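For anyone who wants to sanity-check the record above, here is a minimal Python sketch that decodes the MCi_STATUS value by hand. It assumes the standard x86 machine-check layout (flag bits 63 down to 57, model-specific code in bits 31:16, MCA error code in bits 15:0) and the "0000 0001 RRRR TTLL" cache-hierarchy error format. The function and table names are made up for illustration and are not from any library or from Windows itself.

```python
# Hand decoder for an x86 MCi_STATUS value (illustrative sketch, not a vendor tool).
# Assumption: architectural flag positions and the cache-hierarchy error
# code format 0000 0001 RRRR TTLL.

STATUS_FLAGS = [  # (bit position, flag name), highest bit first
    (63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
    (59, "MISCV"), (58, "ADDRV"), (57, "PCC"),
]

# RRRR: request type
REQUEST = {
    0: "Generic", 1: "Generic Read", 2: "Generic Write", 3: "Data Read",
    4: "Data Write", 5: "Instruction Fetch", 6: "Prefetch",
    7: "Eviction", 8: "Snoop",
}
# TT: transaction type
TRANSACTION = {0: "Instruction", 1: "Data", 2: "Generic"}
# LL: cache level
LEVEL = {0: "0", 1: "1", 2: "2", 3: "Generic"}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    mca_code = status & 0xFFFF            # bits 15:0
    model_code = (status >> 16) & 0xFFFF  # bits 31:16
    result = {"flags": flags, "mca_error": mca_code, "model_error": model_code}
    if mca_code >> 8 == 0x01:             # cache-hierarchy error format
        result["operation"] = REQUEST.get((mca_code >> 4) & 0xF, "Unknown")
        result["transaction"] = TRANSACTION.get((mca_code >> 2) & 0x3, "Unknown")
        result["level"] = LEVEL[mca_code & 0x3]
    return result


if __name__ == "__main__":
    print(decode_mci_status(0xBAA00000000C0135))
```

Fed the value from the log, this yields the same flags the event shows (VAL UC EN MISCV PCC), model error 0x000c, and a level 1 data-read cache error, so the text part of the event is consistent with the raw status register. It says nothing about which setting caused the error.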




I can't figure out whether it's a memory issue or a WD issue.

I'm running XMP at 3200 MHz and Curve Optimizer at negative 20 on all cores. It happens only when I'm asleep and the PC is idle.
 
Full system specs required
 
Raise your SoC voltage to 1.1 V; that should help.

And since we've got no context to go on, remove any overclocks, PBO settings and undervolts you're running. That type of error can come from incorrect RAM or CPU settings (but seeing dual-rank memory on Ryzen, my bet's on the SoC voltage).
 
My PC would reboot on its own: no BSOD or dump files, just a WHEA-Logger entry in the Windows event log.
I'll try raising the SoC voltage and see how it goes.
Thanks.
 
Crashing to a black screen and random reboots are very common with an overworked Ryzen memory controller; raising the SoC voltage helps.

You're on Zen 3 and not Zen 1, but this image explains it really well:
1653981489337.png


From your screenshots you're running two dual-rank 16GB sticks (so 32GB, 4 ranks total), the second-easiest setup to run on Ryzen. At 3200 MHz, which is technically overclocking, a small boost to the memory controller's voltage is often all anyone needs to get it stable.

People always get this wrong, but your 5950X (like all Zen 3 CPUs) is only officially rated for up to 3200 MHz with two single-rank DIMMs. The moment you add dual-rank DIMMs or four sticks of RAM in any combination, you are not guaranteed 3200 MHz... but with minor tweaking, or a lower clock speed, it can be made to work.

(For comparison, I run 2x32GB sticks that are dual rank like your setup, and need 1.10 V for 3600 MHz and 1.15 V for 3800 MHz. I have a higher-quality motherboard designed for overclocking, so you may need slightly higher voltages for the same setup.)
 
Negative 20 on all cores... yea crashy crashy makes sense.
 
I was going by what Ryzen Master recommended, which was negative 30 on all cores. I found out that it's not stable by running CoreCycler.
I just disabled Curve Optimizer and increased the SoC voltage to 1.1 V like Mussels recommended; hopefully it works this time.
 
You don't need to touch the SoC voltage with DRAM at 3200 MHz.
 
Can you elaborate more?
At your RAM speed you don't need to do anything; it's the base speed. If your chip can't handle 3200 MHz without increased voltage, you have bigger problems. The issue was most likely a too-aggressive negative voltage curve.
 
I see. Well, according to Mussels' post, the 5950X (and all Zen 3 CPUs) is only officially rated for up to 3200 MHz with two single-rank DIMMs, and with dual-rank DIMMs or four sticks 3200 MHz isn't guaranteed. So I guess I'll leave the SoC voltage at 1.1 V for a week, then set it back to default and test again to see if it's stable.

Thanks for your input.
 
I'll say it again: you don't need to touch the SoC voltage. Your issue was most likely your large negative offset. Believe it or not, it's up to you. That pic shows base speeds at maxed-out memory support; it's a whole different thing.
 
Okay. I'll set SoC back to default and will report back if the pc crashes.
 
I missed the -20.

Honestly, that's likely far too low. Heck, most of my cores don't even go below -10, with a few that crash at idle around -6.
 
Hello!

I'm sorry to bump an old thread, but I'm coming from Google and I wanted to ask if this event ID 1 WHEA error was ever figured out? I've gotten 4 or 5 of these errors and they're spaced out by about 2-3 weeks each, but there's no PC crashes/restarts when event 1 occurs. (I've gotten BSOD'd by WHEA-logger event ID 18 in the past, however I'm positive that was due to loose wires/RAM as that problem in particular has ceased since I re-seated everything in my PC case)

I have 2 SSDs, both by Samsung. One is my 1-year-old 1TB 970 Evo Plus NVMe M.2 drive and my secondary is a 3-month-old 870 2TB SATA drive. The secondary drive is apparently the "culprit", as every WHEA-Logger event 1 mentions it in the hex. However, as far as I'm aware, there's nothing wrong with it. It seems healthy in CrystalDiskInfo, with no errors...

I recently checked out NirSoft's "FullEventLogView" program and was able to easily see what else was going on when that error occurred. Something about a health event being logged and the device potentially having failed? And the fault description is "Too many NonQueue error"... What does that even mean? Google gives me nothing.

Picture:
n0JiRjc.png



While this WHEA-Logger error seems harmless, I'm wondering if this is just Windows 10 weirdness or is this a sign my secondary SSD is faulty?
 
I can only speak from experience, but I had the error being spammed for a long time for one of my HDDs more than a year ago. Not only is the drive still functional to this day and works flawlessly but the errors eventually stopped appearing. I have no idea what caused the errors to stop, because I just dismissed the errors as my system continued to work without issues and I haven't been checking the event viewer since. I can only assume they were stopped randomly or by some system change (bios, driver, chipset or windows update?).
 
Hmm, I see. Perhaps it's just Windows being dumb after all... maybe something in Task Scheduler.

Thanks for coming back to this thread to reply. It's good to hear your drive is still working well!
 
For Zen 3 with WHEA crashes, especially when the circumstances are hard to pin down? RMA.

Definitely RMA. I scratched my head over a 5900X for far too long before biting the bullet. The RMA was easily accepted and I ended up with a 5800X3D instead.
 
It's usually related to the RAM setup on Zen 3.

Look at the officially supported RAM speeds on the 5800X3D, where they made them easily visible for once:
1674210224985.png



n_n doesn't mention the details of his RAM, but 16GB sticks are almost all dual rank, so if he's at 3200 or above, he likely needs to raise the SoC voltage to prevent WHEA errors.
 
From memory, I tested multiple kits that were on the QVL for my board, and also tested them at base JEDEC speeds. The clincher was testing that same CPU on a different board and still getting the error...

I've been happy with my Ryzen journey, but I can imagine many wouldn't be if they came up against these issues, which can be thoroughly hard to pin down. Even the shop I got the 5900X from tried not to RMA it at first, until I listed the exhaustive troubleshooting I'd already attempted.

For my money, some Zen 3 cpus are duds if they show these errors.
 
It's rare to have actually faulty CPUs, but IMC quality varies.

My ITX board was an issue for it: the RAM would never, ever go above 3200 (IF above 1600). It just couldn't do it.
If you tried, you'd get WHEA errors, SATA disconnects, PCI-E crashes resulting in the GPU black-screening, USB dropouts, all the fun.

A 5600X is no slouch, but it's not going to be binned for a higher-quality IMC either. So either he's got the WD drive that gave people random errors, or he can likely fix this by adjusting RAM settings or the SoC voltage.
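In case the 3200 and 1600 figures above look unrelated: they are two views of the same setting. DDR memory transfers twice per clock, so the memory clock is half the quoted transfer rate. The small sketch below just does that arithmetic. It assumes the Infinity Fabric clock is run 1:1 with the memory clock, which is an assumption about the BIOS setup rather than something stated in this thread, and the function name is made up for illustration.

```python
def ddr4_clocks(transfer_rate_mts, fclk_ratio=1.0):
    """Return (memory clock, fabric clock) in MHz for a DDR4 transfer rate.

    DDR transfers data twice per clock, so MEMCLK is half the MT/s figure.
    fclk_ratio=1.0 assumes the fabric runs 1:1 with memory (an assumption).
    """
    memclk = transfer_rate_mts / 2
    return memclk, memclk * fclk_ratio


if __name__ == "__main__":
    for rate in (3200, 3600, 3800):
        memclk, fclk = ddr4_clocks(rate)
        print(f"DDR4-{rate}: MEMCLK {memclk:.0f} MHz, FCLK {fclk:.0f} MHz at 1:1")
```

So "RAM at 3200" and "IF at 1600" describe one and the same limit when the two clocks are tied together.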
 
Apologies, my RAM is running at 3600 MHz, the default XMP setting with the click of one button, so I haven't touched anything extra relating to voltages. If I start getting WHEA-Logger event ID 18 BSODs again, I'll look into bumping up the SoC voltage.

Here are all the tabs of cpu-z:
mfVslDT.png
 
I'm running 2x 32GB 3600MHz CL18 (2x2R) on a 5900X/X570S with no adjustment to any voltages, never had any issues like this. Previously ran 2x 16GB 3600MHz CL18, also 2x2R, on a 5800X/X470 - again with no voltage tweaks and no WHEA errors. I think you're a skewed sample size due to your penchant for over-/under-clocking, or maybe your ITX board was just a dud.
 