Still getting WHEA 18 errors all stock

For troubleshooting you can try setting your VSOC to 1.1v then post another screenshot so we can see how your motherboard adjusts the remaining voltages, then do some testing.
VSOC? Is it called something else in bios?
 
It might just be called SOC voltage. I don't have an MSI UEFI/BIOS to look at at the moment. If I recall, MSI gives you live voltage readouts while you're looking at the UEFI/BIOS. You can use that to help confirm the option you want to change, because that voltage would match or be very close to your screenshot for vSOC. Perhaps someone here knows the exact name of the option for MSI. It should be near the other voltage settings for VDDP and VDDG. (Just DO NOT modify the one that says CHIPSET SOC voltage.)

If there is any concern take a pic of your UEFI/BIOS and post it before changing anything.
 
CPU NB SoC is right above what you said. Sound about right? So scared of nuking my computer.
 
Looks like that's the one.
 
That leads to at least one question:

why the hell don't BIOSes have one and only one name for a setting, instead of confusing the entire world for no reason by letting each dev choose what to call it?
 
It's a WHEA 18 and it says processor core.


Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 02/07/2024 20:53:45
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: BlaezaLite
Description:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 11
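
If you end up collecting several of these events, it may help to pull the key fields out of the description text so they are easy to compare. Below is a minimal Python sketch, assuming the description has been copied out of Event Viewer as plain "Key: Value" lines like the ones above. The function name `parse_whea_description` is made up for illustration and is not part of any Windows tooling.

```python
# Sample text copied from the event description posted above.
SAMPLE = """\
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 11
"""


def parse_whea_description(text):
    """Split 'Key: Value' lines into a dict (hypothetical helper).

    Lines without a colon, or with nothing after it, are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


fields = parse_whea_description(SAMPLE)
apic_id = int(fields["Processor APIC ID"])
print(fields["Error Type"], "on APIC ID", apic_id)
```

This only reshapes the text that Event Viewer already shows; it does not diagnose anything by itself.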

I was playing XDefiant when this crash happened.

How would I go about lowering the boost clocks to 4400? See if that prevents it?

Does it happen with XMP/DOCP disabled? CHE is usually caused by memory/IF if I am not mistaken.

Cache Hierarchy is not uncore. Cache Hierarchy is cores. I would look at RMA first, unless there's some custom CO settings or Vcore offset in play which it doesn't seem like.

You're thinking of WHEA 19, Bus/interconnect.
 
Should I try the 1.1V soc setting still? It is totally stock, yes.
 
Yeah, that's what I was thinking of, for sure. It's been a while; I haven't seen a WHEA error in years, and only when pushing IF. Glad you commented. I was going to tag you originally but didn't want to bother you; figured you'd see the post and comment if you wanted to.

Oddly, I have read of people swapping out Corsair RAM and fixing this issue, though.

I would try to RMA the CPU as well though.
 
1.1V
1.1v.png
 
Not really, because that has nothing to do with cores and thus nothing to do with Cache Hierarchy. You sure can, but 1.1V is high as hell in relation to 3200. You really shouldn't even need 1.0V at your settings.
 


If you want to try another kit, this is what I would go with... I don't think it will fix your issue, though. Just looking around the interwebs, the issue can be almost anything related to the CPU.

Some people have reported fixing the issue by RMAing the CPU, swapping out the RAM, or swapping the motherboard.

I would start with the CPU, then the RAM, and the mobo last, personally.

I've done over 100 Ryzen builds and never actually encountered this issue, only the interconnect variant when pushing IF.


Maybe pick one of these up just to verify that is the issue.


Return after verifying ofc.
 
I had a really strange case where my 5950X was throwing cache hierarchy errors too (among other errors). Swapped into another motherboard, it was just fine, and the new CPU I put in the old motherboard started having them too! So the motherboard might be the issue, but I wiped and redid the OS install and now everything works fine! :banghead: I still have more testing to do before ruling out the motherboard as the issue in my case. These issues are no fun to deal with. Amazon flagged the B550 Phantom Gaming-ITX/ax as a commonly returned item, so maybe there is something up with this motherboard, although it was working great for years without an issue.
I would try to RMA the CPU as well though.
Yea RMA.
 
I'll try that ram first, the G-Skill. Then I'll try my old r5 3600. Then I'll have to magic money out my ass for a mobo, lol.

RMAing will be difficult as I am agoraphobic.
 
If you find the 3600 works perfectly then your motherboard is probably just fine. Before you spend any money you might just try the 3600 first.
 
I'm getting the ram anyway. But next will be trying the 3600.
 
Yes, it can be a core defect (RMA only, really), but it's equally possible the motherboard is pushing bad Vcore settings, esp. at idle.

Unless the agoraphobia also prevents you from going to the post office, I don't see how it prevents an RMA?

Maybe things have changed in the last 2 years. But their depot was in Miami and you can ship via any carrier you want.
 
These issues are no fun to deal with.
Understatement of the year.

What might be my biggest heartache with PCs in about fifteen to twenty years specifically came from troubleshooting this exact issue around half a year ago (WHEA error 18, "cache hierarchy" type). During all of my research, I found basically everything suggested as a possible resolution. Some had success with changing the CPU, some with RAM, some with the motherboard, some with the PSU, and some with the video card.

The last might seem like the least likely given the error type. I also presumed it was a CPU or maybe platform-side issue (RAM or maybe even motherboard/power related). But in my case, the error showed up when changing the video card. And it only happened when the GPU was under light-medium or higher loads. It would be working fine, only to lose video signal, and then some seconds later it would restart (power was never off in the entire process, so the PC never shut down but simply restarted). No BSODs at all. Just Event Viewer logs (ID 18) and Windows WHEA logs (generic 0x124 for WHEA, and paired with them, four different video-related Watchdog logs).

I was lost as to whether the video errors were happening as a result of whatever the real issue was, or the other way around. The big clue was it started after changing the video card, but I wanted to rule out what may have been a possible hidden issue with my existing hardware that the video card change somehow brought about, before sending back a possibly good video card.

I tried just about everything under the sun from updating BIOS multiple times, changing no end of BIOS settings, changing driver versions, even reinstalling Windows. No change from any of it.

I tried swapping the CPU from a 5800X3D to my old 3700X to see if it was a CPU-side issue. No change.

I tried running with half the RAM (unfortunately, I didn't have different RAM to test). No change.

I tried disabling RAM profile speeds and running at default. It got worse somehow. Things that didn't cause the restart before (League of Legends, for example) now did. Things that did cause the restart now did so more often. This part in particular made me cry because it suggested a platform-side issue, but the issue didn't exist on my old video card, so I gave up platform troubleshooting for the time being and finally gave in to doing an RMA on the new video card, since that's where it started. I figured if the issue remained after an RMA (or if Sapphire said they found no issue), then I'd continue efforts on the platform, at least knowing the GPU "should be" fine. These "black screen restarts" were apparently very common on the 7800 XT from feedback online anyway, despite the card being a mere two months old at the time, so... maybe there was something to it being the cause of all this after all?

After the RMA, it was "almost entirely" resolved. I've had one reproducible case that still causes it (a particular version of Minecraft, and a modded one at that) and it seems... weird a game would cause a machine check exception if the hardware wasn't faulty to begin with, but... everything else that was doing it stopped. I'm still fearful the issue may return one day, but so far it hasn't. To say I'm still clueless as to what went on is an understatement.

So my main question to the thread starter would be this: have you changed anything recently?

If not, do you have spare parts you can swap as a test? In my case, this helped me rule things out. I swapped everything but the PSU (no spare), RAM (no spare), and motherboard (I did have my old one as a spare, which was an RMA return itself due to the original having a separate issue, but it'd be time and effort intensive to test that one, so I relegated it to after trying the video card RMA, since that's where the issue showed up).
 
I can go across the road to a corner shop, maybe 20 meters away, and I'm a complete wreck mentally and emotionally when I get home.
I've changed nothing apart from a nvme drive, which I added about a month ago.
 
If the motherboard were pushing bad Vcore, would increasing CPU LLC possibly take care of that issue?
 
Getting MSI B550 Tomahawk and 5700X3D end of the month. Problem will be fixed.
 
If you can do a B650/7600 combo it would give your platform way more legs. Unless the AM4 system is way cheaper it isn't worth investing in.

Screenshot (1).png
 
If nothing (major) changed, then I would personally start troubleshooting with the platform parts: namely the CPU, motherboard, and RAM. And it might not be a bad part, but a part that is merely faulting from a lack of power somewhere along the frequency/voltage curve?

Odd question, but the title says this is at stock; what happens if you apply the RAM profile? I ask because when I was having the issue with my first graphics card, I found it was less stable at stock (but still unstable at RAM profile speeds). If stock and profile speeds behave differently, it might give you a clue as to where to focus your troubleshooting efforts?

Also, look at the "Processor APIC ID" number in all of your event logs. Is this number the same, or is it always different? What the number indicates is "this is the CPU core where the machine check exception was caught". That doesn't mean the issue originates from the CPU though. If the number is sometimes different, it won't help much, but if it's the same, it might strongly indicate a particular core (or thread) is unstable. That could be because it's actually going bad, or just lacking voltage maybe.
 
My limited budget has decided for me.
The APIC IDs are 0, 3, 8 and 11. I can use XMP all day and be fine playing games. Then once it does this I panic and come here for help.
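
The APIC ID check suggested earlier in the thread (same number every time vs. a different one each time) can be sketched in a few lines of Python. This is only a rough tally under stated assumptions: the helper name `summarize_apic_ids`, the minimum of 3 events, and the 75% cutoff are all made up for illustration, not any official rule, and a "spread" result does not rule the CPU in or out.

```python
from collections import Counter


def summarize_apic_ids(apic_ids, min_events=3, share_threshold=0.75):
    """Say whether logged errors cluster on one APIC ID (hypothetical helper).

    min_events and share_threshold are arbitrary illustrative cutoffs.
    """
    if len(apic_ids) < min_events:
        return "too few events to say"
    counts = Counter(apic_ids)
    top_id, top_count = counts.most_common(1)[0]
    if top_count / len(apic_ids) >= share_threshold:
        return f"concentrated on APIC ID {top_id}"
    return f"spread across {len(counts)} APIC IDs"


# APIC IDs reported in this thread, one per logged event.
reported = [0, 3, 8, 11]
verdict = summarize_apic_ids(reported)
print(verdict)
```

With four events on four different IDs, the tally comes out as spread rather than pointing at a single core or thread, which fits the earlier comment that differing numbers won't narrow things down much.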
 
Only mentioned it because out here they are about the same price +/- 10%
It's not much more, £100 or so, but I don't have that. Thanks though!
 