
WHEA Logger Event ID 18

strikemaker (New Member · Joined Jul 24, 2022 · Messages: 10 (0.01/day))

Hey guys,

For weeks I've had problems with my PC. It shuts down randomly (90% of the time with a BSOD).

CPU: AMD Ryzen 7 5800X 3.80 GHz
Motherboard: MSI MPG X570 Gaming Edge Wifi
GPU: AMD Radeon RX 5700 XT Red Devil 8GB
Memory: Corsair DIMM 32GB DDR4-3600 (2x 18GB)

Samsung SSD 860 EVO 1TB
NVMe Samsung SSD970 SCSI Disk Device
KINGSTON SA400S37480G

First time today:
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0

Second time today:
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 14

Anyone that could help?
Updated my BIOS last week to current version.
CPU & GPU drivers are up to date.
 
Run your DDR4 at default speeds (ie, disable XMP) for a week and see if that makes a difference.

I've built several hundred Ryzen 3000 and 5000 computers and I've only seen the cache hierarchy error twice. Both times I identified it as a faulty CPU by swapping the CPU, and only the CPU, with another PC and seeing the problem move with the CPU.

AMD replaced both times under warranty.
 
That's some rare memory sticks you got there...

32GB (2x16GB) Corsair Vengeance RGB PRO DDR4-3600 RAM CL18 (18-22-22-42) Kit

https://www.cyberport.at/marken/corsair.html

Apparently XMP was already disabled.
I just enabled it. We'll see if that makes a change.

I'll need to check if I know someone with a rig that can fit my CPU to see if the problem also occurs there.
 

Attachments

  • WhatsApp Image 2022-07-24 at 11.29.28 PM.jpeg (413.8 KB)

Enabling it won't improve stability.
Sounds like a dodgy CPU but it can't hurt to update your motherboard BIOS to the latest version in case there is an issue with the version you're running.

Honestly though, the cache hierarchy error really is something you only tend to see when the CPU is faulty. It's not a symptom of something else like bad RAM or GPU, it's literally a CPU hardware error.

If it's only rebooting twice a day then it's marginal, and you may find you can correct for it by throwing a bit more voltage at the CPU. But the fact that you're getting errors at stock speeds means you should probably RMA it while you still have a warranty, provided you can prove the CPU is at fault.
 
Any OC, manual Voltage, Curve Optimizer or other tweaks?

If not, run OCCT's Memtest with AVX for at least 30-40 minutes.
 
WHEA Event ID 18

Error Type: Cache Hierarchy Error

Indicates some issue with the memory subsystem (DRAM <> memory controller = UMC <> data fabric = IF = Infinity Fabric).

Usually this type is caused by overclocking all the related parts (DRAM <> UMC <> IF), sometimes by an XMP/DOCP profile as well. I've never seen it at the default 2133MHz though, or at least I don't remember it...
Sometimes a particular combination of DRAM, board, and CPU can cause this too, even with no OC beyond XMP.
I'm not excluding a faulty CPU, as the UMC and IF are parts of the CPU package.

Can you post screenshots from the latest ZenTimings in both situations (XMP on/off)?

MCLK is DRAM speed
FCLK is IF speed
UCLK is UMC speed

1658734226216.png
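
To make the three clocks concrete, here is a rough Python sketch of the usual sanity check on a ZenTimings readout: the memory clock should be half the DDR rating (DDR4-3600 means an MCLK of 1800 MHz), and FCLK and UCLK would normally match it. The function names, the tolerance, and the sample readings are assumptions made up for illustration, not values taken from the screenshots in this thread.

```python
def expected_mclk_mhz(ddr_rating: int) -> float:
    """DDR moves data twice per clock, so the memory clock is half the rating."""
    return ddr_rating / 2


def clocks_in_sync(mclk: float, fclk: float, uclk: float,
                   tolerance_mhz: float = 1.0) -> bool:
    """True when FCLK and UCLK both match MCLK within the tolerance (1:1:1)."""
    return (abs(fclk - mclk) <= tolerance_mhz
            and abs(uclk - mclk) <= tolerance_mhz)


if __name__ == "__main__":
    # Hypothetical readings for a DDR4-3600 kit with XMP enabled.
    mclk = expected_mclk_mhz(3600)
    print("expected MCLK:", mclk)
    print("synced:", clocks_in_sync(mclk, fclk=1800, uclk=1800))
    # Hypothetical case where the memory controller runs at half speed.
    print("synced:", clocks_in_sync(mclk, fclk=1800, uclk=900))
```

A mismatch here would not prove anything about WHEA errors by itself; it only flags a readout worth a second look.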
 
I've never overclocked any of the parts, since I don't think it's necessary with these parts.
After the first couple of BSODs I read somewhere that underclocking the CPU could help, but after trying several different settings the only result was that the PC wouldn't boot anymore.

I still have a Ryzen 5 3600 from my old rig.
I could test if that one works. I only need to buy some new cooling paste.

Here are the 2 screenshots.

No OC. Only some underclocking after the first couple of BSODs, but then it wouldn't start anymore.
The only tweaks I did were fan related, for cooling purposes.

I'll try the Memtest later today.
 

Attachments

  • ZenTimings_Screenshot_XMP_off.png (32.7 KB)
  • ZenTimings_Screenshot_XMP_on.png (32.9 KB)

What happened with XMP enabled? Same?

BTW all looks normal in both your ZenTimings shots.
Try running a different CPU with this DRAM/board combo, and if you can borrow different DRAM sticks, try those with the same 5800X/board as well. Use DRAM between 3200-3600MHz.

Some Corsair kits are designed for Ryzen, but maybe most of them can cause such issues, again with certain CPU/board combos.
 
I have used Corsair Vengeance (not LPX) in several builds and not had the grief that LPX so often causes. It's still a small sample size of maybe a dozen machines, as I prefer to steer clear of Corsair RAM altogether given my ridiculous RMA rate with LPX.

I don't typically get cache hierarchy errors with bad RAM, but I guess it wouldn't hurt to run an OCCT memtest or even boot into Memtest86 to confirm that the RAM is actually stable. Plenty of the faulty Corsair LPX I've seen failed at JEDEC default speeds and voltage, so that part isn't too surprising to me.
 
might wanna consider re-seating RAM and maybe even changing slots.
 
@strikemaker could you fill out your System Specs?

Also what PSU are you powering your system with?

Some RAM, yes, but I've been using my mixed Geil Dragon DDR4 kits, with Hynix and Samsung B-die chips, at 3000MHz (the Hynix kit's speed), and both ran stable with my AMD Ryzen 9 3900X on an Asus ROG Strix B550-A Gaming. Now I'm on an Intel Core i7-11700K and a Gigabyte Z590 Vision G, which BSODs with either XMP or a manual 3000MHz tune on both kits. I've been talking with Gigabyte support, but they won't suggest anything and say it's a Geil problem if I can't run XMP or manually tune my RAM.

Maybe I'm just lucky with my Geil Dragon DDR4 RAM, since I only ran it at 3000MHz on an AMD Ryzen 3000 series CPU, or maybe Asus just did a good job with their BIOS and B550 board.
 
Nothing special happened with the memtest.

Temperatures also stayed way lower than when I'm gaming. The CPU can reach 70-75 Celsius.
I've ordered some MX4 cooling paste, so I can try the whole setup with a different Ryzen CPU.
 

Attachments

  • OCCT-Screenshot-20220725-134752.png (131.8 KB)

If you swap the CPUs, take a look at the pins and the socket. Normally a CPU doesn't die or malfunction that easily.
I know the good ol' idle crashes and similar (which can be worked around by changing the PSU Idle Control to Typical instead of Auto/Low),
but since you get BSODs I don't think that's the problem.
As the next step I'd reinstall Windows.
 
I checked the CPU 2 weeks ago. Pins are okay. Socket is okay. I put new cooling paste on it, and I thought that helped for a short period of time.

The PSU is 800 watts, more overkill than not enough.

Windows was reinstalled 2 weeks ago as well.
 
Try different RAM (not Corsair) and get back to us.
 
I own the whole Zen 3 lineup, and some CPUs twice.

I had a launch 5800X that crashed randomly, and I never fixed it.
You could try changing the PSU idle control.

In the BIOS, under CPU config, there should be an AMD CBS option.
In there is the Power Supply Idle Control mode, which should be on Auto.
Change that to Typical and see what happens. (I have two CPUs that need it on Typical to function normally; one was RMA'd and replaced.)
 
After switching this to typical i got this an hour later:

1658780379887.png
 
TPM error? or is that from the crash?
 
Corsair RAM has a long history of not liking Ryzen CPUs. As others have said, get some other RAM to test with. If the same issue occurs with different RAM then the CPU is the issue.
 
It's all from the crash. It just happened again.

Both times while playing Dota 2.

Cache Hierarchy type WHEA is strictly CPU core related. If it was Fabric/memory controller, it would be Bus/Interconnect type (WHEA 19 I think).

Either:
  • the board you have doesn't like your CPU and isn't giving it enough idle or load Vcore,
  • the BIOS you're currently on doesn't like your CPU and isn't giving it enough Vcore,
  • or your CPU is bad and needs to be RMA'd
I haven't heard of bent pins causing Cache Hierarchy errors, but you never know. A bad CPU is rare, but WHEA 18 used to happen quite a bit on earlier-production Ryzen 5000 (late 2020 or very early 2021).

In the meantime, you can try some different BIOSes, or raise the Vcore a bit with a slight positive Vcore offset (I don't remember exactly how the MSI BIOS setting is laid out), or use a positive Curve Optimizer offset on the cores matching the affected APIC IDs (they look like they're all over the place, so maybe just apply it to all cores).

Otherwise, you can try setting Power Supply Idle Control to Typical as mentioned. A more drastic measure would be to disable C-states entirely (the Global C-states setting, not DF C-states).

If all else fails, it's time to hit up AMD RMA. They will try to make you jump through troubleshooting hoops; just document what you've already done and insist that further troubleshooting is useless.
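
For anyone tallying these events, here is a minimal Python sketch that parses WHEA text in the format pasted in the first post and counts errors per core. The APIC-ID-to-core mapping assumes an SMT-enabled CPU where sibling threads get consecutive APIC IDs (so ID 14 would be core 7, thread 0); that layout is an assumption, and the helper names are made up for the example.

```python
import re
from collections import Counter

# Matches an "Error Type" line followed directly by a "Processor APIC ID" line,
# as in the event text quoted in this thread.
EVENT_RE = re.compile(
    r"Error Type:\s*(?P<type>.+?)\s*\n\s*Processor APIC ID:\s*(?P<apic>\d+)"
)


def parse_whea(text):
    """Return a list of (error_type, apic_id) tuples found in the text."""
    return [(m["type"], int(m["apic"])) for m in EVENT_RE.finditer(text)]


def apic_to_core(apic_id, threads_per_core=2):
    """Map an APIC ID to (core, thread). ASSUMPTION: consecutive SMT siblings."""
    return divmod(apic_id, threads_per_core)


def errors_per_core(events):
    """Count events per physical core under the mapping assumed above."""
    return Counter(apic_to_core(apic)[0] for _, apic in events)


SAMPLE = """\
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 0

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 14
"""

if __name__ == "__main__":
    events = parse_whea(SAMPLE)
    for core, count in sorted(errors_per_core(events).items()):
        print(f"core {core}: {count} error(s)")
```

If the counts cluster on one core, a per-core Curve Optimizer offset is the narrower test; if they are spread across cores, an all-core offset fits better.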
 
Just switched my RAM back to my oldies:
Crucial Ballistix Sport LT 16GB (2x 8GB) DDR4

Curious what will happen now.
That kit always worked together with my Ryzen 5 2600.
 

Attachments

  • ZenTimings_Screenshot.png (32.5 KB)

Is XMP disabled there (2400MHz), or is that their XMP speed?
 