PC crashes extremely frequently, WHEA-Logger ID 18

Joined
Jun 12, 2024
Messages
10 (0.02/day)
Location
Spain
Processor Ryzen 7 5700X3D
Motherboard MSI x570 gaming edge wifi
Cooling Thermalright Phantom Spirit 120 EVO
Memory 4x8 GB 3200 MHz Corsair
Video Card(s) PowerColor Hellhound RX 7900XT / RTX 2060 Super
Storage 500 GB NVMe SN750, 2 TB NVMe Crucial P5, 1 TB SATA SSD Crucial MX500, 4 TB HDD Seagate, 6 TB HDD Seagate
Display(s) Acer Nitro KG241YS 165hz / Samsung SyncMaster 933HD
Case Nox hummer ZX
Audio Device(s) Corsair Void Elite Wireless
Power Supply NZXT C1200
Mouse SteelSeries Rival 3 Wireless
Keyboard 860 Trust gaming
VR HMD Quest 3
Software Win 11 24H2
So, I don't know what is even going on with my computer. In late November it crashed randomly. At first I thought it was nothing, and it didn't happen again until early December, when it started happening multiple times a day. At first I was having some RAM problems, but a reseat fixed the problem temporarily. A few days after I reseated the RAM, the computer started crashing again. I reran memtest after the PC crashed, and it passed with no errors. Then I thought that maybe it was the OC+UV I did to the RX 7900 XT. I hadn't suspected it at first because I had been running the same OC for about 5 months without any problems, but I disabled the OC anyway. I was just watching YouTube when it happened again, and it crashed with the same WHEA 18 error. I also did a PSU and a motherboard swap, and it didn't change anything. I ran memtest once more two days ago, and the memory is fine. Can somebody help me find out what is causing this?

Also, the WHEA errors are Cache Hierarchy Errors, and the IDs of the processors affected are 13, 6, 0, 3, 4, 2, and 11. In a span of 10 days I got 12 of these errors in total.
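
One way to read that list of processor IDs is to map each logical processor to a physical core and see whether the errors cluster on one core or are spread around. The sketch below is only illustrative: it assumes the IDs are APIC IDs on an 8-core, 16-thread CPU with SMT enabled, and that the two threads of a core are numbered consecutively (so core = ID // 2). That numbering is an assumption, not something confirmed in this thread, and the per-ID error counts were not posted, so only the seven distinct IDs are used.

```python
from collections import Counter

# Distinct processor IDs reported in the post above.
REPORTED_APIC_IDS = [13, 6, 0, 3, 4, 2, 11]

# ASSUMPTION: with SMT on, two logical processors share one physical core
# and are numbered consecutively, so physical core = apic_id // 2.
THREADS_PER_CORE = 2


def cores_hit(apic_ids, threads_per_core=THREADS_PER_CORE):
    """Return the sorted physical core indices implied by the given IDs."""
    return sorted({apic_id // threads_per_core for apic_id in apic_ids})


def ids_per_core(apic_ids, threads_per_core=THREADS_PER_CORE):
    """Count how many of the listed IDs fall on each physical core."""
    return Counter(apic_id // threads_per_core for apic_id in apic_ids)


if __name__ == "__main__":
    print("cores hit:", cores_hit(REPORTED_APIC_IDS))
    print("IDs per core:", dict(ids_per_core(REPORTED_APIC_IDS)))
```

Under that assumed numbering the seven IDs land on six different cores, which would point away from a single bad core and toward something shared (cache, fabric, power, memory), though it does not identify which.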
 
My gut says RAM, still.

I wrote a post, then read the OP, and deleted it. Came back to post again.
I have had memtest pass, then fry a socket, and I have had memtest pass but the RAM still fail under combined loads like 3D or accelerated video.
 
My first thought was power and was going to suggest you swap in a different PSU. But you already did that.

Note that while software-based RAM diagnostic tools are good, none are 100% conclusive. If they report any errors, even one, the RAM is bad. While rare, on occasion they will report no problems, yet the RAM still fails in use, and/or when paired with other RAM. So, swapping in all new RAM is often the best test. But of course, not everyone has a bunch of spare RAM lying around, so, if able, you might try running with just a single RAM stick to see if it fails. Repeat the process with the remaining modules, hopefully identifying the bad stick through a process of elimination. Just be sure to unplug the computer from the wall and touch bare metal of the case interior BEFORE reaching for the RAM to discharge any destructive static in your body.

To conclusively test your RAM, you need to use sophisticated and very expensive test equipment, like this $2,495 Memory Tester (and that's for the cheap model)! So it is usually easier (and cheaper!) to swap in known good RAM and see what happens.
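
The single-stick process of elimination described above can be written down as a small decision helper. This is a minimal sketch, assuming each stick is run on its own for long enough to trigger the fault; the function name, the stick labels, and the example outcomes are hypothetical and are not results from this thread.

```python
def interpret_single_stick_runs(results):
    """Summarise one-stick-at-a-time runs.

    results maps a stick label to True if the system crashed or logged
    errors while only that stick was installed, and False otherwise.
    """
    failed = sorted(label for label, crashed in results.items() if crashed)
    if not failed:
        return "no stick failed alone: look at stick combinations, slots, or other parts"
    if len(failed) == len(results):
        return "every stick failed alone: a shared cause is more likely than one bad stick"
    return "suspect stick(s): " + ", ".join(failed)


if __name__ == "__main__":
    # HYPOTHETICAL outcomes for illustration only.
    example = {"stick 1": False, "stick 2": True, "stick 3": False, "stick 4": False}
    print(interpret_single_stick_runs(example))
```

The "every stick failed alone" branch matters as much as the others: if the crashes follow the system and not a particular module, swapping RAM further is unlikely to help.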
 
Downclock that RAM. Set it to 2400MHz or something and see what happens.
 
Have you tried turning XMP on and off?
 
Try downclocking your RAM or loosening the timings, or both.
 
My gut also says RAM. The 4x8 could be doing an absolute number on the memory controller.

If it's not the RAM then it's the SSD, but I would remove 2x8 first and see what happens.
 
"And once again I did a memtest two days ago and memory is fine."

What is your definition of doing a "memtest"?
At least 18 hours of Karhu or TestMem5 (Anta Absolut or PCBDestroyer), followed by at least a couple of hours of Prime95 large FFTs and Y-cruncher VT3, would be the minimum to actually verify stability.
 
Cache Hierarchy errors are almost always CPU Core instabilities. Try downclocking the CPU and see if it fixes it
 
"And once again I did a memtest two days ago and memory is fine."

I ran memtest86 for about 4 hours and it didn't give any errors, but I am running TestMem5 right now in another computer to test these 4x8 GB sticks of RAM, and I plan to run it for at least a day. For now I am using 2x8 GB of RAM in this PC until I finish testing these sticks.
 
I am going to do that right now. I will update later on what happens

No, I haven't tried it, but I am trying downclocking my RAM first.
Also try with 2 sticks in DIMM slots 2/4 (A2/B2) at their XMP profile.
 
Cache Hierarchy errors are almost always CPU Core instabilities. Try downclocking the CPU and see if it fixes it
Strangely, it can be an unstable VRAM OC on a discrete video card as well! It may also be due to the heating coming on. (Reminds me of 2020 and 2021, when I had my MSI B450 Tomahawk and Ryzen 7 3700X in the first bedroom I had in the current house: it seemed to crash multiple times in a short span, most likely whenever the heat was on in my room. I thought the room was extra warm!)

The forced-hot-water baseboard radiator was very close to the tower at that time, before I moved to my current bedroom in the same house in November 2021.

Also for system RAM, TestMem 5 with anta777 preset is the minimum for me!

Return the VRAM to stock on the RX 7900 XT, temporarily.
 
Event ID 18 is an uncorrectable hardware error, as opposed to Event ID 19 (or is it 17?), which is a soft, correctable hardware error and means the issue is there but not yet severe enough to stop the operating system/PC.

Are these crashes a loss of video to a black display until an eventual restart (sometimes nearly immediate, sometimes seemingly frozen on the Black display)?

A Ryzen system will often term it either a cache hierarchy error or a bus/interconnect error. The latter is more descriptive of where the issue may be, whereas the former is more of a guessing game (at least if the APIC ID varies, as in your case). This just means the machine check error was discovered by and within the CPU, not necessarily that the CPU is the cause. It could be a power issue, a CPU issue, a RAM issue, the motherboard, or the graphics card, so just about anything. It can be software insofar as BIOS/drivers are concerned, but it's usually not typical user-level software causing it (like the OS and its settings or applications), so messing with OS reinstalls usually gets nowhere with this; machine check exceptions are usually a hardware issue.

I had the same issue show up on my Ryzen system after swapping in a then-new 7800 XT, when the system had been stable with the previous video card. It only occurred in medium to heavy GPU-related things, but never light ones (like hardware-accelerated web browsing). I did a plethora of tests: swapping the CPU, swapping the graphics cards, removing half the RAM, disabling RAM profile speeds (which actually made the issue worse and made it occur in things it wasn't occurring in before), reducing GPU-side clock speeds (no improvement), and literally everything else I could think of. Only swapping the graphics card and then an eventual RMA resolved it.

OCCT's test suite proved useful in narrowing it down further. The "GPU variable" was the one that could cause it for me, and not even always.

You're on the right track by investigating things you recently changed. If the graphics card overclock isn't it, then I'd also echo the sentiments to "lighten the RAM configuration" by removing half the DIMMs and setting it to stock speeds.
 
When I had that "bombarded by one after the other" incident, it would do a fast black-screen-reboot. Often when launching a game. I think it happened when I wasn't ducting outside air in and had the radiator near the tower at that time, because I was cold, LOL.

IIRC, Ryzen especially does that with an unstable dedicated VRAM OC (not GPU core).
It's often an unstable "MCLK" on Radeon RX.
 
Also the WHEA errors are Cache Hierarchy Errors and the Id of the processors affected are 13,6,0,3,4,2,11 in a span of 10 days, I got in total 12 of these errors
Warranty the CPU.
GL
 
Do a fresh BIOS.
 
Yesterday I had two crashes while running the 2x8 GB sticks. I have just swapped the CPU and am going to send the R7 5700X3D in for warranty. For now I am using a Ryzen 7 3700X, and I haven't had any crashes since I swapped the CPU.
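
The changes reported in this thread can be laid out as a simple swap log to see where the evidence currently points. This is only a sketch of the bookkeeping: the entries are taken from the posts above, the names `SWAP_LOG` and `split_by_outcome` are made up for illustration, and "no crashes so far" after the CPU swap is an early observation, not proof.

```python
# (component, what was changed, did crashes continue after the change?)
# Entries summarise what was reported in the thread.
SWAP_LOG = [
    ("GPU overclock", "OC+UV disabled", True),
    ("PSU", "swapped", True),
    ("motherboard", "swapped", True),
    ("RAM", "reduced to 2x8 GB", True),
    ("CPU", "5700X3D replaced with 3700X", False),
]


def split_by_outcome(log):
    """Split components into (changes that did not stop the crashes,
    changes after which no crash has been seen yet)."""
    did_not_help = [component for component, _, still_crashing in log if still_crashing]
    no_crash_since = [component for component, _, still_crashing in log if not still_crashing]
    return did_not_help, no_crash_since


if __name__ == "__main__":
    did_not_help, no_crash_since = split_by_outcome(SWAP_LOG)
    print("changes that did not stop the crashes:", did_not_help)
    print("no crashes seen since changing:", no_crash_since)
```

Note that a change that did not help only rules out that particular change: disabling the GPU overclock does not clear the card itself, and running two sticks does not clear every stick.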
 