
Seemingly random BSODs/Crashes while gaming

bobshaniqua (New Member, joined Jan 3, 2022)
Over the past week, I have had issues with my PC either bluescreening or just outright crashing while playing video games.

My system began crashing with bluescreens on the 29th of December, however there have been several more recent occasions where a BSOD has not happened (thus no memory dump).
Crashes/bluescreens only seem to happen while playing video games, particularly the game Escape from Tarkov. It has also happened during a game of Halo. I played these games fine before the 29th Dec.
The system has not crashed while idle.
The crashes are completely random during games. Sometimes they can happen within 30 minutes, other times after hours of playing games.
Once a crash has happened, the system reboots itself and becomes noticeably more unstable, and further crashes can follow soon after. Powering off at the PSU, waiting, then restarting seems to fix the issue until the next crash when playing games.
One time after a crash, the system could not boot into Windows past the BIOS. It kept restarting in a loop trying to enter Windows repair mode. I could still enter the BIOS. Powering off and on at the PSU fixed this.

List of bluescreen errors I have had so far in order of occurrence:

0x0000001A MEMORY_MANAGEMENT

0x00000021 QUOTA_UNDERFLOW

0x0000000A IRQL_NOT_LESS_OR_EQUAL

Hard to say whether these are causes or symptoms of the issue. They all seem to be memory related. I only have the memory dump saved of IRQL_NOT_LESS_OR_EQUAL.
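For keeping a crash log as more bluescreens occur, the three stop codes listed above can be put in a small lookup. This is only a bookkeeping sketch: the names are the standard Windows bug check names, and the `describe` helper is a hypothetical name introduced here for illustration.

```python
# Stop codes reported in this thread, keyed by bug check value.
BUG_CHECKS = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001A: "MEMORY_MANAGEMENT",
    0x00000021: "QUOTA_UNDERFLOW",
}


def describe(code: int) -> str:
    """Format a bug check value the way it appears on the bluescreen."""
    name = BUG_CHECKS.get(code, "UNKNOWN")
    return f"0x{code:08X} {name}"


if __name__ == "__main__":
    # Order of occurrence as given in the post.
    for code in (0x1A, 0x21, 0x0A):
        print(describe(code))
```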

Things I have tried so far:
- Re-seated GPU
- Re-seated RAM
- Monitored temperatures during stress tests and games - all ok
- Ran memtest86 - Pass
- Tried to replicate crashes with stress tests on GPU & CPU - Nothing
- Updated GPU drivers


System specs:
Ryzen 9 5950X
Gigabyte Vision RTX3080 Ti
ASUS ROG Crosshair VIII Hero (Wi-Fi) ATX Motherboard
64 GB Corsair Vengeance RGB PRO SL (4x16GB) @ 3600 MHz XMP
Corsair 850W RM850x PSU
Samsung 980 Pro 1TB SSD - Windows boot drive
Samsung 980 Pro 2TB SSD - Secondary drive
Corsair H150i PRO RGB 360mm AIO
Windows 10 Pro 19043.1415

There are two events that happened before this that may give indication of cause:

1) Upgraded from 1080p monitor to 1440p monitor on the 25th Dec. Crashes started on the 29th Dec.

2) Now here's a kicker and it sounds bad because it is but: I got water in my PC.

My PC is sat on the floor as I am awaiting a desk that still has not arrived. I somehow managed to drop a water bottle in my lap that squirted water perfectly onto the top vent of my PC.
I immediately turned my PC off and dried as much water as I could. After inspection, the droplets seemed to have all landed in the same place on the top of the heat shroud of my GPU, with a few small droplets that bounced from there and landed on the motherboard shrouding.
I dried everything I could see with tissue and inspected my PC to make sure. I also re-seated my graphics card to inspect it. I rebooted and everything seemed ok. The amount of water that got into my system was not extensive.

This happened on the 28th of Dec. Crashes started on the 29th.


Now this could very well be the cause of the issues I am having, but the important thing for me is finding the cause. I can deal with paying for fixes and repairs.
The problem is that this issue is very hard to replicate: there doesn't seem to be a trigger other than playing intensive games for X hours. I have a second PC I can pinch parts from to do some fault finding, but this could take ages given there is no guaranteed method of causing a crash.

Does anyone have any ideas on what tests I can run, things I can try, or any probable culprits in order for me to get to the bottom of this issue? Any answers would be greatly appreciated. Thank you.
 
Do you know if any water got into the PSU?

Intermittent issues are the worst things to solve. If your parts are under warranty, I would return the graphics card, motherboard and PSU.
 
"0x0000000A IRQL_NOT_LESS_OR_EQUAL" was always an unstable CPU core in my experience; I have only seen it when overclocking CPU core(s). That includes Linpack failing one day when I was overclocking a Core 2 CPU for the first time, at or around 2.8 GHz on my E2180. It was fine after bumping up the Vcore. That was the same chip where I was lucky to get the low 2.9s.

That error was also all but guaranteed when running a game on my T-bird 900 (an early Socket 462 Athlon) at 1050 MHz back in 2002. I could never get that chip stable at 1050 MHz.
 
"0x0000000A IRQL_NOT_LESS_OR_EQUAL" was always an unstable CPU core in my experience, only seen by me when overclocking CPU core(s).
Like I said it's hard to tell whether this is the cause of the crash or merely a symptom caused by another unstable component. I do not have any overclocks on my system, only XMP enabled on RAM.

Do you know if any water got into the PSU?

Intermittent issues are the worst things to solve. If your parts are under warranty, I would return the graphics card, motherboard and PSU.
The case I have is a Lian Li O11 XL, one where the PSU sits behind the motherboard, therefore I don't see any way water could have gotten into it.

I am fine with returning parts under warranty, but I'd prefer to isolate the component responsible first, which is the main difficulty I am having.
 
Did those M.2 drives stay dry when this happened?
 
Right now I am leaning toward memory. What happens when you pull half the RAM out, clear the CMOS and try again? Another thing that sticks out is the PSU. With my system I can come close to maxing my 750, and I would imagine you can do the same.
 
Another thing that sticks out is the PSU.. with my system I can come close to maxing my 750, I would imagine you can do the same too..

I think this may be the issue - 3080Ti and the move to 1440p from 1080p may have pushed GPU power usage (and spikes) above what the 850w may be able to provide. Maybe drop back to 1080p and see if the problem continues.

When I moved from a 6800 to a 6900xt with a 5950x I also moved from an 850w to 1000w.
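The PSU theory can be sanity-checked with rough arithmetic. The sketch below is an estimate only, not a measurement of this system: the 350 W GPU figure is NVIDIA's reference board power for the RTX 3080 Ti, 142 W is the stock socket power limit for a 105 W TDP AM4 CPU, and both the 75 W allowance for everything else and the 1.5x spike multiplier are assumptions chosen for illustration.

```python
def estimate_draw(gpu_watts, cpu_watts, rest_watts, gpu_spike_factor):
    """Return (sustained, transient) system draw in watts.

    The transient figure scales only the GPU, since short GPU power
    spikes are the concern raised in the thread.
    """
    sustained = gpu_watts + cpu_watts + rest_watts
    transient = gpu_watts * gpu_spike_factor + cpu_watts + rest_watts
    return sustained, transient


PSU_WATTS = 850          # Corsair RM850x from the spec list

# Assumed figures, not readings from this machine:
GPU_WATTS = 350          # reference board power, RTX 3080 Ti
CPU_WATTS = 142          # stock PPT for a 105 W TDP AM4 CPU
REST_WATTS = 75          # guess: drives, fans, RAM, pump, motherboard
GPU_SPIKE_FACTOR = 1.5   # hypothetical multiplier for brief spikes

if __name__ == "__main__":
    sustained, transient = estimate_draw(
        GPU_WATTS, CPU_WATTS, REST_WATTS, GPU_SPIKE_FACTOR
    )
    print(f"Sustained: {sustained} W, headroom {PSU_WATTS - sustained} W")
    print(f"Transient: {transient:.0f} W, headroom {PSU_WATTS - transient:.0f} W")
```

Changing `GPU_SPIKE_FACTOR` shows how sensitive the conclusion is to the size of the spikes, which is the one number here that is hardest to know without measuring.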
 
Spells an issue with RAM to me as well.

The simplest way to tell is to turn off XMP and see if the issues are still there. Run the sticks at 2133/2400/2666, whatever the SPD is set for.

On the flip side, it could simply require a VDIMM bump or the SOC could need some more volts.
 
I think this may be the issue - 3080Ti and the move to 1440p from 1080p may have pushed GPU power usage (and spikes) above what the 850w may be able to provide. Maybe drop back to 1080p and see if the problem continues.

When I moved from a 6800 to a 6900xt with a 5950x I also moved from an 850w to 1000w.
3080 Ti can easily pull 300W when running at 1080p.

My bet is on memory, since it's 4x16, corsair and xmp at 3600.
 
Having read what you guys have said, I decided to run Prime95 CPU & RAM stress test along with FurMark stress test at 1440p for over a full hour straight. Not a single crash.

I'm not so sure about it being XMP, as I have had it enabled since August when I built my system and had zero issues with it up until now.

Sure, I could run the test for longer but it consumes a lot of power and heats up my room a lot. If it is unreproducible in this state for this long, I am not sure how I can figure this out.
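One way to see why a single crash-free hour says little about an intermittent fault: if crashes are assumed to arrive at a constant average rate under load (an exponential waiting-time model, which is an assumption and not something established in this thread), the chance of surviving a test of a given length can be computed directly. The 3-hour mean used below is a made-up figure loosely based on "sometimes within 30 minutes, other times after hours".

```python
import math


def pass_probability(test_hours, mean_hours_between_crashes):
    """Chance that a faulty system survives the whole test without crashing."""
    return math.exp(-test_hours / mean_hours_between_crashes)


def hours_needed(mean_hours_between_crashes, confidence=0.95):
    """Crash-free hours required before a pass is meaningful at this confidence."""
    return -mean_hours_between_crashes * math.log(1 - confidence)


if __name__ == "__main__":
    mean_hours = 3.0  # hypothetical average time between crashes under load
    print(f"P(1 h test passes anyway): {pass_probability(1, mean_hours):.0%}")
    print(f"Hours for 95% confidence:  {hours_needed(mean_hours):.1f}")
```

Under these assumptions a one-hour run passes most of the time even on a faulty system, and roughly three times the mean interval is needed before a clean run counts as evidence. A synthetic stress test also loads the hardware differently from a game, so even a long pass does not fully rule the fault out.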
 
Ryzen and memory can be kinda funny sometimes. If it happens again just refer to your thread and try some things the guys mentioned. This site is full of smart cookies..
 
Escape from Tarkov - last time I played it (was 8 or so months back) it kept giving me random crashes. It's the only game that I've played that's randomly crashed my 5900x since I updated my BIOS to one using AGESA 2.0.x (i think it's 2.0.1, might be 2.0.0). Before my BIOS update I got constant crashes on pretty much any game I played.

I wasn't a big fan of Escape from Tarkov, so I stopped playing it. I haven't had a BSOD gaming since. And the odd part was, I could play the game for a few hours, no crash and even play it more off and on over a handful of weeks.....then all of a sudden I run into a string of crashes while playing the game. Then I could go days or weeks without a crash, then a string of crashes would happen again and only when playing Escape from Tarkov. No other game since I updated my BIOS has caused a system crash.

I'm not saying that your case is the same as mine, but it sounds kind of similar. I'm not sure what the exact problem was, but I just stopped playing Escape from Tarkov because of it.
 
Having read what you guys have said, I decided to run Prime95 CPU & RAM stress test along with FurMark stress test at 1440p for over a full hour straight. Not a single crash.
and not one of those is for RAM testing. sure, cpu, imc on the cpu and PSU (furmark is useless!!!) but not memtest86.
 
Having read what you guys have said, I decided to run Prime95 CPU & RAM stress test along with FurMark stress test at 1440p for over a full hour straight. Not a single crash.

I'm not so sure about it being XMP as I have had it enabled since August when i built my system and I had 0 issues up until now with it.

Sure, I could run the test for longer but it consumes a lot of power and heats up my room a lot. If it is unreproducible in this state for this long, I am not sure how I can figure this out.
In my case, running an unstable RAM OC on Ryzen would result in random hard resets when playing Squad or anything memory intensive. You can easily verify this with Memtest64 or something similar. Also, try running chkdsk, sfc /scannow etc. to verify system integrity.
 
In my case, running an unstable RAM OC on Ryzen would result in random hard resets when playing Squad or anything memory intensive. You can easily verify this with Memtest64 or something similar. Also, try running chkdsk, sfc /scannow etc. to verify system integrity.
It's possible that unstable RAM (including VRAM) causes WHEA "Cache Hierarchy Error" events, especially on Ryzens.
(There is a good chance Windows will just reboot itself and log that error, instead of you seeing a BSOD on the monitor.)
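Since these reboots may leave no bluescreen or dump, the System event log is the place to look for them. The sketch below builds a `wevtutil` query for the most recent WHEA-Logger events and runs it only on Windows. The helper name `whea_query_command` is introduced here for illustration, and reading the log may require an elevated prompt.

```python
import platform
import subprocess


def whea_query_command(max_events=10):
    """Build a wevtutil command listing the newest WHEA-Logger events."""
    xpath = "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # filter on the WHEA-Logger provider
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]


if __name__ == "__main__":
    command = whea_query_command()
    if platform.system() == "Windows":
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout or "No WHEA-Logger events found.")
    else:
        print("Windows only; the command would be:", " ".join(command))
```

The same entries can be found by hand in Event Viewer under Windows Logs, System, filtered by the WHEA-Logger source.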
 
Zen2/Zen3 and those error messages mean unstable RAM timings or faulty RAM.

Drop down to JEDEC defaults by disabling DOCP or XMP and run Memtestx86 for at least one full cycle. If you get no issues, boot back into Windows and run like that for a day or two. Yeah, it'll be a bit slower at the 2133/2400MHz defaults but it'll still be 100% usable and if it runs without crashes like that then you've found your issue.
 
Zen2/Zen3 and those error messages mean unstable RAM timings or faulty RAM.

Drop down to JEDEC defaults by disabling DOCP or XMP and run Memtestx86 for at least one full cycle. If you get no issues, boot back into Windows and run like that for a day or two. Yeah, it'll be a bit slower at the 2133/2400MHz defaults but it'll still be 100% usable and if it runs without crashes like that then you've found your issue.
Unstable VRAM, possibly just because of the video card getting warm, can cause the same error! Sometimes, especially in a warm room, Windows 10 reboots itself and logs that error when just trying to get into a game!

That sometimes happened with my RX 5600 XT. Even a stable VRAM OC would suddenly become flaky, and I think I traced it to the memory only being able to handle very low temperatures. After that card got bricked for real, I autopsied it and found Micron VRAM. IIRC a person over at Igor's Lab said that it has Micron VRAM, but I never saw it myself until I tore the card down, mostly to scavenge thermal pads.

Does more recent Micron VRAM usually have that kind of characteristic? Needs low temps or it ends up being unstable, even when nowhere near critically high temps?
 
and not one of those is for RAM testing. sure, cpu, imc on the cpu and PSU (furmark is useless!!!) but not memtest86.

Zen2/Zen3 and those error messages mean unstable RAM timings or faulty RAM.

Drop down to JEDEC defaults by disabling DOCP or XMP and run Memtestx86 for at least one full cycle. If you get no issues, boot back into Windows and run like that for a day or two. Yeah, it'll be a bit slower at the 2133/2400MHz defaults but it'll still be 100% usable and if it runs without crashes like that then you've found your issue.
As I stated in my original post, I ran a full memtest86 run (lasting 7 hours) with zero errors or issues. This was with XMP enabled.

Is there something else that may reveal the instability issue?

Default speeds for my RAM are 3200 MHz, with 3600 MHz being the factory XMP.



Unstable VRAM, possibly just because of the video card getting warm, can cause the same error!
This card is known for having hot VRAM modules due to poor thermal interfacing. Looking at HWiNFO, my GPU Memory Junction temperatures can reach up to 90 °C. From what I can tell, however, it can cope at these temperatures.
 
As I stated in my original post, I ran a full memtest86 run (lasting 7 hours) with zero errors or issues. This was with XMP enabled.
i didn't reply to your OP, but it still doesn't mention how long or what speeds (XMP). based on what i replied to: NO, none of those tests would validate what you're looking for. try 8 hours A STICK.

you got a long ways to go.
 
i didn't reply to your OP, but it still doesn't mention how long or what speeds (XMP). based on what i replied to: NO, none of those tests would validate what you're looking for. try 8 hours A STICK.

you got a long ways to go.
Okay then. In that case I will make sure to do 10 or 12 passes this time. Hopefully it will be done by the time I get home. Better to test all the sticks together first, then per stick if I see issues.
 
As I stated in my original post, I ran a full memtest86 run (lasting 7 hours) with zero errors or issues. This was with XMP enabled.
My bad, I CTRL+F'ed for "x86"

The RAM timings are by far the most likely culprit for the list of symptoms and error messages you are seeing.

The cause of those RAM timing issues could be a BIOS update, software (like Ryzen Master or something in Asus Armoury Crate), degradation of your CPU's IMC, an FCLK divider being set wrong or... The list of causes is pretty big, but the general troubleshooting is to rule out all of those things.

Memtestx86 is a good way to find bad RAM timings but it's not perfect - I've definitely experienced unstable timings that cause crashes and pass Memtestx86 for an overnight run.

The thing that makes me suspicious is that you say your RAM defaults to 3200 - that is not what Corsair say. The defaults should be 2133MHz if this is your RAM. If you are running them at 3200MHz with SPD default timings I would 100% expect that to be unstable. It's this RAM, right?

I'd start with a BIOS reset to defaults, check that your RAM is running without XMP at the SPD defaults of 2133MHz, and uninstall any tuning software. If you want to check full RAM timings in Windows you can either use Thaiphoon Burner to do a full export or just use CPU-Z for a quicker, dirtier look at primary timings if you have that already installed.
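When checking speeds in CPU-Z, keep in mind that its DRAM Frequency field shows the real memory clock, which for DDR memory is half the advertised data rate. The sketch below shows the expected readings for the speeds mentioned in this thread. The 1:1 FCLK value assumes the fabric clock is running in sync with the memory clock, which is the usual aim on Zen 2/Zen 3 but is an assumption about this system's BIOS settings.

```python
def ddr_clocks(data_rate_mts):
    """Convert an advertised DDR data rate (MT/s) into clock figures in MHz."""
    memclk = data_rate_mts / 2  # DDR transfers twice per clock cycle
    return {
        "data_rate_mts": data_rate_mts,
        "memclk_mhz": memclk,        # roughly what CPU-Z shows as DRAM Frequency
        "fclk_1to1_mhz": memclk,     # fabric clock if running 1:1 (assumed)
    }


if __name__ == "__main__":
    # SPD default, the speed the OP recalled, and the XMP rating.
    for rate in (2133, 3200, 3600):
        clocks = ddr_clocks(rate)
        print(f"DDR4-{rate}: MEMCLK ~{clocks['memclk_mhz']:.0f} MHz, "
              f"1:1 FCLK ~{clocks['fclk_1to1_mhz']:.0f} MHz")
```

So a reading of about 1067 MHz means the kit is at the 2133 default, while about 1800 MHz means XMP 3600 is active.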
 
Okay then. In which case I will make sure to do 10 or 12 passes this time. Hopefully will be done by the time I get home. Better to run it collectively first then if I see issues, per stick.
you get a memory error but yet your testing concludes there is no memory error?

sorry, something is likely (but not necessarily) wrong with your testing.
 
I would test in Windows. Maybe with something a little more aggressive, like TM5, specifically the Anta Extreme profile. You have 64GB of RAM, so it's gonna take a while. 32GB takes a while.. the drawbacks to running high mem counts..
 
you get a memory error but yet your testing concludes there is no memory error?

sorry, something is likely (but not necessarily) wrong with your testing.
Sorry. I'm not sure if there has been a miscommunication somewhere but I haven't had any confirmation that this is a memory error. I'm still not sure what it is. It just seems to be what the running theory is in this thread at the moment. So far my memtests have passed, but I will do more testing.

Apologies. I may be wrong about the speeds. I am just going off of memory. I will double check later. Yes it is that RAM kit (the white version but there seems to be no difference).

I can try what you say, but I am struggling to recreate the crashes at the moment. I haven't had a single one today (yet).
 
Yeah, Memtestx86 is extremely thorough but it's also not really a stress test, it's more of a test for faulty silicon. Timing faults aren't always caught out by it so as @freeagent says; TM5 on anta777 profile.

If it's passing anta777 and you're getting very infrequent crashes, you should maybe check your mains power delivery. Old power strips or failing surge-protectors can do weird things that cause clock instability in otherwise stable systems. Perhaps you have something noisy plugged in on the same circuit like a fan or motorised gadget, or perhaps you have some arcing in that circuit that's causing occasional brownouts to exceed your PSU's hold-up time.

I dunno man, you've actually done most of the obvious stuff, and I've been doing this professionally for 20+ years so my "obvious" list is pretty damn comprehensive. Confirm it's BIOS defaults and 2133 and run TM5 anta777. Do you by any chance have another RAM kit to run for a few days?
 