All 4 Ram Sticks Suddenly have Errors, BSOD @ Startup

ir_cow · Nov 3, 2023

freeagent said:
Are you sure about that? I have not seen one person complain about B-Die on AM4.. As an outsider looking in, Samsung were the only ones to get it right?

Yes, I distantly remember the Ryzen 1000 series launch, when pretty much nothing worked but certain QVL kits, all because of the ICs used. This was before B-Die existed, I think. It was patched in later BIOS updates, and the only thing holding it back now is the max frequency. Plus it's an old CPU by today's standards.

lexluthermiester · Nov 4, 2023

Assimilator said:
Can y'all please stop insulting OP's intelligence by telling them to update BIOS or tweak voltages? Four memory modules going bad AT THE SAME TIME is astronomically improbable and thus very obviously a hardware failure - nothing that they can do will fix that. And the testing they've done verifies the modules are now bad.

OP, you've done excellent troubleshooting (pretty much the same that I would in this situation), and based on that it definitely sounds like something in your system has killed these modules. The obvious culprit is the CPU's memory controller (IMC), but "why" is a question that is going to be difficult to answer - especially since you said you tested with a different set of modules on the same CPU and things worked. My guess would be that there's a defect in your CPU IMC's silicon, that only comes into play when 4 modules are present i.e. highest load on the IMC, and over time that high load has caused that defect to worsen to the point where eventually it became bad enough that the IMC "slipped" and put too much voltage or current through your modules. The alternative is that your CPU is fine, but a cosmic ray caused the IMC to misbehave and kill the modules (no this is not a joke).

Either way, the safest option now is to RMA the CPU, because unfortunately it can't be trusted.

SlvrSurfer said:
Wow, thank you for that! Yes, I thought that could be the problem at first when I did my initial troubleshooting steps. I was happy to try out all of these BIOS settings so I wouldn't have to start RMA'ing components, but it sounds like contacting AMD is the proper thing to do at this point. Totally crazy about the cosmic ray thing, hopefully that's not what happened!

Once again, I have to agree with @Assimilator here; your testing indicates a CPU integrated memory controller problem. This was likely because of over-volting (which you may not have known about), but that is only my theory. Your CPU and RAM are likely both unrecoverable. RMA is your best option in each case.

Aquinus · Nov 4, 2023

lexluthermiester said:
Your CPU and RAM are likely both unrecoverable.

I wouldn't assume that the RAM is done for. The OP needs to rule the CPU out first. If the IMC took out the RAM, I wouldn't expect the machine to even boot.

TheoneandonlyMrK · Nov 4, 2023

I agree with the last few posts.

But no one asked about the 4x 16GB sticks.

Were they bought as a validated quad set, or were they two dual-stick sets?

Only a set validated and bought as a quad should have been run at XMP settings, since the timings will be different for four sticks than for two paired sticks.

I have experience with this: buy a quad set, OR set timings manually, OR leave them on auto, IMHO.

As I said, I agree it's probably too late now, but this is worth mentioning for next time in case you didn't know.

lexluthermiester · Nov 4, 2023

Aquinus said:
I wouldn't assume that the RAM is done for. The OP needs to rule the CPU out first. If the IMC took out the RAM, I wouldn't expect the machine to even boot.

They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.

TheoneandonlyMrK said:
Were they bought as a validated quad set, or were they two dual-stick sets?

That really shouldn't matter. I install mixed sets frequently with no ill effects.

TheoneandonlyMrK · Nov 4, 2023

lexluthermiester said:
They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.

That really shouldn't matter. I install mixed sets frequently with no ill effects.

That depends on what timings were set (in XMP). If it's just the first four, then it often doesn't matter indeed, but some sets go further.

And some settings move upwards with four sticks instead of two.

I've RMA'd three sets of RAM over the years and seen issues.

SPD tables can tell you. For example, tRFC set to auto was required for the last 2x2x8GB Corsair Vengeance 3600 kit I had.

Fine on two sticks, issues on four.

5000 series Ryzen has good memory support in general these days, but it can't defy physical differences: more rows and pages to read, write, and refresh means longer total cycle times, etc.

If those settings are on auto, it's obviously less of an issue; if XMP defines them, an issue is possible.
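
For anyone wondering why a refresh timing like tRFC is sensitive to the profile: the chip needs a fixed time in nanoseconds, but the BIOS sets it in memory clock cycles, so the same chips need a bigger number at a higher speed. A rough Python sketch of that conversion follows. The 350 ns figure is only an assumed example value, not something read from any kit in this thread, so check your own SPD data.

```python
import math


def trfc_cycles(trfc_ns: float, data_rate_mts: int) -> int:
    """Convert a refresh time in nanoseconds to memory clock cycles, rounded up."""
    # DDR memory does two transfers per clock, so DDR4-3600 runs an 1800 MHz clock.
    clock_mhz = data_rate_mts / 2
    return math.ceil(trfc_ns * clock_mhz / 1000)


# Assumed example value for illustration only; real values depend on the chips.
ASSUMED_TRFC_NS = 350

for rate in (3200, 3600):
    # Prints 560 cycles for 3200 and 630 cycles for 3600.
    print(f"DDR4-{rate}: tRFC >= {trfc_cycles(ASSUMED_TRFC_NS, rate)} cycles")
```

If a profile hard-codes a cycle count below what the chips actually need, that is the kind of setting that can look fine in one configuration and throw errors in another.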

Aquinus · Nov 4, 2023

lexluthermiester said:
They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.

I just re-read the thread and the OP has not tested the sticks in another system AFAICT and has only tested a different set of DRAM on the current machine.

SlvrSurfer said:
I did test two sticks of different Corsair Vengeance Pro 8GB 3200MHz in the proper slots earlier, and I did not get any errors.

This doesn't give me much confidence that it's the CPU or the RAM that's at fault. It could still be something as simple as timings being off for one reason or another. I still find it incredibly unlikely that all the DIMMs failed unless they've been running super hot or over-volted for extended periods of time.

This might be a really dumb suggestion, but I'm going to ask anyways. Since all of this started occurring, have you tried resetting the BIOS to all stock settings and testing again? If it's stable at stock, I would gradually start changing settings and doing a memory test after each change that could impact the memory.

SlvrSurfer said:
BSOD - Stop Code: IRQL_NOT_LESS_OR_EQUAL

I've seen this occur when CPU overclocks are unstable, not only when memory is unstable.

freeagent · Nov 4, 2023

It is pretty hard to kill a CPU, you have to really go out of your way and try. I don't think people understand just how tough they really are..

I am not saying it cannot happen.. but I do not think it is a CPU problem. Heck, I have seen a Molex grounding out cause all kinds of stability issues. It was to the point I thought my 5900X really was on its way out. Nothing was stable, no settings. Everything would fail, with good voltage showing in HWiNFO..

It could be many things.. at this point he has to do some digging.

ir_cow · Nov 4, 2023

SlvrSurfer said:
I did test those same sticks on my friend’s system and I was seeing errors on all 4 sticks. I also tested a spare set of ram on my rig and there was no errors and Windows booted without any issues.

Here is key information people missed...

No point in RMAing the RAM if it isn't the problem. However, what you are saying is that it doesn't work on another computer and another set does.

Aquinus · Nov 4, 2023

ir_cow said:
Here is key information people missed...

I'm not sure how I missed that. That would seem to indicate that the RAM is bad. I just have a hard time believing that all 4 sticks are faulty. The odds of that are incredibly small unless the OP just went YOLO and cranked the DRAM voltage to high heavens and roasted the DIMMs.
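
To put a rough number on "incredibly small", here is a back-of-the-envelope sketch in Python. The 1% per-stick failure rate is purely an assumed figure for illustration, not a measured one, and the math assumes each stick fails independently of the others. A common cause like over-volting breaks exactly that assumption, which is why it is the more believable explanation.

```python
def all_fail_independently(p_single: float, n_sticks: int) -> float:
    """Probability that all n sticks fail if each fails independently with probability p_single."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    return p_single ** n_sticks


# Hypothetical per-stick failure probability over the period in question.
ASSUMED_P_SINGLE = 0.01

p_four = all_fail_independently(ASSUMED_P_SINGLE, 4)
print(f"Chance of 4 independent failures: about {p_four:.0e}")
```

With those assumed numbers it works out to roughly one in a hundred million, so four unrelated failures at once is not a serious candidate.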

ir_cow · Nov 4, 2023

Aquinus said:
I'm not sure how I missed that. That would seem to indicate that the RAM is bad. I just have a hard time believing that all 4 sticks are faulty. The odds of that are incredibly small unless the OP just went YOLO and cranked the DRAM voltage to high heavens and roasted the DIMMs.

Oh, I agree. Never had 4 bad sticks on their own.

Aquinus · Nov 4, 2023

ir_cow said:
Oh, I agree. Never had 4 bad sticks on their own.

I'd even go further and say that I've never lost more than a single DIMM at once. I've never had a pair fail together.

I still think the OP should try resetting the BIOS to completely stock settings and try again without any additional changes. If it's stable at stock, it's entirely possible something else is going on.

davidm71 · Nov 4, 2023

SlvrSurfer said:
Hello, I am having an issue where all 4 of my 16GB DDR4 3200mhz sticks now have errors when running memtest and computer fails to load without crashing quite often when I reset. I verified the errors on all sticks by testing them on another working PC. I am at a loss of how this could have happened, and when I first purchased these sticks a few months ago (Corsair Vengeance Pro), I did run memtest, and received zero errors with 4 passes. I am worried that if I get new modules, that they can become damaged as well, perhaps due to a faulty motherboard or maybe CPU. I do live in Florida, and I do have a whole home surge protector and I did not notice any other symptoms of a surge, but that is a possibility.

Any ideas how to fix this? Any way to diagnose the motherboard or CPU memory controller? Thanks for reading!

Some more details-

- Computer is stable when gaming, and when using high memory usage programs like DCS & Flight Sim 2020

- Computer only has issues when restarting where it would crash as soon as it starts to load Windows - BSOD - Stop Code: IRQL_NOT_LESS_OR_EQUAL What Failed: ntoskrnl.exe

- When running Ram in stock mode (no overclock), it will load further into Windows, and after a couple of crashes, it does load, then is stable

- When running Ram in XMP (or whatever AMD calls it), I have been able to successfully load windows and run said programs as above by removing all ram, and reseating it.... don't know if that's just a placebo

- Memory errors from memtest do occur no matter if ram is overclocked or not.

- Have tried each individual stick in all slots. Errors are consistent throughout

Specs-

4x 16gb 3200mhz DDR4

5800x3D

Asus B550-F Mobo

Nvidia 4090

Super Flower Leadex III Gold 850

Dump File- https://www.mediafire.com/file/9wfi34p5cy66r10/dumps.zip/file

Have you tried increasing the VSOC voltage? Or try dropping the tCL settings to a lower rung? Loosen your secondary timings?

I have a system that suddenly, one day a few months back, became totally unstable. It crashed often and had memtest errors. I had to loosen my CAS timings from 15 to 16. It was the only way.

The memory controller on my 9900K just couldn't handle it anymore. Maybe it's the same thing here.

lexluthermiester · Nov 5, 2023

ir_cow said:
Here is key information people missed...

Ninja'd..

freeagent said:
It is pretty hard to kill a CPU, you have to really go out of your way and try.

This is true. The thing is, with the early generations of the Ryzen line, motherboard makers were setting voltages WAY too high. I've seen a few burned out Ryzen chips and every single time, it was the voltage that was out of whack. Of course it's only been once or twice a year, so yeah it's very uncommon but it does happen.

ir_cow said:
Oh, I agree. Never had 4 bad sticks on their own.

I saw this happen once with a CPU from a socket 1366 board, but again, it was the voltages that killed the CPU & RAM. Too much voltage will kill things.

LabRat 891 · Nov 5, 2023

I've had a similar scenario happen to me w/ Crucial Ballistix DDR2 (Vista era).
Issue there was RAM being binned wrong: the chips used were not as voltage/heat-tolerant as needed for the marketed voltage.
-ended up going thru 2 sets, both ran @ 'advertised stock'.
Note: Some motherboards run DIMM and/or IOD voltage a lil high, too.

As Assimilator mentioned, the CPU/IMC is suspect.
However IMO, all 4 'going bad' would/could implicate the motherboard. (ex. bad caps or malfunctioning RAM power phases.)

Also, any lightning strikes or severe power surges recently?
Years ago I had a nearby lightning strike decimate multiple USB-connected USRobotics 56K modems.
Through my 'retail tech work' I'd also seen more than a couple PCs 'ded' after lightning strikes. (Not just the PSU, either)

All that said (and related to my last comment), some mobos will 'load compensate' voltages on the CPU/SoC's IMC too.
(AFAIK, too-high IOD/IMC voltage has been implicated in both damaged RAM and damaged IMCs.)

edit:
from your testing with the RAM in another PC, and 'other RAM' working fine:
I'd guess the sticks 'degraded' under voltage/thermal conditions (possibly related to mobo LLCing high), or they were killed in a single 'freak' incident (lightning, etc.).

chrcoluk · Nov 5, 2023

To me this points to a MB or IMC issue, based on all 4 sticks testing bad. If it's having issues at out-of-the-box settings (BIOS reset, no XMP, no voltage adjustments, etc.), I would be looking to prove either the CPU or the board. If neither can be proven due to lack of spare parts, I would go with RMA'ing the CPU. The RMA team will probably then test the IMC on your behalf and confirm whether it's the issue. If it is, you get a replacement; if it isn't, then RMA the board.

RJARRRPCGP · Nov 5, 2023

Looks like the CPU has wonky stability issues. The symptoms remind me of what I would expect with bad CO settings.

A Computer Guy · Nov 5, 2023

RJARRRPCGP said:
Looks like the CPU has wonky stability issues. The symptoms remind me of what I would expect with bad CO settings.

The OP did mention that they had -30 CO on the CPU. I am curious, though, whether the other system where the RAM also didn't work was running CO in the same way?

Aquinus · Nov 5, 2023

LabRat 891 said:
I'd guess the sticks 'degraded' under voltage/thermal conditions (possibly related to mobo LLCing high), or they were killed in a single 'freak' incident (lightning, etc.).

LLC is an interesting one. Depending on how you tune it, voltage changes could spike. Mix that with shoddy DRAM voltage control and you might be exposing DRAM to high voltages.

After sleeping on this, I'm starting to think that this is a configuration problem that resulted in some dead DIMMs. Purposefully or not, my guess is that the DRAM profile for the RAM might be translating to too much voltage.

SlvrSurfer said:
- When running Ram in XMP (or whatever AMD calls it), I have been able to successfully load windows and run said programs as above by removing all ram, and reseating it.... don't know if that's just a placebo

Looking for clarity on this point. Have you been running the default XMP/EXPO profile this entire time? If you have, could you tell us what settings it is applying for the DRAM? Timings, voltages, etc.

I'm not going to lie, it almost sounds like the DRAM could have been overvolted OOTB with whatever profile was being used. Mix that with some less-than-ideal LLC settings or manual DRAM voltage offsets and you could be in dangerous territory. This is kind of the same reason why the DRAM on my SB-e tower has never seen more than 1.65v. Out of every component I could alter the voltages on, DRAM was the one I was the most careful with. Not to mention, at least in the past, that higher DRAM voltages could damage the IMC as well.

Either way, if it is a configuration or memory profile problem, I would be very careful about using another set of the same kind of memory. You might apply the same settings and roast those as well.


