
All 4 Ram Sticks Suddenly have Errors, BSOD @ Startup

Are you sure about that? I have not seen one person complain about B-Die on AM4.. As an outsider looking in, Samsung were the only ones to get it right? :confused:
Yes, I distantly remember the Ryzen 1000 series launch, when pretty much nothing worked but certain QVL kits, all because of the ICs used. This was before B-die existed, I think. It was patched in later BIOS updates, and the only thing holding it back now is the max frequency. Plus it's an old CPU by today's standards.
 
Can y'all please stop insulting OP's intelligence by telling them to update BIOS or tweak voltages? Four memory modules going bad AT THE SAME TIME is astronomically improbable and thus very obviously a hardware failure - nothing that they can do will fix that. And the testing they've done verifies the modules are now bad.
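To put a rough number on "astronomically improbable", here is a back-of-the-envelope sketch. The per-module failure probability is a made-up illustrative assumption, not a measured rate, and the calculation only holds if the failures are independent (no shared cause such as the CPU, board, or a surge).

```python
# Hypothetical chance that one module fails on its own within the period.
# This figure is an assumption for illustration only.
P_SINGLE = 0.01
MODULES = 4


def all_fail_independently(p_single: float, n: int) -> float:
    """Probability that all n modules fail, assuming independent failures."""
    return p_single ** n


p_all = all_fail_independently(P_SINGLE, MODULES)
print(f"Chance of {MODULES} independent failures: {p_all:.1e}")
```

Even with a generous per-stick figure, the combined probability collapses very quickly, which is why a common cause is the more plausible explanation.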

OP, you've done excellent troubleshooting (pretty much the same that I would in this situation), and based on that it definitely sounds like something in your system has killed these modules. The obvious culprit is the CPU's memory controller (IMC), but "why" is a question that is going to be difficult to answer - especially since you said you tested with a different set of modules on the same CPU and things worked. My guess would be that there's a defect in your CPU IMC's silicon, that only comes into play when 4 modules are present i.e. highest load on the IMC, and over time that high load has caused that defect to worsen to the point where eventually it became bad enough that the IMC "slipped" and put too much voltage or current through your modules. The alternative is that your CPU is fine, but a cosmic ray caused the IMC to misbehave and kill the modules (no this is not a joke).

Either way, the safest option now is to RMA the CPU, because unfortunately it can't be trusted.
Wow, thank you for that! Yes, I thought that could be the problem at first when I did my initial troubleshooting steps. I was happy to try out all of these BIOS settings so I wouldn't have to start RMA'ing components, but it sounds like contacting AMD is the proper thing to do at this point. Totally crazy about the cosmic ray thing, hopefully that's not what happened!
Once again, I have to agree with @Assimilator here: your testing indicates a CPU integrated memory controller problem. This was likely because of over-volting (which you may not have known about), but that is only my theory. Your CPU and RAM are likely both unrecoverable. RMA is your best option in each case.
 
Your CPU and RAM are likely both unrecoverable.
I wouldn't assume that the RAM is done for. The OP needs to rule the CPU out first. If the IMC took out the RAM, I wouldn't expect the machine to even boot.
 
I agree with the last few posts.

But no one has asked about the 4x 16GB sticks:

Were they bought as a quad validated set, or were they two dual stick sets?

Because only a set bought as a validated quad kit should have been run at XMP settings, since the timings will be different for two paired sticks than for four.

I have experience with this: buy a quad set, OR set the timings manually, OR leave them on auto, IMHO.

As I said, I agree it's probably too late now, but this is worth mentioning in case you didn't know for next time.
 
I wouldn't assume that the RAM is done for. The OP needs to rule the CPU out first. If the IMC took out the RAM, I wouldn't expect the machine to even boot.
They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.

Were they bought as a quad validated set, or were they two dual stick sets?
That really shouldn't matter. I install mixed sets frequently with no ill effects.
 
They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.


That really shouldn't matter. I install mixed sets frequently with no ill effects.
That depends on what timings were set (in XMP). If it's just the first four, then indeed it often doesn't matter, but some sets go further.

And some settings move upwards with four sticks instead of two.

I've RMA'd three sets of RAM over the years and seen issues.

SPD tables can tell you. For example, tRFC set to auto was required for the last 2x 2x8GB Corsair Vengeance 3600 kit I had.

Fine on two sticks, issues on four.

5### series Ryzen has good memory support in general these days, but it can't defy physical differences: more rows and pages to read, write, and refresh equals longer total cycle times, etc.

If those settings are on auto, it's obviously less of an issue; if XMP defines them, an issue is possible.
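As a rough illustration of why a fixed tRFC number can be fine in one configuration and too tight in another: the refresh time the chips need is a physical duration in nanoseconds, while the BIOS programs it in memory clock cycles. A minimal sketch of that conversion follows; the nanosecond values are illustrative assumptions, not figures read from the kits discussed here.

```python
import math


def trfc_cycles(trfc_ns: float, data_rate_mts: int) -> int:
    """Convert a refresh cycle time in ns to memory clock cycles (rounded up)."""
    # DDR transfers twice per clock, so DDR4-3600 runs an 1800 MHz memory clock.
    mem_clock_mhz = data_rate_mts / 2
    return math.ceil(trfc_ns * mem_clock_mhz / 1000)


# Illustrative (assumed) refresh times: denser chips generally need longer.
for trfc_ns in (350, 550):
    for rate in (3200, 3600):
        print(f"{trfc_ns} ns at DDR4-{rate}: {trfc_cycles(trfc_ns, rate)} cycles")
```

If a profile hard-codes a cycle count that is too low for the chips actually installed, the refresh is cut short, which is the kind of problem that leaving the setting on auto avoids.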
 
They tested the RAM in a separate system and had errors, so the RAM being bad is a fair conclusion. I wouldn't be willing to trust it.
I just re-read the thread and the OP has not tested the sticks in another system AFAICT and has only tested a different set of DRAM on the current machine.
I did test two sticks of different Corsair Vengeance Pro 8gb 3200hz in the proper slots earlier, and I did not get any errors.
This doesn't give me much confidence that it's the CPU or the RAM that's at fault. It could still be something as simple as timings being off for one reason or another. I still find it incredibly unlikely that all the DIMMs failed unless they've been running super hot or over-volted for extended periods of time.

This might be a really dumb suggestion, but I'm going to ask anyways. Since all of this started occurring, have you tried resetting the BIOS to all stock settings and testing again? If it's stable at stock, I would gradually start changing settings and doing a memory test after each change that could impact the memory.
BSOD - Stop Code: IRQL_NOT_LESS_OR_EQUAL
I've seen this occur when CPU overclocks are unstable, not only when memory is unstable.
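The "start at stock, change one thing, retest" approach suggested above can be sketched as a simple loop. This is only an outline of the procedure: `passes_memtest` is a hypothetical stand-in for the manual step of rebooting and running a memory test, and the setting names in the example are placeholders taken from this thread.

```python
from typing import Callable, Optional


def first_unstable_change(
    changes: list[str],
    passes_memtest: Callable[[list[str]], bool],
) -> Optional[str]:
    """Apply changes one at a time on top of stock; return the first that fails."""
    applied: list[str] = []
    # Baseline: if stock settings already fail, settings are not the cause.
    if not passes_memtest(applied):
        return "stock"
    for change in changes:
        applied.append(change)
        if not passes_memtest(applied):
            return change
    return None  # every change passed


# Pretend result for demonstration: the memory profile is what breaks things.
def fake_memtest(applied: list[str]) -> bool:
    return "XMP profile" not in applied


changes = ["Curve Optimizer -30", "XMP profile", "manual DRAM voltage"]
print(first_unstable_change(changes, fake_memtest))
```

A result of "stock" would point back at hardware, while any other result names the setting to look at first.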
 
It is pretty hard to kill a CPU, you have to really go out of your way and try. I don't think people understand just how tough they really are..

I am not saying it cannot happen.. but I do not think it is a CPU problem. Heck, I have seen a Molex grounding out cause all kinds of stability issues. It was to the point I thought my 5900X really was on its way out. Nothing was stable, no settings. Everything would fail, with good voltage showing in hwinfo..

It could be many things.. at this point he has to do some digging.
 
I did test those same sticks on my friend’s system and I was seeing errors on all 4 sticks. I also tested a spare set of ram on my rig and there was no errors and Windows booted without any issues.
Here is key information people missed...

No point in RMAing the RAM if it isn't the problem. However, what you are saying is that it doesn't work on another computer, and that another set does work on yours.
 
Here is key information people missed...
I'm not sure how I missed that. That would seem to indicate that the RAM is bad. I just have a hard time believing that all 4 sticks are faulty. The odds of that are incredibly small unless the OP just went YOLO and cranked the DRAM voltage to high heavens and roasted the DIMMs.
 
I'm not sure how I missed that. That would seem to indicate that the RAM is bad. I just have a hard time believing that all 4 sticks are faulty. The odds of that are incredibly small unless the OP just went YOLO and cranked the DRAM voltage to high heavens and roasted the DIMMs.
Oh, I agree. Never had 4 bad sticks on their own.
 
Oh, I agree. Never had 4 bad sticks on their own.
I'd even go further and say that I've never lost more than a single DIMM at once. I've never had a pair fail together.

I still think the OP should try resetting the BIOS to completely stock settings and try again without any additional changes. If it's stable at stock, it's entirely possible something else is going on.
 
Hello, I am having an issue where all 4 of my 16GB DDR4 3200MHz sticks now have errors when running memtest, and the computer quite often crashes while loading when I restart. I verified the errors on all sticks by testing them on another working PC. I am at a loss as to how this could have happened. When I first purchased these sticks a few months ago (Corsair Vengeance Pro), I ran memtest and received zero errors over 4 passes. I am worried that if I get new modules, they could become damaged as well, perhaps due to a faulty motherboard or maybe the CPU. I do live in Florida and have a whole-home surge protector, and I did not notice any other symptoms of a surge, but that is a possibility.

Any ideas how to fix this? Any way to diagnose the motherboard or CPU memory controller? Thanks for reading!



Some more details-

- Computer is stable when gaming, and when using high memory usage programs like DCS & Flight Sim 2020

- Computer only has issues when restarting where it would crash as soon as it starts to load Windows - BSOD - Stop Code: IRQL_NOT_LESS_OR_EQUAL What Failed: ntoskrnl.exe

- When running the RAM in stock mode (no overclock), it will load further into Windows, and after a couple of crashes, it does load and is then stable

- When running the RAM in XMP (or whatever AMD calls it), I have been able to successfully load Windows and run the programs above by removing all the RAM and reseating it.... don't know if that's just a placebo

- Memory errors from memtest occur whether the RAM is overclocked or not.

- Have tried each individual stick in all slots. Errors are consistent throughout





Specs-

4x 16GB 3200MHz DDR4

5800X3D

Asus B550-F Mobo

Nvidia 4090

Super Flower Leadex III Gold 850

Dump File- https://www.mediafire.com/file/9wfi34p5cy66r10/dumps.zip/file
Have you tried increasing the VSOC voltage? Or dropping the tCL setting down a rung? Or loosening your secondary timings?

I have a system that suddenly, one day a few months back, became totally unstable. It crashed often and had memtest errors. I had to loosen my CAS latency from 15 to 16; it was the only way.

The memory controller on my 9900K just couldn't handle it anymore. Maybe it's the same thing here.
 
Here is key information people missed...
Ninja'd.. :laugh:

It is pretty hard to kill a CPU, you have to really go out of your way and try.
This is true. The thing is, with the early generations of the Ryzen line, motherboard makers were setting voltages WAY too high. I've seen a few burned out Ryzen chips and every single time, it was the voltage that was out of whack. Of course it's only been once or twice a year, so yeah it's very uncommon but it does happen.

Oh, I agree. Never had 4 bad sticks on their own.
I saw this happen once with a CPU from a socket 1366 board, but again, it was the voltages that killed the CPU & RAM. Too much voltage will kill things.
 
I've had a similar scenario happen to me w/ Crucial Ballistix DDR2 (Vista era).
Issue there was RAM being binned wrong: the chips used were not as voltage/heat-tolerant as needed for the marketed voltage.
-ended up going thru 2 sets, both ran @ 'advertised stock'.
Note: Some motherboards run DIMM and/or IOD voltage a lil high, too.

As Assimilator mentioned, the CPU/IMC is suspect.
However IMO, all 4 'going bad' would/could implicate the motherboard. (ex. bad caps or malfunctioning RAM power phases.)

Also, any lightning strikes or severe power surges recently?
Years ago I had a nearby lightning strike decimate multiple USB-connected USRobotics 56K modems.
Through my 'retail tech work' I'd also seen more than a couple PCs 'ded' after lightning strikes. (Not just the PSU, either)



All that said (and related to my last commentary), some mobos will 'load compensate' voltages on the CPU/SoC's IMC too.
(AFAIK, too high an IOD/IMC voltage has been implicated in both damaged RAM and damaged IMCs.)

edit:
from your testing with the RAM in another PC, and 'other RAM' working fine:
I'd guess the sticks 'degraded' under voltage/thermal conditions (possibly related to mobo LLCing high), or they were killed in a single 'freak' incident (lightning, etc.).
 
To me this points to a motherboard or IMC issue, based on all 4 sticks testing bad. If it's having issues at out-of-the-box settings (BIOS reset, no XMP, no voltage adjustments, etc.), I would be looking to prove either the CPU or the board. If neither can be proven due to lack of spare parts, I would go with RMA'ing the CPU. The RMA team will probably then test the IMC on your behalf and confirm whether it's the issue: if it is, you get a replacement; if it isn't, then RMA the board.
 
Looks like the CPU has wonky stability issues. The symptoms remind me of what I would expect with bad CO settings.
 
Looks like the CPU has wonky stability issues. The symptoms remind me of what I would expect with bad CO settings.
The OP did mention that they had -30 CO on the CPU. I am curious, though, whether the other system where the RAM also didn't work had the same CO scenario?
 
I'd guess the sticks 'degraded' under voltage/thermal conditions (possibly related to mobo LLCing high), or they were killed in a single 'freak' incident (lightning, etc.).
LLC is an interesting one. Depending on how you tune it, voltage changes could spike. Mix that with shoddy DRAM voltage control and you might be exposing DRAM to high voltages.

After sleeping on this, I'm starting to think that this is a configuration problem that resulted in some dead DIMMs. Whether it was set purposefully or not, I'd guess that the DRAM profile for the RAM might be translating to too much voltage.
- When running the RAM in XMP (or whatever AMD calls it), I have been able to successfully load Windows and run the programs above by removing all the RAM and reseating it.... don't know if that's just a placebo
Looking for clarity on this point. Have you been running the default XMP/EXPO profile this entire time? If you have, could you tell us what settings it is applying for the DRAM? Timings, voltages, etc.

I'm not going to lie, it almost sounds like the DRAM could have been overvolted OOTB with whatever profile was being used. Mix that with some less-than-ideal LLC settings or manual DRAM voltage offsets and you could be in dangerous territory. This is kind of the same reason why the DRAM on my SB-e tower has never seen more than 1.65v. Out of every component I could alter the voltages on, DRAM was the one I was the most careful with. Not to mention, at least in the past, that higher DRAM voltages could damage the IMC as well.

Either way, if it is a configuration or memory profile problem, I would be very careful about using another set of the same kind of memory. You might apply the same settings and roast those as well.
 