Guys, the reported issue here is clearly a memory corruption issue, so the GPU, PSU, CPU cooler, drivers etc. aren't the issue.
The most common culprits for such problems are:
- Overclocked CPU (including running power limits above spec)
- Overclocked memory
- Defective memory (I believe you may have eliminated this one, but it's at least fairly easy to check with a proper MemTest86/MemTest86+ run)
- Degraded/defective IMC
- Corrosion on contacts (highly unlikely on a one-year-old PC)
Ram: Corsair Vengeance RGB 2x16GB DDR5 5200MHz
Can you please give me the precise SKU, as there are several variants of this product?
What do you mean by degraded? How can it happen?
All microchips degrade over time; that's why they all eventually stop working as intended. Degradation may come from production issues, but it is also accelerated by voltage, heat, and prolonged load. Degradation is why someone can achieve a "stable" OC today, but the PC crashes a few months from now.
Right... For these reasons, I always say: buy a CPU with an iGPU...
Who doesn't have an old GPU lying around?
The Windows memory test you ran has two modes: quick and extended. If you already did the extended test, you're good.
The Windows memory test isn't very extensive. I'd recommend running either MemTest86 or MemTest86+ for a few hours. It's usually not required to run it for days, but any such utility will still not catch all timing issues.
For the SSD, there's SMART. Install the software from the manufacturer; it will tell you about general drive health and the number of errors (a small number is nothing to worry about).
Even though the SSD isn't causing the corruption of the memory heap (so any issue found here is a separate, unrelated problem), I want to add that any SMART error should be taken seriously, as it's a sign of a pre-failing drive. SSDs are prone to becoming unstable long before their "endurance" is depleted. Running a diagnostic/SMART test on a drive yearly is smart, and probably twice a year for your OS drive.
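
If you'd rather script that periodic check than click through a vendor tool, here is a rough Python sketch. It assumes smartmontools 7.0 or newer is installed (for `smartctl --json`) and that the JSON field names below match what your version emits; I'm going from memory on those, so verify against your own output. The function names, the device path, and the choice of "watched" attributes are just my own illustration, not anything official.

```python
import json
import subprocess


def smart_concerns(report: dict) -> list[str]:
    """Return warnings found in a parsed `smartctl --json -a` report.

    Field names are assumptions based on smartmontools' JSON output;
    missing fields are simply skipped.
    """
    concerns = []

    # Overall pass/fail verdict reported by the drive itself.
    if report.get("smart_status", {}).get("passed") is False:
        concerns.append("overall SMART health check failed")

    # NVMe drives: health log counters (assumed field names).
    nvme = report.get("nvme_smart_health_information_log", {})
    if nvme.get("critical_warning", 0) != 0:
        concerns.append(f"NVMe critical warning flags: {nvme['critical_warning']}")
    if nvme.get("media_errors", 0) > 0:
        concerns.append(f"{nvme['media_errors']} media errors logged")

    # SATA drives: flag any non-zero raw value on a few sector-related
    # attributes (5 = reallocated, 197 = pending, 198 = uncorrectable).
    watched_ids = {5, 197, 198}
    for attr in report.get("ata_smart_attributes", {}).get("table", []):
        raw = attr.get("raw", {}).get("value", 0)
        if attr.get("id") in watched_ids and raw > 0:
            concerns.append(f"{attr.get('name', attr.get('id'))} raw value is {raw}")

    return concerns


def read_report(device: str) -> dict:
    """Run smartctl on `device` and parse its JSON output (usually needs admin/root)."""
    # smartctl uses a bitmask exit status, so a non-zero code is not
    # necessarily a crash; we parse stdout regardless.
    result = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Usage would be something like `print(smart_concerns(read_report("/dev/nvme0")))`, where `/dev/nvme0` is a placeholder for your own drive. An empty list means nothing was flagged by this sketch, not that the drive is guaranteed healthy.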