Zen 3 Machine Check Exception Cache hierarchy reboots are back...

BIOS up to date
Tested on two B550 Strix boards
Two 5900X, one 5800X (stock...)
Tested with 5 GPUs
2 PSUs
All existing BIOS versions
Several clean installations of Windows 10
Swapped SSD
Swapped RAM
Ran every existing stress test in the last 6 weeks
Disabled Core Boost, set Power Supply Idle Control to Typical Current Idle, disabled C-states. Tried everything any forum ever listed as some secret fix, etc.

I have three Zen 3 chips and every single one has random black screens, reboots, or crashes back to desktop.
Event Viewer repeats the same cache hierarchy error every time.

For around two weeks I have been getting random idle/low-load reboots with my second 5900X.
(I played around 35 hours of different games with no issues.)
Now I play WoW (like 5% CPU load),
browse the internet, watch a YT video etc. Random black screens, reboot, welcome! Here is another cache hierarchy error.

The problem is not reproducible; sometimes it does not appear for 48 hours and sometimes it appears 10 times a day.

I swapped my 3080 for a 6900XT and the issue got a bit worse (could be random and unrelated, since it is so rare in the first place).

I ordered a Z590 bundle from a local store (Z590 Strix F and a 10900K for 699€... actually a steal imo).


I just found this while writing...
AMD Ryzen 5000 'Zen 3' Desktop CPUs & X570 Motherboards Have High Failure Rates (wccftech.com)


Any other ideas? Can it be related to something else, even if it is a hardware error that is strictly bound to the CPU? I never had it on any Intel platform or Zen 2 CPU... only with Zen 3, and a 100% failure rate out of three chips.
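
To put numbers on "not for 48 hours, sometimes 10 times a day", one option is to note the timestamp of each cache hierarchy error from Event Viewer and tally them. A minimal Python sketch, assuming the timestamps have been copied out by hand as "YYYY-MM-DD HH:MM:SS" strings; the function name and the sample values are made up for illustration and are not real log data:

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_events(timestamps):
    """Return (errors per day, longest gap between consecutive errors)."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    # Count how many errors fall on each calendar day.
    per_day = Counter(t.date().isoformat() for t in times)
    # Time between each error and the next one.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    longest_gap = max(gaps, default=timedelta(0))
    return dict(per_day), longest_gap


# Hypothetical sample timestamps, for illustration only.
sample = [
    "2021-02-10 09:15:00",
    "2021-02-10 13:40:00",
    "2021-02-12 14:00:00",
    "2021-02-12 20:30:00",
]

per_day, longest_gap = summarize_events(sample)
for day, count in sorted(per_day.items()):
    print(day, count)
print("longest quiet period:", longest_gap)
```

A tally like this makes it easier to see whether a change (BIOS version, GPU swap, voltage offset) actually shifted the error rate or whether the difference is just noise.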
 
Very likely that AMD will have you RMA the CPU.
 
that would be the 4th time in a row...

every single chip had the same problem within a few weeks.
 
Uh oh! What about the motherboard?
 
Sounds like other issues. Like bad VRM circuits on the mobo, or a bad PSU, or bad memory.

Or try another board vendor. Or you just have shit luck.
 
Did you read my post? :)
 
Disable C-states in the BIOS.
 

Damn, that's friggin unlucky. The lower QC is debatable according to some, but it really isn't so hard to see if you look at the resurgence of Cache Hierarchy and Bus/Interconnect errors over the past few months, especially on earlier AGESAs.

Anyhow, if you've already disabled Global C-states, there's not much more to try. Could try an older/newer BIOS with a different AGESA, as sometimes a certain release has wonky idle voltage characteristics. Could try a positive Vcore offset, but whether that remains safe on your chip and board I couldn't tell you. Could also sacrifice some boost clock to run a lower all-core clock (4.3-4.5 GHz) at safe voltages, but depending on what you're after, you might consider that not getting your money's worth out of the product.

Another option to try is to disable Low Current Idle by setting it to Typical instead, but that's usually not got much to do with Cache Hierarchy of all things.

But aside from that, there's really not much you can do. Cache Hierarchy is not new; it's been pretty infamous for Zen 2 owners too. If the above options don't work, the only proven solution is to RMA.
 
I guess your BIOS version is 1805, AGESA 1.2.0.0? Did you try an older version like v1216, AGESA 1.1.0.0 D?
Did you run HWiNFO in sensor mode at all times, and if yes, what version?
 

While I omitted your issue because I don't even know what a "cache hierarchy" is, I will say that I read that "article" you linked and if I'm being completely honest, it reeks of complete BS. Not to mention I've never even heard of "PowerGPU" and just from reading the comments, they sound shady as fuck anyway. Now, if a trusted site like Anandtech, TPU, or Techspot published an article about Zen 3/ B550 X570 motherboard failure rates, I'd put a great deal more stock into it because I know they would all have the evidence to back it up.
 
The first Google result page (of over 6000) when typing in cache hierarchy error.

And since many people don't seem to read my post:

I tested every existing BIOS
I tested a LOT of GPUs
TWO power supplies
Two SSDs
Two RAM kits
Three Zen 3 CPUs
and all the hardware on an Intel platform (where everything works perfectly fine).

Blank Windows 10 installations with the latest chipset driver.

The issue appears around 3-4 weeks after the CPU first ran.
The first time, it happens completely at random and only once, and then it gets worse and worse (I even had a reboot while writing this...).
The CPU gets RMA'd, I get a new one, and the new CPU dies within a few weeks.


PowerGPU is not a small system builder, and they have a quarter million followers on Twitter.


Yuri (who makes the DRAM Calculator and ClockTuner) answered:
Processors that work but have poor FCLK overclocking capabilities or have incorrect CPPC tags relative to FIT and temperature are not counted.

My first 5800X was not even able to show a single core in Ryzen Master (the "Cores" list was empty).
A month later it died.

My 5900X worked perfectly fine for around a month until it wasn't even able to run Cinebench (completely stock).

My replacement 5900X crashes nonstop and cannot run P95 without rounding errors after around 5-10 minutes.
Random reboots, cache hierarchy errors and tons of system lockups.

Screenshot 2021-02-15 115355.png
 
I would stay with Intel; you have bad luck with the new 5000 series from AMD. Maybe wait until the next refresh? Looks like you have wasted a lot of hours, and that last image you posted reminds me of a serial killer with the hundreds of pictures of the victims. Honestly, what else can you do, besides inform others so they don't rip their hair out like you might have? Luck of the draw... and I would not advise any gambling!
 
Only one model of motherboard? I would suspect that before CPU if you get three bad CPUs in a row.
 
I would stay with Intel; you have bad luck with the new 5000 series from AMD. Maybe wait until the next refresh? Looks like you have wasted a lot of hours, and that last image you posted reminds me of a serial killer with the hundreds of pictures of the victims. Honestly, what else can you do, besides inform others so they don't rip their hair out like you might have? Luck of the draw... and I would not advise any gambling!
:D

Yeah, I don't know... I am very, very happy with Zen 3, but the nonstop problems are so disappointing that I'd rather run a Skylake+++++++ i3 than a 5950X.

Only one model of motherboard? I would suspect that before CPU if you get three bad CPUs in a row.
The brand-new CPUs work for like a month without any problems until it appears.
And I actually replaced the board with another one (a Strix F and a Strix E).
 
Again, it is much more likely that you got two bad motherboards than three bad CPUs in a row; maybe the boards are killing the CPUs. If you still want to sink time and money into it, try a different brand, maybe even an X570 board (I don't know if they have better VRMs).

Are the tested PSUs new or stuff you had laying around?
 
I just built this new AMD system and had a weird issue where my sound settings in the taskbar would only work sometimes. Thought maybe it was an overclock issue or a Windows issue; nope, turned out there is a setting for the PCIe slots. After I set the video card slot to PCIe 4 and the other to PCIe 3, the issue went away. I love the performance of the new system but have been spoiled by things that work correctly. They (AMD) have been out of the spotlight for some time, and I feel they rushed this to give us what we have been expecting from them for the past 6-8 years.
 
This motherboard model is causing the death of the CPUs. By overvolting probably due to a fault in its sensors. And yes, I saw that you tried 2 of the same. Nothing to do with that. A whole production stack can be faulty until they recall it and make a new revision. Has happened many times especially with the motherboards.
 
Do you happen to have a HWiNFO sensors screenshot taken while running full load?
A window like the one below, where all CPU/board info (full sensor name and cur/min/max/avg) is visible?

1613392767596.png

Also a ZenTimings one...

1613392805571.png
 
Can you dump all your BIOS settings here? it would be a good start.
 
Is this a known issue?
Nothing becomes known until someone finds out. ;)

Joking aside, this type of behaviour is classic for specific batches of a motherboard model that kill CPUs, and many such have existed until now. And since the PSU has been changed, it is highly likely that the board is causing the CPU deaths through excessive voltage.
 
I haven't experienced this with my 5950X on a Crosshair VIII. Touch wood I don't in the future! I've had this chip at least a couple of months now and it's been pretty solid.

Good luck sorting out your issue!!!!
 
What caused your issue?
 
every single chip had the same problem within a few weeks.

And you don't find anything odd about that at all? You have to accept eventually that getting a bad CPU three times in a row is practically impossible.

There is another fault somewhere else, it's clear that your issue is degenerative and not the result of an outright defective CPU out of the box.

From what I understand you only used ASUS boards, that should draw your attention more than the CPUs. Perhaps something about them makes them kill the CPUs over time, like a defective VRM assembly or buggy BIOS messing up the voltages.
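
As a rough back-of-the-envelope check on the "three in a row" point: if defects were independent and each CPU had a defect rate p, the chance of n bad chips in a row would be p^n. A small Python sketch; the rates below are placeholders picked for illustration, not measured failure rates for Zen 3, and the function name is made up:

```python
def prob_all_defective(defect_rate: float, n: int = 3) -> float:
    """Chance that n independently drawn CPUs are all defective."""
    if not 0.0 <= defect_rate <= 1.0:
        raise ValueError("defect_rate must be between 0 and 1")
    # Independent events multiply, so n bad chips in a row is rate ** n.
    return defect_rate ** n


# Hypothetical per-chip defect rates, for illustration only.
for rate in (0.01, 0.02, 0.05):
    p = prob_all_defective(rate)
    print(f"defect rate {rate:.0%}: three bad in a row = {p:.6%}")
```

Even at an assumed 5% defect rate, three independent bad chips would be about a 1-in-8,000 event, which is why a shared cause (board, BIOS, PSU) is worth ruling out first. The independence assumption is the weak point: if one batch of CPUs or boards is affected, the real odds look very different.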
 
You make a good point. OP, if I were you, I'd try a board from GIGABYTE, a board from MSI and something like a Steel Legend board from ASRock (their lower-end stuff isn't all that great) if you can. If not, at least try one board from another manufacturer besides ASUS.
 