
Unable to get system stable when ram is at 3600mhz

Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Hey all! I haven't been around here in a while, but I'm quite frustrated with getting my RAM to work as intended, so here I am. The story is that I bought a BLD from NZXT and it came with some pretty great RAM, but in a 4x8GB configuration. After some research I concluded that I should have gone with 2x16GB instead, so I sold that RAM and got 2x16GB. I now think I was wrong in my interpretation, but whatever.

On to my current dilemma: my system passes Karhu using XMP settings, but it is not completely stable. It reboots randomly, either under load or at idle, and logs a WHEA 18 error. Based on my research, my thinking is that this is related to IF instability, but I'm not completely sure. I also tried the settings below from Ryzen Master and didn't get a WHEA error, but I came back to my computer at one point and it was off. I'm not sure what that indicates, but it does worry me a bit. I turned it back on and it booted right up with the same settings applied in the BIOS.

As of today I have updated to the latest beta BIOS for my motherboard to see what effect that has, but I'm not super optimistic. Do you guys have any advice on what to try here? I'm not really looking to do anything hardcore or extreme; I really just want my RAM to work at 3600. I do plan on getting the 5950X when it drops, but that could be a matter of days or weeks depending on how it plays out. HELP!

TL;DR: what can I do to get my RAM to work at 3600MHz?

My specs are as follows:

MSI gaming edge wifi x570
3950x
32gb ddr4 3600 ram CMW32GX4M2Z3600C18
3080FE

rams.png
 
Joined
Sep 3, 2019
Messages
2,965 (1.78/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 150W PPT limit, 79C temp limit, CO -9~14
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F37h, AGESA V2 1.2.0.B
Cooling Arctic Liquid Freezer II 420mm Rev7 with off center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MHz 1.42V CL16-16-16-16-32-48 1T, tRFC:288, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~465W (366W current) PowerLimit, 1060mV, Adrenalin v24.2.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR1000
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v23H2, OSB 22631.3155)
Hi there!
What are your current settings?

Instead of writing them down, if you like and have the time, post screenshots of the following:

1603819649322.png 1603819684306.png 1603819715368.png

It will give people here all the info they need to help you.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Thanks for replying man! Here are some screenshots. The HWINFO one may be a bit overboard but i couldn't figure out how to condense it quickly. At the moment I am using Ryzen Master safe settings on the latest beta bios as I saw it mentioned some OC fixes and some memory stability improvements. I'm open to any suggestions though and again, thanks for replying!

zen.PNG
hwinfo.PNG


tb.PNG
 
Joined
Sep 3, 2019
Messages
2,965 (1.78/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 150W PPT limit, 79C temp limit, CO -9~14
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F37h, AGESA V2 1.2.0.B
Cooling Arctic Liquid Freezer II 420mm Rev7 with off center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MHz 1.42V CL16-16-16-16-32-48 1T, tRFC:288, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~465W (366W current) PowerLimit, 1060mV, Adrenalin v24.2.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR1000
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v23H2, OSB 22631.3155)
For starters I would suggest enabling "Gear Down Mode" in the BIOS.
It's the one that shows as "GDM: Disabled" in ZenTimings.

This is free stability at the expense of a little latency. It may also give you headroom for higher speed and/or tighter timings.

Is your DRAM voltage around 1.4~1.42V as HWiNFO shows for DIMM sensor?
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Thanks! When I last went into the BIOS, I noticed that gear down mode was greyed out. The option was there and set to auto I believe but I couldn't select/change it. Any idea why that might be? And yes, I have set the voltage to 1.4 as per ryzen calculator recommendations.
 
Joined
Sep 3, 2019
Messages
2,965 (1.78/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 150W PPT limit, 79C temp limit, CO -9~14
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F37h, AGESA V2 1.2.0.B
Cooling Arctic Liquid Freezer II 420mm Rev7 with off center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MHz 1.42V CL16-16-16-16-32-48 1T, tRFC:288, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~465W (366W current) PowerLimit, 1060mV, Adrenalin v24.2.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR1000
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v23H2, OSB 22631.3155)
Thanks! When I last went into the BIOS, I noticed that gear down mode was greyed out. The option was there and set to auto I believe but I couldn't select/change it. Any idea why that might be? And yes, I have set the voltage to 1.4 as per ryzen calculator recommendations.
Did you manually set Command Rate to 1T? If yes then set Cmd to auto and then see if GDM will light up.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Did you manually set Command Rate to 1T? If yes then set Cmd to auto and then see if GDM will light up.

Yep, that must be it. I'll check that out if the system restarts again. Do you think exploring GDM is necessary if the system passes several 1000% karhu tests? The memory itself, at least in my non-expert opinion seems perfectly fine and I believe I've used both xmp profiles previously and passed karhu > 6000% but ultimately the restart/WHEA 18 error persisted in those cases.
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
Shouldn't be a GDM issue. Best to have it on, but if the ICs can't sustain GDM off, that'll be the first thing to show up in Karhu or HCI. Heck, instability due to GDM off will consistently show even in the super short built-in HCI test contained in Ryzen DRAM Calculator's membench. If it can pass extended testing in Karhu or HCI, the RAM is most definitely not the problem.

Event 18 has me a little suspicious. If the Event Viewer details look like this:

3733 interconnect error.png

Then it's an IF issue. That's what I get and the exact behaviour I get when running 3733 and 3800. The memory is stable, but even at SoC over 1.10V, the Infinity Fabric occasionally throws a hissy fit and reboots, crashes, BSODs or does some other wack shit. So far, I've not found any way to eliminate these problems above 3600.

Regardless, that VSoC needs to come up to 1.10V asap to rule it out as a possible issue. 1800MHz/3600MT/s may be doable at 1.05V or less, but it's hardly guaranteed. Get SoC to 1.1V (just set that in the BIOS, it'll droop a bit in actual reported V in Windows but that's fine), then spend a few days using your comp.

Worst part about IF errors is that they can't be caught in testing. They show up days, weeks down the line when you least expect it.
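
For reference, the 1800MHz/3600MT/s pairing above comes from running the Infinity Fabric 1:1 with the memory clock. A minimal sketch of that arithmetic, assuming 1:1 (coupled) mode; the function name is made up for illustration:

```python
def fclk_for_1to1(mt_per_s: int) -> float:
    """Infinity Fabric clock (MHz) needed to run 1:1 with DDR4 at a given transfer rate.

    Assumption: coupled (1:1) mode. DDR transfers data twice per clock,
    so the memory clock is half the MT/s figure.
    """
    return mt_per_s / 2


# Speeds mentioned in this thread
for speed in (3200, 3600, 3733, 3800):
    print(f"DDR4-{speed}: FCLK {fclk_for_1to1(speed):.1f} MHz")
```

So 3733 and 3800 ask the fabric for roughly 1866 and 1900 MHz, which is where the interconnect errors described above started to appear on that system.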
 
Joined
Sep 3, 2019
Messages
2,965 (1.78/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 150W PPT limit, 79C temp limit, CO -9~14
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F37h, AGESA V2 1.2.0.B
Cooling Arctic Liquid Freezer II 420mm Rev7 with off center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MHz 1.42V CL16-16-16-16-32-48 1T, tRFC:288, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~465W (366W current) PowerLimit, 1060mV, Adrenalin v24.2.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR1000
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v23H2, OSB 22631.3155)
I would agree with the above. Set 1.1V for VSOC and see how it goes. DRAM Calculator's built-in membench is a really nice, fast way to detect errors (2~3min). If it passes that, then you are really close to 99~100% stability.

Right now ZenTimings sees ~1.03V for VSOC. The SVI2 TFN sensor is the closest to the real thing that you can get. The same sensor is in HWiNFO: "SoC voltage (SVI2 TFN)".
That reading is right if you've set it manually to 1.05V like the calculator suggested. Board SoC VRMs usually output less than the actual setting, so if you set it to 1.1V you will see a value of 1.05~1.08V.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Thanks ya'll! I actually can't recall whether I did minimum or recommended VSOC this go around but I will absolutely check that if it happens again. From there, if it continues then I can assume that means this 3950x can't do 1800 IF? In that case, would I be better off lowering my ram to 3200 and tightening timings?
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
Thanks ya'll! I actually can't recall whether I did minimum or recommended VSOC this go around but I will absolutely check that if it happens again. From there, if it continues then I can assume that means this 3950x can't do 1800 IF? In that case, would I be better off lowering my ram to 3200 and tightening timings?

You could do that, but I would be extremely surprised if a 3950X couldn't do 1800MHz IF, especially one manufactured recently. For this very reason, on Asus and Gigabyte boards, setting 3600MT/s XMP RAM automatically dictates 1.10V VSoC as an auto-rule. I don't know why it's not the case for MSI.

As for 3200 and tightening timings, ICs don't exactly work that way. Keeping tCL the same and scaling frequency UP is always easier than tightening tCL below what it's rated for. Hence why 3200 14-14-14 B-die kits exist.
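
To put rough numbers on that, here is a small sketch that converts tCL into nanoseconds at a given speed. It only looks at CAS latency and ignores every other timing, so treat it as a simplification; the function name is illustrative:

```python
def cas_latency_ns(mt_per_s: int, tcl: int) -> float:
    """Approximate CAS latency in nanoseconds.

    One memory clock lasts 2000 / (MT/s) nanoseconds, because DDR
    performs two transfers per clock.
    """
    return tcl * 2000 / mt_per_s


# The OP's kit is rated 3600 CL18; compare against some 3200 options.
for speed, tcl in ((3600, 18), (3200, 18), (3200, 16), (3200, 14)):
    print(f"DDR4-{speed} CL{tcl}: {cas_latency_ns(speed, tcl):.2f} ns")
```

By this measure, dropping to 3200 at the same CL18 is slower in absolute terms, 3200 CL16 only matches 3600 CL18, and 3200 CL14 is a real tightening that the ICs may or may not accept.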

Once you get into higher MT/s speeds, you'll find that DRAM calculator's recommendations on various voltages can be really......broad and optimistic. When in doubt, set VSoC at 1.1V and cut it out of the equation.

Do take a look at your Event Viewer and try to find details for the exact event at the time of the crash. WHEA can report a lot of hardware errors and CPU/Interconnect is just one of many possible CPU errors (e.g. unstable CPU overclock due to insufficient voltage, the dreaded cache errors, etc.).
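
If there are a lot of events to go through, something like the sketch below can pull the relevant fields out of event text copied from Event Viewer. The sample message is an assumption about how a WHEA-Logger Event 18 typically reads, not a log from this system, and the "hint" strings only restate the rough split discussed in this thread:

```python
# Assumed layout of a WHEA-Logger Event 18 message; the values are illustrative.
SAMPLE_EVENT = """A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 12"""


def parse_whea_event(message: str) -> dict:
    """Split 'Key: value' lines of an event message into a dictionary."""
    fields = {}
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def rough_hint(error_type: str) -> str:
    """Map the error type to the rough suspects named in this thread (not a diagnosis)."""
    if "Interconnect" in error_type:
        return "look at Infinity Fabric / VSoC"
    if "Cache Hierarchy" in error_type:
        return "look at the CPU core side (voltage, firmware, possibly RMA)"
    return "unclassified, check the full event details"


event = parse_whea_event(SAMPLE_EVENT)
print(event["Error Type"], "->", rough_hint(event["Error Type"]))
```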
 
Joined
Feb 23, 2019
Messages
5,581 (3.00/day)
Location
Poland
Processor Ryzen 7 5800X3D
Motherboard Gigabyte X570 Aorus Elite
Cooling Thermalright Phantom Spirit 120 SE
Memory 2x16 GB Crucial Ballistix 3600 CL16 Rev E @ 3800 CL16
Video Card(s) RTX3080 Ti FE
Storage SX8200 Pro 1 TB, Plextor M6Pro 256 GB, WD Blue 2TB
Display(s) LG 34GN850P-B
Case SilverStone Primera PM01 RGB
Audio Device(s) SoundBlaster G6 | Fidelio X2 | Sennheiser 6XX
Power Supply SeaSonic Focus Plus Gold 750W
Mouse Endgame Gear XM1R
Keyboard Wooting Two HE
You could try and use something similar to mine (they're for single rank, not dual but still):

DRAM volt 1.4 V, SOC Voltage 1.1 V, VDDP 0.9 V, VDDG 1.05 V. GDM ON, PDM OFF
1603831534936.png
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
Use mine (they're for single rank, not dual but still):

DRAM volt 1.4 V, SOC Voltage 1.1 V, VDDP 0.9 V, VDDG 1.05 V. GDM ON, PDM OFF, ProcODT 40 Ohm.

Some caution with that one. Single rank procODT doesn't usually carry over to dual rank procODT; dual rank usually wants higher, in the 50s-60. But yeah, otherwise makes sense. OP says he's done lots of Karhu so the memory appears okay, might be insufficient VSoC or an I/O to chiplet IF issue.

@travva BZ talks about the difference in stability between the three key players (RAM, memory controller, IF) at the timestamp 16:40 on this link and why IC stability doesn't mean IF stability:
 
Joined
Feb 23, 2019
Messages
5,581 (3.00/day)
Location
Poland
Processor Ryzen 7 5800X3D
Motherboard Gigabyte X570 Aorus Elite
Cooling Thermalright Phantom Spirit 120 SE
Memory 2x16 GB Crucial Ballistix 3600 CL16 Rev E @ 3800 CL16
Video Card(s) RTX3080 Ti FE
Storage SX8200 Pro 1 TB, Plextor M6Pro 256 GB, WD Blue 2TB
Display(s) LG 34GN850P-B
Case SilverStone Primera PM01 RGB
Audio Device(s) SoundBlaster G6 | Fidelio X2 | Sennheiser 6XX
Power Supply SeaSonic Focus Plus Gold 750W
Mouse Endgame Gear XM1R
Keyboard Wooting Two HE
Some caution with that one. Single rank procODT doesn't usually carry over to dual rank procODT; dual rank usually wants higher, in the 50s-60.
I'm no expert by any means so all input is welcome.

OP:
Recommended read.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
You could do that, but I would be extremely surprised if a 3950X couldn't do 1800MHz IF, especially one manufactured recently. For this very reason, on Asus and Gigabyte boards, setting 3600MT/s XMP RAM automatically dictates 1.10V VSoC as an auto-rule. I don't know why it's not the case for MSI.

As for 3200 and tightening timings, ICs don't exactly work that way. Keeping tCL the same and scaling frequency UP is always easier than tightening tCL below what it's rated for. Hence why 3200 14-14-14 B-die kits exist.

Once you get into higher MT/s speeds, you'll find that DRAM calculator's recommendations on various voltages can be really......broad and optimistic. When in doubt, set VSoC at 1.1V and cut it out of the equation.

Do take a look at your Event Viewer and try to find details for the exact event at the time of the crash. WHEA can report a lot of hardware errors and CPU/Interconnect is just one of many possible CPU errors (e.g. unstable CPU overclock due to insufficient voltage, the dreaded cache errors, etc.).

Thank you! I didn't explicitly say so but that is the error in my event viewer, whea 18, but mine shows cache hierarchy error. Any insight into what that can mean vs bus interconnect?

whea.PNG
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
Thank you! I didn't explicitly say so but that is the error in my event viewer, whea 18, but mine shows cache hierarchy error. Any insight into what that can mean vs bus interconnect?

Yikes, uhh not what I was expecting. That's one of the "other" CPU errors I was alluding to earlier. CPU-side overclocking is not exactly my forte, and from everywhere that I've seen other Ryzen 3000 owners complaining about the cache hierarchy WHEA error they've all ended up RMAing their CPUs.

The caches are located on the chiplet dies with the cores. So aside from suggesting you look into the usual CPU-side overclocking culprits (e.g. not enough core voltage), not much I can tell you there if you're running stock settings on the CPU side.

Are all of the WHEA 18 instances that cache hierarchy error, or are they all over the place? Might need to get on a more up-to-date BIOS if one is available. Could be that MSI's firmware isn't providing enough idle voltage to sustain light load boosting on one core.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
I'm no expert by any means so all input is welcome.

OP:
Recommended read.

What specifically am I looking for here? I just see that thread is super long. Do you just mean read through it entirely or?
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
Yikes, uhh not what I was expecting. That's one of the "other" CPU errors I was alluding to earlier. CPU-side overclocking is not exactly my forte, and from everywhere that I've seen other Ryzen 3000 owners complaining about the cache hierarchy WHEA error they've all ended up RMAing their CPUs.

The caches are located on the chiplet dies with the cores. So aside from suggesting you look into the usual CPU-side overclocking culprits (e.g. not enough core voltage), not much I can tell you there if you're running stock settings on the CPU side.

Are all of the WHEA 18 instances that cache hierarchy error, or are they all over the place?

Yes indeed, they are all that specific error. The only thing that changes is the APIC ID. I had read a thread on reddit where someone said RMA'ing their cpu solved the problem so I do wonder if that's the case for me. I am pretty set on getting a 5950x so if I end up selling this one then I'll make sure to mention this. It does make sense with the erratic nature of this error that maybe it does happen at stock and I just haven't been patient enough to observe it.

Ryzen 3950x and a RTX3080FE, what power supply are you using?

i'm using an NZXT c1000 PSU.
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
Yes indeed, they are all that specific error. The only thing that changes is the APIC ID. I had read a thread on reddit where someone said RMA'ing their cpu solved the problem so I do wonder if that's the case for me. I am pretty set on getting a 5950x so if I end up selling this one then I'll make sure to mention this. It does make sense with the erratic nature of this error that maybe it does happen at stock and I just haven't been patient enough to observe it.

I'd take a longer look around the web to see if you can find any other solutions to the cache hierarchy issue but yeah, my few hours of looking on the subject have only turned up RMA as the only solution on every thread. One guy made it go away by setting his 3900X to a low fixed clock of 3.8GHz (which makes sense given the context of the error), but at that point with the performance loss you'd be better served with warranty service anyways.

The BIOS update might be worth a try first, but otherwise an RMA is probably in order. As to the RAM, if it's stable in extensive Karhu/HCI/TM5, sfc /scannow and chkdsk don't turn up any errors or disk corruption, and you're not accumulating any interconnect errors, then no need to worry.
 
Joined
Mar 23, 2016
Messages
4,839 (1.65/day)
Processor Ryzen 9 5900X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory VENGEANCE LPX 2 x 16GB DDR4-3600 C18 OCed 3800
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500 GB, 870 QVO 1 TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v22H2
Yikes, uhh not what I was expecting. That's one of the "other" CPU errors I was alluding to earlier. CPU-side overclocking is not exactly my forte, and from everywhere that I've seen other Ryzen 3000 owners complaining about the cache hierarchy WHEA error they've all ended up RMAing their CPUs.

The caches are located on the chiplet dies with the cores. So aside from suggesting you look into the usual CPU-side overclocking culprits (e.g. not enough core voltage), not much I can tell you there if you're running stock settings on the CPU side.

Are all of the WHEA 18 instances that cache hierarchy error, or are they all over the place? Might need to get on a more up-to-date BIOS if one is available. Could be that MSI's firmware isn't providing enough idle voltage to sustain light load boosting on one core.
There was a newer bios revision that corrected that issue. I had it happening to me then it went away after the bios update.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
There was a newer bios revision that corrected that issue. I had it happening to me then it went away after the bios update.

That's good to hear! Was it the exact same error I posted, including the cache hierarchy? As I said I did just update to a beta bios for the board that's mostly geared toward zen3 but it did mention some OC/memory stuff.
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,457 (2.33/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃B550-I Strix
Cooling PA120+T30┃AXP120x67
Memory 64GB 6000CL30┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Display(s) 43" QN90B / 32" M32Q / 27" S2721DGF
Case Caselabs S3┃Lone Industries L5
Power Supply Corsair HX1000┃HDPlex
There was a newer bios revision that corrected that issue. I had it happening to me then it went away after the bios update.

I just missed the fact that he said he's on the latest beta BIOS. Not looking good there. As to other owners with the cache hierarchy error, BIOS updates didn't do much.

Actually, @travva the APIC ID refers to the core/thread reporting the error. So for your stock 3950X, there are 32 possible numbers that can show up there. If the core numbers are all over the place, suggest you find a BIOS to update/downgrade to and see if things improve. Seems to point to a firmware problem, possibly with idle voltages. Make sure your chipset drivers are also up to date; they go hand in hand with AGESA.

Vermeer is just over a week out; vendors will all be pushing out AGESA 1.1.0.0 BIOS updates. Fingers crossed.

Another thought, if it's crashing at idle, try and find the option for Low Current Idle in your BIOS and change it to Standard Current Idle. Not sure what it's called on MSI.
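
To judge whether the APIC IDs are "all over the place" or keep pointing at one core, a quick tally is enough. A minimal sketch: the list of IDs is hypothetical and the 80% cutoff is an arbitrary assumption, not an AMD or Microsoft rule:

```python
from collections import Counter


def summarize_apic_ids(apic_ids, dominant_share=0.8):
    """Tally the APIC IDs reported across WHEA events.

    dominant_share is an arbitrary threshold: if one ID accounts for at
    least that fraction of events, call the errors concentrated.
    """
    if not apic_ids:
        return {"distinct": 0, "top_id": None, "pattern": "no events"}
    counts = Counter(apic_ids)
    top_id, top_count = counts.most_common(1)[0]
    concentrated = top_count / len(apic_ids) >= dominant_share
    return {
        "distinct": len(counts),
        "top_id": top_id,
        "pattern": "concentrated on one thread" if concentrated else "spread across threads",
    }


# Hypothetical IDs noted down from six separate Event 18 entries
print(summarize_apic_ids([0, 12, 12, 27, 5, 19]))
```

A spread result would fit the firmware or idle-voltage theory above, while a single repeating ID would point more at one weak core.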
 
Joined
Mar 23, 2016
Messages
4,839 (1.65/day)
Processor Ryzen 9 5900X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory VENGEANCE LPX 2 x 16GB DDR4-3600 C18 OCed 3800
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500 GB, 870 QVO 1 TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v22H2
I remember getting WHEA errors, though I’m not 100% sure they were cache hierarchy errors. I’m pretty certain they were. I remember it being noted as fixed in the change log for an AGESA update.

Another thought, if it's crashing at idle, try and find the option for Low Current Idle in your BIOS and change it to Standard Current Idle. Not sure what it's called on MSI.
It’s the same naming for both options. There’s a drop down menu in CPU features if you move down to the bottom of the Advance tab.
 
Joined
Aug 25, 2008
Messages
367 (0.06/day)
Location
Hampton, VA USA
System Name pubg machine
Processor Ryzen 3950x
Motherboard MSI Gaming Edge Wifi x570
Cooling Kraken Z73
Memory 32GB Dominator RGB
Video Card(s) RTX 3080FE
Storage Firecuda 520 2TB
Display(s) Odyssey G7
Case h710i
Audio Device(s) Arctis pro wireless
Power Supply NZXT c1000
Mouse Logitech g pro wireless
Keyboard Logitech g915 tkl
Software Windows 10 x64 Pro
Benchmark Scores pwning n00bs like you
I just missed the fact that he said he's on the latest beta BIOS. Not looking good there. As to other owners with the cache hierarchy error, BIOS updates didn't do much.

Actually, @travva the APIC ID refers to the core/thread reporting the error. So for your stock 3950X, there are 32 possible numbers that can show up there. If the core numbers are all over the place, suggest you find a BIOS to update/downgrade to and see if things improve. Seems to point to a firmware problem, possibly with idle voltages. Make sure your chipset drivers are also up to date; they go hand in hand with AGESA.

Vermeer is just over a week out; vendors will all be pushing out AGESA 1.1.0.0 BIOS updates. Fingers crossed.

Another thought, if it's crashing at idle, try and find the option for Low Current Idle in your BIOS and change it to Standard Current Idle.

They are all over the place. At one point I thought the issue was because I was using the Ryzen Balanced plan, but that turned out not to be the case. And yes, this latest beta is AGESA 1.1.0.0, so maybe it will be the fix. Otherwise I will do what you said and try to downgrade the BIOS to see if that has any effect. When I got this computer, I updated the BIOS right away, which in hindsight was maybe a poor decision.

I remember getting WHEA errors, I’m not 100% if they were cache hierarchy. I’m pretty certain it was that though. I remember it being noted as fixed in a change log for a AGESA update.


It’s the same naming for both options. There’s a drop down menu in CPU features if you move down to the bottom of the Advance tab.

Awesome! That is definitely not something I've explored yet so will check that out too. Thank you guys so much! I was complaining to my buddy earlier that I posted here for the first time in a decade after Reddit wasn't much help, and how I got no reply. I was just impatient as it turns out! :D

I remember getting WHEA errors, I’m not 100% if they were cache hierarchy. I’m pretty certain it was that though. I remember it being noted as fixed in a change log for a AGESA update.


It’s the same naming for both options. There’s a drop down menu in CPU features if you move down to the bottom of the Advance tab.

@biffzinker when you had the WHEA errors, it wasn't the same board I have, right? If not, was it MSI at least?

I just missed the fact that he said he's on the latest beta BIOS. Not looking good there. As to other owners with the cache hierarchy error, BIOS updates didn't do much.

Actually, @travva the APIC ID refers to the core/thread reporting the error. So for your stock 3950X, there are 32 possible numbers that can show up there. If the core numbers are all over the place, suggest you find a BIOS to update/downgrade to and see if things improve. Seems to point to a firmware problem, possibly with idle voltages. Make sure your chipset drivers are also up to date; they go hand in hand with AGESA.

Vermeer is just over a week out; vendors will all be pushing out AGESA 1.1.0.0 BIOS updates. Fingers crossed.

Another thought, if it's crashing at idle, try and find the option for Low Current Idle in your BIOS and change it to Standard Current Idle. Not sure what it's called on MSI.

@tabascosauz I just found the following which I believe is from an AMD employee on their forums:

amdmatt
Employee
Oct 26, 2020 8:00 AM
Have you tried a different, better quality PSU?

Look for an option in the BIOS called Power Supply Idle Control, try setting this to Typical.

Pretty sure this guy's talking about the same thing you are! Not that I needed more convincing, but that may be the culprit.
 