Discussion in 'Linux / BSD / Mac OS X' started by biffzinker, Aug 25, 2017.
I have an affected CPU, but simply raising the SOC voltage to 1.2v fixes the issue. I think it was an early binning issue honestly. It's not worth sending it in for me.
I have a Gentoo environment to test this in, so I can say it with some confidence.
Yeah, you have to disable the micro-op cache... it's only on the first batch of them.
@R-T-B I didn't know that would fix that... I feel like 1.2v SOC fixes a bunch of crap. But I did read that you should disable the micro-op cache.
I've read that, but disabling the micro-op cache on mine seemed like a workaround, so I tried other things first. It seems the micro-op cache is considered part of the SOC and is voltage starved, at least on my chip, as raising SOC voltage eliminated the symptoms.
I'm unsure if it will work long term though... really frustrating, and it's making me consider an RMA, but I don't want my system down, lol.
Yeah, I hear you man... that micro-op cache was one of the core things Keller's team introduced to increase the IPC :/...
The price of early adoption
How do you guys test to see if your CPU is affected by the bug? Is it possible in a non-*nix environment?
Not really possible - the erratum is specific to a certain type of workload and kernel over an extended period.
Ah welcome to purchasing the low bin of a new design
The info to determine if your Ryzen is affected is above. Yes, mine is. It is exceedingly unlikely you will see this issue outside a *nix OS however.
No, it's very testable in source distros like Gentoo. I segfaulted on pretty much every world rebuild.
It's a major PITA to set up a test environment though...
If you click on the source link I included at the bottom of the first post and scroll down the news article, there is a linked script you can run in a Linux terminal to test your Ryzen CPU.
I'll make it easy for you though, here's the link.
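For anyone who just wants the gist of how a test like that can work: this is NOT the linked script, only a minimal Python sketch of the general idea described in this thread, which is to run a heavy workload many times in parallel and count how many runs die with a segfault. The function names (looks_like_segfault, run_once, stress) and the placeholder workload are made up for illustration; you would swap in a real compile job, and a clean result from a short run proves nothing since the problem reportedly shows up over an extended period.

```python
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def looks_like_segfault(returncode, stderr_text):
    """Return True if a finished process appears to have segfaulted."""
    # On POSIX, subprocess reports death-by-signal as a negative return code.
    if returncode == -signal.SIGSEGV:
        return True
    # Some tools catch the crash and exit normally with a message instead.
    return "segmentation fault" in stderr_text.lower()


def run_once(cmd):
    """Run cmd once and report whether it looked like a segfault."""
    proc = subprocess.run(cmd, capture_output=True, text=True, errors="replace")
    return looks_like_segfault(proc.returncode, proc.stderr)


def stress(cmd, workers=4, rounds=10):
    """Run cmd workers*rounds times in parallel; return the segfault count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(run_once, [cmd] * (workers * rounds))
        return sum(results)


if __name__ == "__main__":
    # Hypothetical workload: pass a real build command on the command line.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print("segfaults seen:", stress(command))
```

Pass your own command after the script name; with no arguments it just runs a do-nothing Python process as a placeholder. Threads are enough here because the actual work happens in the child processes.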
RMA has resulted in segfault-free replacements, as reported on the AMD Community thread. Why wait months or years to find out, when your CPU failure could be blamed on other things?
Also, how safe is 1.2v on the SOC voltage and how far above spec is it?
It's not far above spec at all, about 0.15v above spec. It also adds almost no heat.
As for the "why wait?" That answer is simple. Downtime. I don't want my rig down.
I may see if they are willing to take a credit card on hold for a "ship after receipt" exchange.
I must withdraw my former claims that the voltage fixed my issue.
Today, I did a world rebuild and got numerous segfaults. It appears it was just hiding, if improved at all. The issue is still present.
I will be buying a week 25 or newer processor in town tomorrow to minimize downtime, and RMAing this one. The RMA'd replacement will likely come up for sale here soon. It should be a NIB 1800X, guys, and I plan to slice at least $100 off retail! Watch for it!
I can't find one in town. It seems week 25+ CPUs haven't quite made it to market yet. I'll be doing an RMA I guess unless GIGABYTE miraculously comes out with a microcode update that fixes it in a few days.
I have my RMA'd CPU. Mine is a week 33. And it clocks like a rocket.
Expect a full report on the process and my findings soon.
because it keeps crashing? lolol /tease
I know you're kidding... but in a source-based distro like Gentoo, frankly, yes.
Oh I understand! I'm a BSD server admin myself. At least all of my web stuff is. I consider it a baseline test for my techs to install Gentoo with a GUI. Maybe I'll make that a stipulation for their next raise. (and get Java working)