AMD Confirms Ryzen Marginality Performance Issue Under Linux, TR and EPYC Clear

Let me reword that...their fanboys and paid shills/sponsors pretend it doesn't exist.

ah sorry maybe we got off on the wrong foot. Now I understand :toast:

I thought you were slipping, stud. With a reg date of '07, I just couldn't grasp that you may have lost your touch for facts like the new users have.
 

Yeah, sorry, I get testy, b/c I'm old enough to remember all of Intel's lies and deceptions. My favorite is probably when they owned that benchmark tool back in the Athlon 64 days... my brain is shot. Was it PCMark? They hid it, but very poorly. They got the incredibly crooked results published in college textbooks. It took me 5 mins to figure it out. The address of the benchmark company was the same as Intel's... busted.
 
You're completely ignoring Intel's history of being complete c****. They're the bug kings and pretend it doesn't exist. Publications begging for their money will go ham on AMD for lesser bugs or ones that do not matter. Remember the infamous Phenom TLB bug? It literally affected zero people. It was remotely possible in a server environment, so AMD fixed it with an update and RMA'd the CPUs. It was so blown out of proportion, b/c it shouldn't even have been a thing in consumer land. Meanwhile, Intel has SATA ports fail on every one of those chipsets made and Atom CPUs degrading, and they get a couple of days of mild stories about it and no one cares.

There's a double standard here alright and it's not from me or to AMD's benefit.

So you admit you hammer Intel more, because Intel is evil...

Let me reword that...their fanboys and paid shills/beneficiaries pretend it doesn't exist.

Edited b/c I can't English properly, murrica.

... and then proceed to call others fanboys and paid shills.
 
I would say that Intel deserves the hate that they receive, just like how Microsoft, Dell, HP, Symantec, and a whole host of other companies deserve the hate that they get. They are world class companies, they should be handling things far better than they do. Instead they try to pass the buck/blame, pretend the issue doesn't exist, or other forms of trying to sweep the dirt under the rug.
 
I don't care who the company is, you bite the damn bullet and admit you did wrong. I am far more likely to trust a company that admits that they screwed up than a company that was found out years later.

There's a reason why I don't shop at Home Depot: they tried to cover up the credit card hack. Only after it was exposed did they come out and say "We're sorry". Well, "sorry" ain't good enough!
 
I don't care who the company is, you bite the damn bullet and admit you did wrong. I am far more likely to trust a company that admits that they screwed up than a company that was found out years later.

There's a reason why I don't shop at Home Depot: they tried to cover up the credit card hack. Only after it was exposed did they come out and say "We're sorry". Well, "sorry" ain't good enough!

This. I don't know why people think certain companies "care" about their customers.
It's a business. They are in it for the money regardless of who it is or what they say.
 
I understand, in the end it's all about business. What these companies need to understand is that when bad things happen, the way they handle it can and will make people think differently. Fool me once, shame on me. Fool me twice... well, go f*** yourself.
 
Windows is affected. This bug has already been reproduced under WSL.
Wait... it's still different. My basic knowledge of Linux tells me that WSL is just Linux running under Windows (10, to be specific; please correct me if I'm wrong), so it's still inconclusive to say that Windows is affected.
 
I understand, in the end it's all about business. What these companies need to understand is that when bad things happen, the way they handle it can and will make people think differently. Fool me once, shame on me. Fool me twice... well, go f*** yourself.
And when AMD sends their PR to say the problem is no more, but doesn't provide a technical explanation of the issue (or even imply that they know what it is), what do you say about that? Is that the right way to handle this?
 
And when AMD sends their PR to say the problem is no more, but doesn't provide a technical explanation of the issue (or even imply that they know what it is), what do you say about that? Is that the right way to handle this?
I tend to not trust what the public relations departments of companies say, after all, PR departments are known for spinning things in their favor. In the case of AMD, much like Intel, I would like to see a somewhat technical write-up on what the issue was and how it was corrected.

As for why this issue occurred on Linux and not on Windows, it could be that Linux, which tends to be more on the enthusiast front, was using some kind of instruction set in a weird way. Windows tends to be more conservative about using newer processor instruction sets, since Microsoft wants to make sure that Windows runs on just about anything, including some old-ass Pentium 4 machine.
 
It appears more to be an ASLR bug from some reading. Linux uses this, Windows not so much.
 
I tend to not trust what the public relations departments of companies say, after all, PR departments are known for spinning things in their favor. In the case of AMD, much like Intel, I would like to see a somewhat technical write-up on what the issue was and how it was corrected.

Agreed, that's what I'm waiting for as well. With an explanation about what "performance marginality" is to go with it :D
 
Wait... it's still different. My basic knowledge of Linux tells me that WSL is just Linux running under Windows (10, to be specific; please correct me if I'm wrong), so it's still inconclusive to say that Windows is affected.
Linux is a kernel. WSL implements a Linux-compatible userland which makes Linux applications able to run on the Windows kernel. There is no Linux code in Windows.
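
For anyone trying to sort their own reproduction reports into "native Linux" and "WSL", here is a minimal sketch of a commonly used heuristic. It assumes that the kernel release string reported inside WSL contains the word "Microsoft" (in some capitalization), which is a convention rather than a guarantee. The function name `looks_like_wsl` is made up for this example.

```python
import platform


def looks_like_wsl(release: str) -> bool:
    """Heuristic only: treat a kernel release string that mentions
    'microsoft' as WSL. This is an assumption about how WSL reports
    itself, not an official detection API."""
    return "microsoft" in release.lower()


if __name__ == "__main__":
    # platform.uname().release is the kernel release string, e.g. what
    # `uname -r` prints on a Linux-like system.
    release = platform.uname().release
    kind = "WSL (Windows kernel)" if looks_like_wsl(release) else "not WSL, as far as this check can tell"
    print(f"{release}: {kind}")
```

A hit only tells you the test ran on the Windows kernel through WSL. It says nothing about whether the same fault would show up in native Windows applications.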

It appears more to be an ASLR bug from some reading. Linux uses this, Windows not so much.
ASLR is a software implementation in the kernel. The problems have been reproduced with ASLR disabled, but the amount of occurrences might be slightly reduced, since ASLR increases the stress on the prefetcher.

The errors in the uOP cache are clearly corruption happening inside the CPU core. Micro-operations are generated in the front-end/prefetcher, and since the hardware detects these there, it's clearly a hardware bug.
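
On the ASLR point: if you want to note whether ASLR was on or off for a given test run, a minimal sketch like the one below reads the Linux `kernel.randomize_va_space` sysctl through procfs. The helper names (`describe_aslr`, `read_aslr_setting`) are made up for this example, and the script only reports the setting; it does not change it or test for the bug.

```python
from pathlib import Path

# Meanings of kernel.randomize_va_space per the Linux sysctl documentation.
ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base and VDSO randomized)",
    2: "full (heap/brk randomized as well)",
}


def describe_aslr(raw: str) -> str:
    """Turn the raw file contents (e.g. '2\n') into a readable label."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


def read_aslr_setting(path: str = "/proc/sys/kernel/randomize_va_space") -> str:
    """Report the current setting, or say so if the file is not there."""
    p = Path(path)
    if not p.exists():
        return "not available (not Linux, or procfs not mounted)"
    return describe_aslr(p.read_text())


if __name__ == "__main__":
    print("ASLR:", read_aslr_setting())
```

Logging this next to each test result makes it easier to compare failure counts with and without ASLR, which is the comparison being argued about here.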
 
Linux is a kernel. WSL implements a Linux-compatible userland which makes Linux applications able to run on the Windows kernel. There is no Linux code in Windows.


ASLR is a software implementation in the kernel. The problems have been reproduced with ASLR disabled, but the amount of occurrences might be slightly reduced, since ASLR increases the stress on the prefetcher.

The errors in the uOP cache are clearly corruption happening inside the CPU core. Micro-operations are generated in the front-end/prefetcher, and since the hardware detects these there, it's clearly a hardware bug.

This bug seems to be a silicon quality issue honestly.

I was terribly plagued by this and found that a final 1.2v SOC completely eliminated the bug in all forms and test suites for me.

Weird.
 
If it's a silicon quality issue then it kind of does make sense that both Threadripper and Epyc wouldn't have this issue, they both are made out of the higher quality silicon while us mere peasants buying the Ryzen CPUs would be stuck with the... less than desirable stuff.
 
This bug seems to be a silicon quality issue honestly.

I was terribly plagued by this and found that a final 1.2v SOC completely eliminated the bug in all forms and test suites for me.

Weird.
You're manipulating frequency or just the voltage?

And define "completely eliminated". This is not how electronics work. This problem has already manifested, so it is there - that's the only sure thing.
So you can't say that a problem has been eliminated just with stability tests. Lowering voltage might only make this less probable.

Now we need an explanation and a proof that it won't happen...
 
And define "completely eliminated". This is not how electronics work. This problem has already manifested, so it is there - that's the only sure thing.
So you can't say that a problem has been eliminated just with stability tests. Lowering voltage might only make this less probable.

Just voltage. And yes, I can. I think the SOC voltage out of the gate is set too low for the quality of silicon they have (note that 1.2v is significantly higher than stock SOC voltage). This isn't really something new and novel. Yes, the problem is still there, but the problem is effectively eliminated for my practical purposes.
 
This bug seems to be a silicon quality issue honestly.

I was terribly plagued by this and found that a final 1.2v SOC completely eliminated the bug in all forms and test suites for me.

Weird.
As many have reported, the first thing AMD's support tells them is to increase the voltage. Many of them have increased it way beyond what you have, and the problem is still not completely gone. You will reach dangerous voltages before you can get high enough. Increasing the voltage also significantly impacts the lifespan of the chip, which means that any stability issues (including this one) will be more likely to occur over time.

All chips seem to have the potential, but silicon quality seems to play a factor in how likely it is to occur. As you know, bumping the voltage does lower the rise/fall time of the transistors, but it's still not enough to guarantee synchronicity, and would not eliminate all disturbances towards the end of a cycle. A proper fix would require a realignment of the circuits in this region of the CPU.
 
As many have reported, the first thing AMD's support tells them is to increase the voltage. Many of them have increased it way beyond what you have, and the problem is still not completely gone. You will reach dangerous voltages before you can get high enough. Increasing the voltage also significantly impacts the lifespan of the chip, which means that any stability issues (including this one) will be more likely to occur over time.

All chips seem to have the potential, but silicon quality seems to play a factor in how likely it is to occur. As you know, bumping the voltage does lower the rise/fall time of the transistors, but it's still not enough to guarantee synchronicity, and would not eliminate all disturbances towards the end of a cycle. A proper fix would require a realignment of the circuits in this region of the CPU.

There's a difference between the SOC (basically uncore) voltage and core voltage though. Are they telling them to increase SOC voltage at all? I'm unsure if I discovered something new or not. Everything I read says that AMD support tells them to lower SOC voltage to stock; I'm doing the opposite.
 
Linux is a kernel. WSL implements a Linux-compatible userland which makes Linux applications able to run on the Windows kernel. There is no Linux code in Windows.


ASLR is a software implementation in the kernel. The problems have been reproduced with ASLR disabled, but the amount of occurrences might be slightly reduced, since ASLR increases the stress on the prefetcher.

The errors in the uOP cache are clearly corruption happening inside the CPU core. Micro-operations are generated in the front-end/prefetcher, and since the hardware detects these there, it's clearly a hardware bug.
thanks for clarifying.
 