Thursday, March 16th 2017

AMD Ryzen Machine Crashes to a Sequence of FMA3 Instructions

An AMD Ryzen 7 1800X-powered machine was found to crash when executing a very specific sequence of FMA3 instructions in Flops version 2, a simple open-source CPU benchmark by Alexander "Mystical" Yee. It is important to note that this little-known benchmark has been tailored by its developer to be highly specific to the CPU micro-architecture, with separate binaries for each major x64 architecture (e.g., Bulldozer, Sandy Bridge, Haswell, Skylake), and that the GitHub repository does not have a "Zen"-specific binary.

Members of the HWBot forums found that Ryzen-powered machines crash when running the Haswell-specific binary, at the "Single-Precision - 128-bit FMA3 - Fused Multiply Add" test. The Haswell-specific binary (along with, we imagine, the Skylake one) adds support for the FMA3 instruction set, which Ryzen also supports, and this lends some importance to the discovery of the bug. It also matters because a simple application running with user privileges (i.e., without special super-user/admin privileges) is able to crash the machine. Such code could even be executed inside virtual machines, which makes this a security issue with implications for AMD's upcoming "Naples" enterprise processor launch.
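As a rough illustration of what "supports FMA3" means in practice, the sketch below reads CPU feature flags the way Linux reports them in /proc/cpuinfo. It assumes an x86 Linux system, where FMA3 shows up as the `fma` flag and the older AMD-only FMA4 as a separate `fma4` flag. The helper names are made up for this example, and this is not how Flops itself picks a binary.

```python
from pathlib import Path
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or non-Linux input)


def supports_fma3(cpuinfo_text: str) -> bool:
    # Assumption: Linux lists FMA3 as "fma"; "fma4" is a different flag.
    return "fma" in cpu_flags(cpuinfo_text)


def supports_fma4(cpuinfo_text: str) -> bool:
    return "fma4" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    print("FMA3:", supports_fma3(text), "FMA4:", supports_fma4(text))
```

A CPU that reports `fma` should be able to run FMA3 code without incident, which is why a crash on such a chip points to a hardware or microcode problem rather than a mismatched binary.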

62 Comments on AMD Ryzen Machine Crashes to a Sequence of FMA3 Instructions

#1
RejZoR
If these benchmark things are tailored to such a specific level that they differentiate even SERIES within the SAME VENDOR, why the hell is this news?
#2
Nkd
RejZoR said:
If these benchmark things are tailored to such a specific level that they differentiate even SERIES within the SAME VENDOR, why the hell is this news?
I swear TPU posts anything and everything. Saying it doesn't have Zen-specific binaries but hey, there might be a bug. Really? Sometimes I have a hard time even believing I am reading this here on this site. Post anything and everything. Systems are rock stable with every software, and then making this news is just trolling. lol
#3
the54thvoid
RejZoR said:
If these benchmark things are tailored to such a specific level that they differentiate even SERIES within the SAME VENDOR, why the hell is this news?
I found that when I subjected my Intel Sandy-E system to a water dousing test from an unplugged water block, it also failed.

Moral of the story: the specific open-source bench, as alluded to in the OP, hasn't got the Zen instruction set yet....
#4
btarunr
Editor & Senior Moderator
RejZoR said:
If these benchmark things are tailored to such a specific level that they differentiate even SERIES within the SAME VENDOR, why the hell is this news?
The Haswell-specific bench runs an FMA3 industry-standard instruction, which takes down the FMA3-supporting Ryzen (and not FMA3-supporting Skylake).

This is news because an unprivileged application can take down a machine (and is hence a security hole). Would a company like Barclays put its client live database on a "Naples" machine now?
#5
RejZoR
btarunr said:
The Haswell-specific bench runs an FMA3 industry-standard instruction, which takes down the FMA3-supporting Ryzen (and not FMA3-supporting Skylake).

This is news because an unprivileged application can take down a machine (and is hence a security hole).
I'm pretty sure you can crash ANY system by feeding it with instructions that are not meant for it. And we know how "standards" work with instructions. If they really were 100% standard, then they'd exhibit IDENTICAL performance gains on ALL CPUs, which we know for a fact is not true...
#6
Mussels
Moderprator
People are missing the point: A program can be coded to crash zen. Someone could throw that code into a website ad for example, and *bam* AMD stocks plummet.

Hopefully a BIOS or Windows security update can fix this one before it goes bad.
#7
silentbogo
Tesla Model S won't run on Diesel!
btarunr said:
The Haswell-specific bench runs an FMA3 industry-standard instruction, which takes down the FMA3-supporting Ryzen (and not FMA3-supporting Skylake).

This is news because an unprivileged application can take down a machine (and is hence a security hole). Would a company like Barclays put its client live database on a "Naples" machine now?
And I quote:
Wikipedia
AMD explicitly revealed that Zen, its 3rd-generation x86-64 architecture in its first iteration (znver1 – Zen, version 1); would drop support for FMA4 in a patch to the GNU Binutils package.[13] There has been initial confusion regarding whether FMA4 was implemented or not due to errata in the initial patch that has since then been rectified.[14]
#8
the54thvoid
btarunr said:
The Haswell-specific bench runs an FMA3 industry-standard instruction, which takes down the FMA3-supporting Ryzen (and not FMA3-supporting Skylake).

This is news because an unprivileged application can take down a machine (and is hence a security hole). Would a company like Barclays put its client live database on a "Naples" machine now?
Not a fair comparison at all. By the news post itself, "a little known open source program" designed by one guy wouldn't be used by Barclays.

@Mussels, everything can be crashed. Especially with such an esoteric and unique program with a specific instruction set.

I'm not saying it's not an issue, but it's very specific and very minor. Every major operating system has vulnerabilities exposed almost weekly.
#9
Taloken
From the HWBot thread, a fix is coming with a new microcode. Also, disabling SMT prevents the crash.
#10
Super XP
This isn't news, it's nonsense.
Now we have Intel fan boys on wccftech spreading rumors of Ryzen being a design flaw due to this article lol, ridiculous.
#11
btarunr
Editor & Senior Moderator
the54thvoid said:
Not a fair comparison at all. By the news post itself, "a little known open source program" designed by one guy wouldn't be used by Barclays.
No, my point is the disgruntled IT guy Barclays just fired could crash a "Naples" powered server with just this "little known program."
#12
Jack1n
It's funny how people seem to be missing the point in this article, anyway, I hope AMD is able to fix this.
#13
W1zzard
RejZoR said:
I'm pretty sure you can crash ANY system by feeding it with instructions that are not meant for it.
No you can't. Your application will crash and that's it.
#14
behrouz
The Stilt

The issue with Flops was found and fixed in the beginning of February.
The current µcode version dates to 01/27/2017, so the fix is obviously not included yet (due to the time required for validation).
Flops is only affected when the SMT is enabled, so disabling the SMT can be used as a temporary work-around (until the actual fix arrives).
Source
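For anyone wanting to confirm whether the SMT workaround mentioned above is actually in effect, here is a minimal sketch for Linux. It assumes a kernel that exposes /sys/devices/system/cpu/smt/active; where that file is missing the function returns None instead of guessing. On Windows the setting would have to be checked in the BIOS/UEFI or by other means. The function names are illustrative only.

```python
from pathlib import Path
from typing import Optional

# Assumption: this sysfs file exists on reasonably recent Linux kernels.
SMT_ACTIVE_PATH = Path("/sys/devices/system/cpu/smt/active")


def parse_smt_active(text: str) -> bool:
    """The file holds '1' when sibling threads are in use and '0' otherwise."""
    return text.strip() == "1"


def smt_is_active(path: Path = SMT_ACTIVE_PATH) -> Optional[bool]:
    """Return True/False for the SMT state, or None if it cannot be read."""
    try:
        return parse_smt_active(path.read_text())
    except OSError:
        return None  # file missing or unreadable: state unknown


if __name__ == "__main__":
    state = smt_is_active()
    if state is None:
        print("SMT state unknown on this system")
    else:
        print("SMT active:", state)
```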
#15
RejZoR
btarunr said:
No, my point is the disgruntled IT guy Barclays just fired could crash a "Naples" powered server with just this "little known program."
It's a bit disingenuous to create drama over a bug (I'm not going to deny that!) while ignoring the fact that a fix exists (as posted by @behrouz above) but hasn't been pushed out yet because of required testing procedures. It's why I questioned the newsworthiness of this bug...
#16
RejZoR
W1zzard said:
No you can't. Your application will crash and that's it.
There is always a way, they just haven't found it yet...
#18
kn00tcn
Mussels said:
People are missing the point: A program can be coded to crash zen. Someone could throw that code into a website ad for example, and *bam* AMD stocks plummet.

Hopefully a BIOS or Windows security update can fix this one before it goes bad.
what? web code doesn't run native like that...
#20
Vinska
RejZoR said:
I'm pretty sure you can crash ANY system by feeding it with instructions that are not meant for it.
Nope, feeding a CPU with instructions not meant for it simply makes the CPU issue an illegal instruction fault, which normally results in the program terminating (i.e. "crashing"). If that happens in kernelspace code, that usually means the whole [virtual] machine "crashes", but in userspace code, that should normally only kill the offending process. Meanwhile, if a userspace program can bring the whole system down, that is quite abnormal.

FWIW, I doubt this is something that can't be simply fixed with a microcode update.
After all, every CPU ends up with hundreds of errata, some a lot scarier than simple DoS such as this one.
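The normal case described above, where an illegal-instruction fault kills only the offending userspace process, can be mimicked with a small sketch. It assumes a POSIX system. Rather than executing a real unsupported opcode, the child process sends itself SIGILL, the signal the kernel delivers for an illegal instruction. The function name is hypothetical.

```python
import signal
import subprocess
import sys


def run_child_that_hits_sigill() -> int:
    """Start a child interpreter that raises SIGILL on itself; return its status."""
    # Stand-in for a real illegal opcode: the child signals itself with SIGILL.
    child_code = "import os, signal; os.kill(os.getpid(), signal.SIGILL)"
    result = subprocess.run([sys.executable, "-c", child_code])
    # On POSIX, subprocess reports death by signal N as return code -N.
    return result.returncode


if __name__ == "__main__":
    status = run_child_that_hits_sigill()
    print("child exit status:", status)
    print("parent process is still running")
```

Only the child dies here; the parent and the rest of the system carry on, which is the expected outcome and the opposite of what the Flops run reportedly does on Ryzen.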
#21
R0H1T
btarunr said:
No, my point is the disgruntled IT guy Barclays just fired could crash a "Naples" powered server with just this "little known program."
Someone running Naples will likely have their own application coded to run on the Ryzen server; they don't just copy/paste the aforementioned code into their application and crash (test) a server. Then there are also app- and OS-specific safeguards that usually prevent a system crash, like the sandboxing in Chrome or any number of OS safeguards under Windows. Mind you, Linux is generally more secure (IMO) and most servers run on Linux, not to mention that running code locally with elevated privileges isn't as simple on Linux. Someone also said that web code doesn't crash an OS just like that, so even for web requests you'd have to make some major goof-up to let this crash a system.
#22
darkangel0504
But for some reason, it only affects this particular benchmark. Other programs (like prime95 and y-cruncher) aren't affected despite using FMAs.
#23
_JP_
btarunr said:
No, my point is the disgruntled IT guy Barclays just fired could crash a "Naples" powered server with just this "little known program."
Well, maybe that IT guy wasn't very good at it anyway, because he didn't blacklist executables that shouldn't run on a production server in the first place...and this is a finance-related server you're making an example about. :)
This news isn't a big deal...
#25
EarthDog
btarunr said:
The Haswell-specific bench runs an FMA3 industry-standard instruction, which takes down the FMA3-supporting Ryzen (and not FMA3-supporting Skylake).

This is news because an unprivileged application can take down a machine (and is hence a security hole). Would a company like Barclays put its client live database on a "Naples" machine now?
What would be more interesting is to hear about it crashing on consumer stress tests which use that instruction set....