Tuesday, January 2nd 2018

Intel Secretly Firefighting a Major CPU Bug Affecting Datacenters?

There are ominous signs that Intel may be secretly fixing a major security vulnerability affecting its processors, one that threatens to severely damage its brand equity among datacenter and cloud-computing customers. The vulnerability lets users of a virtual machine (VM) access the data of another VM on the same physical machine (a memory leak). Amazon, Google, and Microsoft, the big three cloud providers, are all affected by this vulnerability, and Intel is reportedly in embargoed communications with engineers from the three to release a software patch that fixes the bug. Trouble is, the patch inflicts an unavoidable performance penalty of 30-35%, which affects the economics of choosing Intel processors over AMD ones.

Signs of Intel secretly fixing the bug surfaced with rapid changes to the Linux kernel, made without the usual public visibility of the documentation. The bulk of the changes involve "kernel page table isolation," a feature that prevents VMs from reading each other's data, but at a performance cost. Developers note that these changes are being introduced "very fast" by Linux kernel update standards, and are even being backported to older kernel versions (something that's extremely rare). Since this is a hardware vulnerability, Linux isn't the only vulnerable software platform. Microsoft has been working on a Windows kernel patch for this issue since November 2017. AMD x86 processors (such as Opteron, Ryzen, and EPYC) are immune to this vulnerability.
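
Page table isolation adds work each time a program crosses from user mode into the kernel, so workloads that make many system calls are likely to feel the patch the most. The snippet below is a rough sketch, not a benchmark from the source: it compares the average cost of a call that enters the kernel (`os.getppid()`) with a call that stays in user space. The function names `time_loop` and `measure` are made up for illustration. Running it on the same machine before and after a kernel update would give a crude idea of how much the kernel-entry cost has changed.

```python
import os
import time


def time_loop(fn, iterations):
    """Return the average seconds per call of fn over `iterations` calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) / iterations


def syscall_call():
    # os.getppid() wraps a real system call, so every call
    # crosses the user/kernel boundary once.
    os.getppid()


def userspace_call():
    # Pure user-space work: no kernel entry is involved.
    abs(-1)


def measure(iterations=200_000):
    """Hypothetical helper: average cost in nanoseconds of each kind of call."""
    return {
        "syscall_ns": time_loop(syscall_call, iterations) * 1e9,
        "userspace_ns": time_loop(userspace_call, iterations) * 1e9,
    }


if __name__ == "__main__":
    result = measure()
    print(f"syscall:    {result['syscall_ns']:.0f} ns per call")
    print(f"user space: {result['userspace_ns']:.0f} ns per call")
```

The absolute numbers depend heavily on the CPU, the OS, and the Python interpreter, so only the before/after difference on one machine is meaningful.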
Source: Reddit

53 Comments on Intel Secretly Firefighting a Major CPU Bug Affecting Datacenters?

#26
biffzinker
btarunr said:
Literally every Intel processor supporting virtualization (VT-x). That's pretty much every Core, Pentium, Celeron, and Xeon processor launched since 2007.
Atom too I'd imagine?
#27
RejZoR
One of the two major CPU makers is enjoying this news XD Although I don't think this will affect Intel long term. They are just so sleazy they always get their way, no matter the size of a cockup. Just like NVIDIA. But if AMD had such a cockup, it would basically be their end, because everyone would be endlessly pissy about it, even on platforms not even affected by it (sort of like how drivers of hatchbacks and minivans refused to buy Pirelli tires because they had some issues in F1 one season)...
#28
eidairaman1
The Exiled Airman
biffzinker said:
Atom too I'd imagine?
I guess ATOMS could actually get slower considering they were slower than the fastest P3/K6-2/3 and the slowest P4/Duron...
#29
btarunr
Editor & Senior Moderator
biffzinker said:
Atom too I'd imagine?
If it supports VT-x, it's vulnerable.
#30
theGryphon
I just looked at Phoronix benchmarks for pre- and post- the fix, the performance hit does not seem to be across the board, but applies to only some types of loads.

I'm starting to think it will seriously impact those large corporations with datacenters/farms but not nearly as much the regular users... Still, Intel is gonna bleed some significant cash.
#31
eidairaman1
The Exiled Airman
theGryphon said:
I just looked at Phoronix benchmarks for pre- and post- the fix, the performance hit does not seem to be across the board, but applies to only some types of loads.

I'm starting to think it will seriously impact those large corporations with datacenters/farms but not nearly as much the regular users... Still, Intel is gonna bleed some significant cash.
Since enterprises are their biggest payers compared to us "peons", yes it certainly will.
#32
R0H1T
theGryphon said:
I just looked at Phoronix benchmarks for pre- and post- the fix, the performance hit does not seem to be across the board, but applies to only some types of loads.

I'm starting to think it will seriously impact those large corporations with datacenters/farms but not nearly as much the regular users... Still, Intel is gonna bleed some significant cash.
The I/O numbers are heavily affected i.e. big for servers & every enterprise customer. What's not shown though is the extra CPU cycles, so the penalty could be higher in terms of perf/W or even absolute performance.
#33
thesmokingman
Reading what's been written regarding speculation and Intel's execution cheating, for lack of a better word, by not waiting for security checks before executing code, i.e. accessing kernel memory from user mode, this is seriously not kosher. I also read that this "feature" of their highly speculative execution is in their white papers. No wonder their chips are, er, were so fast?
#34
R0H1T
thesmokingman said:
Reading what's been written regarding speculation and Intel's execution cheating, for lack of a better word, by not waiting for security checks before executing code, i.e. accessing kernel memory from user mode, this is seriously not kosher. I also read that this "feature" of their highly speculative execution is in their white papers. No wonder their chips are, er, were so fast?
What's interesting is how this potentially disastrous flaw could affect old OSes. ATMs, anyone? Unpatched systems like Win7 or older (govt) infrastructure & the much bigger financial sector could be at serious risk!
#35
_Flare
https://translate.google.com/translate?sl=de&tl=en&js=y&prev=_t&hl=de&ie=UTF-8&u=https%3A%2F%2Fwww.computerbase.de%2F2018-01%2Fintel-cpu-pti-sicherheitsluecke%2F&edit-text=&act=url

Cite:
" ... in all Intel CPUs. Apparently, by exploiting a hardware bug, a process can make an Intel CPU speculatively load memory areas and then allow access to them without further checks, even though the process does not have the necessary rights. This allows an unprivileged process to access the memory of the kernel, which can contain sensitive data. This is especially precarious for cloud providers such as Amazon and Google, who want to prevent break-outs from virtual machines. In addition, the kernel's Address Space Layout Randomization (ASLR) security technique, which is used as defense in depth, could be compromised."
#36
Jism
theGryphon said:
Beginning of the end for Intel?
Hell no. Intel makes billions and owns at least 80% of every x86/x64 market. But it is going to cost them a lot of money, considering how many Intel CPUs are active in DCs all over the globe. This affects performance, and people could argue that they bought the CPU for performance and this patch is just killing it.

Intel needs to go back to the drawing board and test their premium CPUs accordingly. Stop making things so complex with IME.
#37
eidairaman1
The Exiled Airman
Jism said:
Hell no. Intel makes billions and owns at least 80% of every x86/x64 market. But it is going to cost them a lot of money, considering how many Intel CPUs are active in DCs all over the globe. This affects performance, and people could argue that they bought the CPU for performance and this patch is just killing it.

Intel needs to go back to the drawing board and test their premium CPUs accordingly. Stop making things so complex with IME.
IME Reminds me of SMB...

By the way

https://www.techpowerup.com/forums/threads/amd-lobbies-to-be-excluded-from-intel-vt-flaw-kernel-patches.240187/
#38
Imsochobo
btarunr said:
If it supports VT-x, it's vulnerable.
No, it's not related to virtualization at all.
#39
Prima.Vera
theGryphon said:
Beginning of the end for Intel?
:laugh::laugh::laugh::laugh: You funny guy :roll::roll::roll::roll:
#40
Parn
Prima.Vera said:
We need more details regarding this.
Personally, I don't want a Windows update to gimp my CPU performance just because it might have a memory leak if I run VM software. Screw that. I'm not using my desktop to run VMs anyway, or if I do, it's for my own personal access.
Same here. I'll postpone all Windows Updates until I figure out which patch (KBxxxx) is targeted at this issue, then hide it. As for my Fedora desktop and CentOS home server, I'll just exclude kernel updates from my monthly dnf/yum runs.
#41
theoneandonlymrk
R0H1T said:
IMO such a large flaw is only disclosed/fixed when vendors are forced to. So either ~

a) some white hat was gonna disclose this exploit pretty soon.
b) there are a sizable number of exploits in the wild, so this needs to be patched asap.

OR all of the above.
Exactly. Take this and IME, and Intel has a busy period ahead of it.
On the positive side, I can't wait for five years from now, when the ramifications of all this are buyable in silicon :). What with the competition in the field, people are going to have to do some extraordinarily good work to compete.
#42
Darmok N Jalad
I guess this is one way to help with slumping PC sales. I’m sure Gen 9 Core will fix this, but magically require a new chipset too. I wonder if the patch will ever get sorted out so unaffected CPUs don’t see the performance hit, or if that will only work when “genuine Gen9 Core” is detected. I know that sounds a bit ranty, but it’s been done before.
#43
R0H1T
Darmok N Jalad said:
I guess this is one way to help with slumping PC sales. I’m sure Gen 9 Core will fix this, but magically require a new chipset too. I wonder if the patch will ever get sorted out so unaffected CPUs don’t see the performance hit, or if that will only work when “genuine Gen9 Core” is detected. I know that sounds a bit ranty, but it’s been done before.
This is a design flaw, it won't get fixed with a mild upgrade from gen 8 to gen 9, assuming gen 9 is even real & not the same respun SKL 8 cores aka WHL. They'll have to rewrite the memory subsystem or something, the only fix is a patched kernel atm.
#44
jahramika
R-T-B said:
That's what most errata is. It will likely require an on-silicon fix to not hurt performance.

AMD had a pretty big one around Ryzen launch too. It just only affected linux, so no one cared.
The AMD Linux issue was fixable with a microcode adjustment. This is not a microcode fix.
#46
yogurt_21
Seems like this would be a big headache for AWS, Azure, and Google.

Not so much if you host your own infrastructure. After all, it appears you already have to have access to a VM on the machine to read the contents of another. Within an organization, that's less of a concern than, say, me having an AWS node that can read the contents of another company's AWS node...
#48
cdawall
where the hell are my stars
R-T-B said:
I'm betting the number will go massively up following its disclosure.
I mean, for it to be relatively unseen in the past 10+ years is interesting, to say the least. I wouldn't want to know the actual count of CPUs affected.
#49
deu
Well... If you can write code to exclude AMD CPUs from running optimal instruction sets for a decade, then you can exclude AMD CPUs from your own panic code when it comes to crippling CPUs due to issues. God damn, I feel the karma is strong with this one. If Intel does not bow down and take the plow on this one, they should get banned from EU server markets. This might be the thing that actually makes me go for AMD, even though they might not have the best product for me. :S
#50
Blo3der-Kuh
The German website Computer Base just posted some benchmarks, including Assassin's Creed: Origins, which is said to be quite CPU hungry because of its "interesting" copy protection.

They are using the latest Win10 Insider build which has the fix enabled. The test system consists of an i7-7700K and an Asus GeForce GTX 1080 Ti Strix.

See screenshot below or this link for all benchmarks. As expected performance in AC only decreases when the CPU is the limiting factor (low details, high framerates). This could mean that the impact is a lot higher on lower performing systems (e.g. i3 or Pentium processors) where the CPU is the bottleneck.
