
Intel Statement on 13th and 14th Gen Core Instability: Faulty Microcode Causes Excessive Voltages, Fix Out Soon

I know that my confidence in the Intel brand has been shaken.
 
The workload absolutely matters, and I think everyone knows that. Using consumer grade hardware for server use isn't going to cause legal problems for Intel.
Server grade hardware is certified for high load 24/7 for x years. Running a private Minecraft server isn't exactly something that causes sustained peak load all the time (but hosting a lot of them for a company may be a lot more). Running a machine 24/7 without a significant load isn't much of a problem, I've been doing that for most of my machines for over a decade, and like many others in the industry I too have a home "server" for files/media/git/building/etc., but it's nothing with sustained high load, if so I would have to choose hardware accordingly.
Why workload doesn't matter :
From what I understand, and from CPU only point of view (because topic) - there is no "time to fail" timeframe.
You can run whatever thing you want, for however long you want (using validated hardware and settings).
The only limitation from Intel side, is warranty period that CPU is guaranteed to work (with replacement option if it doesn't hold up that long).
If you can provide some Intel documents, that state "running X program will decrease your warranty due to CPU wear", or "using this consumer CPU for XYZ task voids your warranty on it", we can get somewhere - otherwise, there is no point.

Again, the x86 standard was made for a reason (from both a hardware and a software perspective).
"Consumer grade" hardware is made to work everywhere, regardless of use case.
"Prosumer" stuff gets extras that are wanted/required by companies (to guarantee whatever they need/want out of the hardware), and simplifies the process of making Intel liable for damages (if something goes wrong on their end).
That doesn't mean Intel is immune to hardware failures of its own making on "consumer grade" hardware (which is what the current situation is, BTW). The process is just lengthier and more complicated in such cases.

Issue: highest frequency on TVB. The combination of high single-core usage + low running temps = high chance of degradation, due to the very high VID requested by the CPU itself to keep itself "stable" under such conditions.
^This is not "bad use case", it is bad manufacturer practices.
IF manufacturer knows what it's doing, there are no "bad use cases".
 
What is going to happen now that the genie is out of the bottle?
- will Intel publish serial numbers of CPUs affected by overvoltage and/or oxidation so that owners could identify them and file RMA?
- will second hand market go completely bonkers, as no buyer will know for sure whether they buy an affected CPU?
- how many online gaming companies will switch to AMD systems?
- will confidence in Intel brand and reliability suffer?
- so many questions...
I'm pretty sure that's why Panic Lake... er, Bartlett Lake is all of a sudden releasing. Also the 14901EK (LOL).

"Don't ask questions, here's the new CPU, no E-cores, put it in there and don't ask questions"
 
I know that my confidence in the Intel brand has been shaken.
Yes, Intel have made mistakes. There was an Oxidation issue with an early batch of 13th gen chips in 2023. The issue was identified and corrected, so I don't think there should be any concerns about the quality of its CPUs. Intel did not show as much transparency as it should have done on the Oxidation issue, it should have announced it at the time. Things go wrong with CPUs. The Raptor Lake issues have been corrected by BIOS updates including those with microcode updates and so Intel say only one issue remains which will be resolved with an August BIOS update. Other than that, Lunar Lake is on schedule, Arrow Lake desktop and mobile is on schedule so the latter part of the year may be easier.
 
There was an Oxidation issue with an early batch of 13th gen chips in 2023. The issue was identified and corrected, so I don't think there should be any concerns about the quality of its CPUs. Intel did not show as much transparency as it should have done on the Oxidation issue, it should have announced it at the time. Things go wrong with CPUs. The Raptor Lake issues have been corrected by BIOS updates including those with microcode updates and so Intel say only one issue remains which will be resolved with an August BIOS update.
And you believe all of that? From the same company that promised you 10nm was fine :wtf:
 
Nothing to see here guys, everything is fine....
 
There was an Oxidation issue with an early batch of 13th gen chips in 2023. The issue was identified and corrected, so I don't think there should be any concerns about the quality of its CPUs. Intel did not show as much transparency as it should have done on the Oxidation issue, it should have announced it at the time.
IMO, Intel hasn't said enough on the oxidation issue; there needs to be a full recall with serial numbers of affected CPUs.
I would expect a company the size of Intel, which owns its own fabs, to do this right and fix it as soon as possible, knowing which CPUs were affected.
Things go wrong with CPUs. The Raptor Lake issues have been corrected by BIOS updates including those with microcode updates and so Intel say only one issue remains which will be resolved with an August BIOS update.
Except things like oxidation at the fabrication level, CPU failures after a year, or CPUs failing new out of the box aren't normal things to go wrong.
The BIOS updates aren't a fix for CPUs that have already degraded, and baseline power limits mean lower-than-claimed performance. I'm surprised someone hasn't started a class action yet.
Also, we have yet to see whether the August BIOS update is really a fix. I personally don't trust Intel's microcode to be a real fix rather than just postponing degradation until an out-of-warranty failure.
Other than that, Lunar Lake is on schedule, Arrow Lake desktop and mobile is on schedule so the latter part of the year may be easier.
That is fine for shareholders and OEMs, but not for consumers who have already gotten screwed with potentially defective CPUs. It's even worse for laptop owners, as people are experiencing problems similar to those with desktop CPUs; Intel seems to have ignored that issue and instead blames OEMs.
 
These are consumer grade chips, not server grade at all.
The entry level server chips are called the E-2400 series, e.g. Xeon E-2488, which is the Xeon part that closely resembles i9-13900K/14900K, except for it lacking aggressive boost and voltage. If they had gone for proper server grade parts, they likely would never have seen these issues.
I must add that several outlets call W680 boards "server grade". They are not; they are workstation boards. Don't get me wrong, they are good boards, but they wouldn't stop the consumer CPUs from aging prematurely.

If we assume these i9s "age" 4-5x faster than expected due to too much voltage, then this would easily explain why people using these as "servers" would see them fail after ~3-6 months. This only tells us that they've gotten away with consumer grade hardware in the past, because CPUs from the past few generations have been very reliable.
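
To put rough numbers on that assumption, here is a quick back-of-the-envelope sketch. The 4-5x factor and the 3-6 month failure window are the figures from the paragraph above; treating wear as a simple linear multiplier is my own simplification, and the function name is just illustrative.

```python
# Back-of-the-envelope check of the aging assumption above.
# Assumption: wear scales linearly, so lifetime at the normal aging rate
# is simply the observed lifetime multiplied by the acceleration factor.

def implied_normal_lifetime_months(observed_months: float, aging_factor: float) -> float:
    """Lifetime the same chip would have had at the expected aging rate."""
    return observed_months * aging_factor

low = implied_normal_lifetime_months(3, 4)   # earliest failure, smallest factor
high = implied_normal_lifetime_months(6, 5)  # latest failure, largest factor
print(f"Implied lifetime at the expected aging rate: {low:.0f}-{high:.0f} months")
```

Under that simplification, the same chips under the same sustained load would have lasted somewhere between one and two and a half years at the expected aging rate, which is only a rough bound and not a measured figure.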


We don't know what they did, and until we have evidence we shouldn't speculate.
What we should do instead is to encourage those with contacts within Intel to publicly address this more precisely;
- Which product ranges were affected?
- For how long did this problem happen?
- Was this limited to certain production lines or everything?

I didn't say they are server grade CPUs. But consumer grade CPUs shouldn't degrade anywhere remotely close to what's been happening. My point was that these were sold directly by Intel, so they knew full well what they'd be used for. And with the Minecraft servers dying in a couple of months, it just looks bad on Intel that a 2 year old CPU didn't go through internal testing to figure out it's absolutely not cut out for these tasks.

Also, as I mentioned earlier, it's pretty common for these game hosting servers to use consumer grade CPUs, and Intel were perfectly happy to supply them with such. It's why AMD released EPYC 4004, so surely there's a market and this is not entirely atypical, albeit rare compared to the server market as a whole.

W680 are workstation, not server. I agree with that.

Also, I'm not speculating. Intel obviously knew about the oxidation, because they fixed it themselves. But they never disclosed it, and there's proof of them denying RMA to many vendors even though they knew full well there's a good chance those chips might have been lemons. To release a statement saying there was a case of oxidation in 2023 AFTER third party reviewers started suspecting it just looks dodgy.
 
Intel make no reference to structural degradation in the statement on their website. They only talk about instability, a fault in the algorithm and the microcode update fixing the algorithm. Intel microcode updates are delivered as BIOS updates by their motherboard partners. The 'leading to their structural degradation over time' is an editorial comment.
Well, old news now, but there was degradation as a result of the microcode algorithm issue, and alongside that there was the oxidation issue, for which Intel have so far stated which batches were affected and over what period...
 
Wow, this thread hit a die-hard carb diet and fattened up like a 14-tub lard!

Nothing new though... Intel has recently been pushing to scrape out every last bit of perf to win the race. Those high power consumption numbers at full throttle were always concerning, and performance degradation was highly probable; it was only a matter of time before the shit hit the fan. It's unfortunate though... there was no need to push the already lavish perf we're seeing from both camps. For gamers, even 12th gen (or Zen 3 X3D) is a blast.
 
Why workload doesn't matter :
From what I understand, and from CPU only point of view (because topic) - there is no "time to fail" timeframe.
You can run whatever thing you want, for however long you want (using validated hardware and settings).
The only limitation from Intel side, is warranty period that CPU is guaranteed to work (with replacement option if it doesn't hold up that long).
If you can provide some Intel documents, that state "running X program will decrease your warranty due to CPU wear", or "using this consumer CPU for XYZ task voids your warranty on it", we can get somewhere - otherwise, there is no point.

Again, the x86 standard was made for a reason (from both a hardware and a software perspective).
"Consumer grade" hardware is made to work everywhere, regardless of use case.
"Prosumer" stuff gets extras that are wanted/required by companies (to guarantee whatever they need/want out of the hardware), and simplifies the process of making Intel liable for damages (if something goes wrong on their end).
That doesn't mean Intel is immune to hardware failures of its own making on "consumer grade" hardware (which is what the current situation is, BTW). The process is just lengthier and more complicated in such cases.

Issue: highest frequency on TVB. The combination of high single-core usage + low running temps = high chance of degradation, due to the very high VID requested by the CPU itself to keep itself "stable" under such conditions.
^This is not "bad use case", it is bad manufacturer practices.
IF manufacturer knows what it's doing, there are no "bad use cases".
^^ this

You know you can get OS to show you 100% utilization even when it's at lighter workload, right? FPU stress test or AVX workload is different workload to gaming, yet gaming might easily show 100%.

Workload theory would not stand a chance in a court. Simply, it's nowhere explicitly defined. What is the level of light and heavy workload? What instruction sets?
Is heavy workload that scenario, in which CPU keeps getting into temps above 100°C for specified amount of time? What if the cooler is not installed properly? You can get high temps that way even with light workload. Let's monitor voltage, current and watts during a specified period of time. But, then, uhm ... this might be heavy workload for Intel CPU but AMD might handle it with much less resources. So what now? Well, it comes to a workload definition per chip basis - a different definition for every one CPU SKU, because with different resources (cores, clocks) the workload limits would change as well. It is not defined and I doubt that someone would be crazy enough to try to do it. (But I might be wrong. Please post some link if such definitions do exist.)

Again, there is nothing wrong with using a non-server (consumer grade) CPU for server tasks. It may be dumb, but it's not wrong. However, it must endure anyway, be it consumer grade or not. It must implement various safeguards to prevent itself from being damaged by ANY type of workload. That's what current, thermal and power draw protections are there for. If under any workload such protection allows the CPU to degrade, well, then it's a shitty/pointless protection to me.

If there is enough knowledge about a particular CPU (or process node) that it is prone to degrade when stressed by more than 1.45V for XY minutes, then do something yourself to avoid it at all costs, and don't pass the responsibility on to users or motherboard manufacturers. Apply some sort of "workload throttling" that would decrease clocks and voltages, so that the stress put on the CPU is lowered after some time. But hey, isn't that what PL1 and PL2 are (kind of) there for?
Let us overshadow AMD and dominate every benchmark by setting the limits to infinity (and beyond)! And then the shit hits the fan ...
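
For illustration, the "workload throttling" idea above could be sketched as a toy governor like the one below. This is only a sketch under stated assumptions: the 1.45V threshold is the figure from the paragraph above (not an Intel spec), while the time budget, the reduced voltage cap, and the `ThrottleGovernor` name are hypothetical values made up for the example. It is not how Intel's firmware works.

```python
from dataclasses import dataclass


@dataclass
class ThrottleGovernor:
    """Toy model: cap the voltage once time spent above a threshold uses up a budget."""
    stress_voltage: float = 1.45  # volts; figure from the post, not an Intel spec
    budget_seconds: float = 60.0  # hypothetical allowance for time above the threshold
    reduced_cap: float = 1.30     # hypothetical cap applied once the budget is spent
    time_over: float = 0.0        # accumulated seconds above the threshold

    def step(self, requested_voltage: float, dt: float) -> float:
        """Return the voltage actually granted for a time slice of dt seconds."""
        if self.time_over >= self.budget_seconds:
            # Budget exhausted: never grant more than the reduced cap.
            return min(requested_voltage, self.reduced_cap)
        if requested_voltage > self.stress_voltage:
            self.time_over += dt
        return requested_voltage


# Usage: a 2-second budget, with the CPU asking for 1.50V in 1-second slices.
gov = ThrottleGovernor(budget_seconds=2.0)
granted = [gov.step(1.50, 1.0) for _ in range(4)]
print(granted)  # first two slices get 1.5, later ones are capped at 1.3
```

The budget here never recovers, which keeps the example short; a real mechanism would presumably average over a time window, which is roughly the role PL1/PL2 play for power.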

Regardless of the fact that they used a non-server CPU as a server CPU for a game hosting service, the CPU was configured/allowed to operate in critical conditions by the manufacturer itself.
Well, Intel has been supporting the whole Extreme profile theory for a while now and has been giving the motherboard makers a free hand. This whole scandal was bound to happen.

 
Intel should do a recall rather than ask customers to go to the service center, since this is a widespread problem. And I still doubt power and voltage limits are the main culprits here. These were probably implemented as preventative measures so as not to worsen/accelerate the issue. If this is a node/fab issue, I think Intel will want to cover it up as much as possible, because it would severely impact their CPU and fab businesses at the same time. So better to damage one than both of their main businesses.

The way Intel is handling (or mishandling) this issue will come back and bite them hard. Acknowledging that there were known issues with their chips at some point in time, but rejecting RMA is bad after sales service and shady business tactics. And in this case, they were forced to come up with some sort of official statement because the issue is getting out of hand, prompting many people to start testing for the root cause of the issue. Those notable posts from game developers claiming high repeatable failure rate of Intel CPUs are very damaging for their chip business. I feel Intel is trying to not implicate their fab business in case this is truly a fab issue. Otherwise their efforts to try and get more companies to use their fab will fail. I feel we are like watching a "car crash" in slow motion here.
 
Workload theory would not stand a chance in a court. Simply, it's nowhere explicitly defined.

Again, there is nothing wrong with using a non-server (consumer grade) CPU for server tasks. It may be dumb, but it's not wrong. However, it must endure anyway, be it consumer grade or not. It must implement various safeguards to prevent itself from being damaged by ANY type of workload. That's what current, thermal and power draw protections are there for. If under any workload such protection allows the CPU to degrade, well, then it's a shitty/pointless protection to me.
Exactly. Intel itself does not say that using consumer CPUs in servers would invalidate their warranty or somehow make them more unstable than server CPUs.
The main differentiating factors between consumer and enterprise CPUs are supported features, level of support and performance. Not their ability to run a certain workload for xx minutes on a specific SKU without failure, with some kind of MTBF metric.
Intel should do a recall,
I feel we are like watching a "car crash" in slow motion here.
Indeed. If degradation really is happening due to excessive voltage and microcode like Intel says then ALL 13th and 14th gen CPU's that are in use are already degraded. It's not a matter of IF they will fail, it's a matter of WHEN they will fail. Any further fixes to voltage only slows down further degradation but the damage is already done. There is no way to reverse the damage that has already been done at silicon level.
 
and yet my 6 year old i9 7900x Skylake+++ keeps humming along.
10 cores, 4 memory channels, AVX512, Even Plays Ultra BluRay disks
 
Indeed. If degradation really is happening due to excessive voltage and microcode like Intel says then ALL 13th and 14th gen CPU's that are in use are already degraded. It's not a matter of IF they will fail, it's a matter of WHEN they will fail. Any further fixes to voltage only slows down further degradation but the damage is already done. There is no way to reverse the damage that has already been done at silicon level.
You're probably right. A microcode problem at this level theoretically allows all of those CPUs to be affected by degradation. Some samples might tolerate more voltage, some less. I would compare it to AMD's X3D SoC voltage problem. Yes, back then the motherboard manufacturers failed and implemented EXPO profiles incorrectly, but the point is that not every X3D CPU was affected. However, I'm afraid that Intel's case is not comparable to AMD's in terms of failure rate, so it's much worse.

Personally, I'd prefer getting my money back to getting a CPU replacement. It's been proven multiple times that there is more than a slight chance that you will get another affected CPU as a result of the RMA process. In some extreme cases those CPUs died within a week, while the old chips started to malfunction after a few months.

I just don't get it. This whole thing is about quality assurance: testing, testing, testing. I thought it was impossible for such an issue to emerge after the '90s. QA has really evolved since then. Intel just needed to thoroughly test a few SKUs. That actually seems to be quite an easy task compared to AMD's X3D SoC voltage issue, where testing the X3D CPUs in all the supported motherboards is nearly impossible.
 
and yet my 6 year old i9 7900x Skylake+++ keeps humming along.
10 cores, 4 memory channels, AVX512, Even Plays Ultra BluRay disks
Well, it also had an MSRP of $999, along with an expensive X299 motherboard, and the quad-channel RAM runs at only DDR4-2666 speeds.
Not sure how a CPU helps play BluRay but ok.
 
So how is it that W series and other locked chipsets are also killing these CPUs?
They use the same microcode, the microcode reports the VID (voltage request) to the motherboard.

So in summary, they were over volting CPUs (interesting as we have been discussing that board vendors have been applying under volts), and as a result "some" chips have degraded to the point they are unstable, Intel will grant RMA for these chips.

I would like to see a number of years added to that statement for how long they will approve RMA for, because in my opinion it needs to probably be 5 years minimum. Not a joke 1-2 years.

Luckily I under volt my chip on both cache and core, it is working fine currently.

But "6Ghz" has never been normal. Why would you go with something absolutely brand new & having basically zero reliability or history to lean on :wtf:

The more you think this through, the more you'll realize the guys doing this are a lot more at fault than you'd admit to!
You can apply this to every tech produced, most PC related hardware is "trying something new" when it first appears.

If I didn't buy from a company again after I (or a friend) had problems, I would have blacklisted the following companies. There may be more; these are the ones I most easily remember.

AMD - FX degradation making stock unstable
Asus - Failed capacitors and unstable voltages in their "asus optimised" bios.
Asrock - Unsafe voltages in their bios when activating XMP and stock settings exceeding tjmax spec.
BenQ - Monitor with flawed DisplayPort (display won't wake up if turned off whilst using DisplayPort; system needs to be rebooted to wake up the port on the monitor).
Crucial - SSD shipped with flawed firmware.
EVGA - GPU shipped with unstable v/f curve out of the box.
Gigabyte - GPU shipped with unstable v/f curve out of the box on performance bios.
Kingston - Numerous SSDs failing.
Viewsonic - Failed monitor, and Monitor RMA switcheroo was a replacement with same fault.
 
I'm soooooo freaking happy that I didn't jump on the Intel bandwagon when I bought me a new computer in February of 2023.

My AMD Ryzen 7 7700 (non-X) CPU does everything I use my computer for pretty nicely with little power draw.
 
Yep, i share the same thoughts. I've been on "intel inside" since the days of yore and my 'primary' gaming and work builds have always maintained that principle (not necessarily as a loyalist, but a familiar-ist). My first venture with AMD was in Apr 24 for a dedicated gaming rig housing a 5800X3D. That was built on a ditched AM4 build. CPU bound X3D uplift and efficiency sold it for me. A life-time of habitual Intel fidelity is a tough one to let go... esp. with zero complaints/regrets. It's just nice to see both camps are competing at the top of their game even if momentarily they get punched in the teeth. It is what it is, tough compo focusing on small pockets of triumphant oomph unfortunately at times at the cost of healthy competition. I'm not one for the latest and best and definitely not an early adopter... so can't really complain about falling victim to any of Intel/AMDs fleeting misadventures.

IMO, we're good in the CPU space. It's the GPU realm where I'd like to see AMD and/or Intel kick arse! Nvidia's got it too good, a license to bleed the consumer, which is never a good thing.
 