Monday, April 29th 2024

Intel Statement on Stability Issues: "Motherboard Makers to Blame"

A couple of weeks ago, we reported on NVIDIA directing users of Intel's 13th Generation Raptor Lake and 14th Generation Raptor Lake Refresh CPUs to consult Intel for any issues with system stability. Motherboard makers, by default, often run the CPU outside of Intel's recommended specifications, overvolting the CPU through modifying voltage curves, automatic overclocks, and removing power limits.

Today, we learned that Igor's Lab has obtained a statement that Intel prepared for motherboard OEMs regarding the issues multiple users have reported. Intel CPUs come pre-programmed with a stock voltage curve. When motherboard makers remove power limits and automatically adjust voltage curves and frequency targets, the CPU can be pushed outside its safe operating range, possibly causing system instability. Intel has set up a dedicated website where users can report their issues and get support. Manufacturers like GIGABYTE have already issued BIOS updates intended to maximize stability, although recent user reports indicate that GIGABYTE's settings are still outside Intel spec, setting PL2 to 188 W, loadlines to 1.7/1.7, and the current limit to 249 A. MSI has provided a blog post tutorial for stability, and ASUS has published updated BIOS versions for its motherboards to reflect the Intel baseline spec as well. Surprisingly, not all of the revised BIOS values from the different vendors match the Intel Baseline Profile spec. You can read the statement from Intel in the quote below.
Intel has observed that this issue may be related to out of specification operating conditions resulting in sustained high voltage and frequency during periods of elevated heat.

Analysis of affected processors shows some parts experience shifts in minimum operating voltages which may be related to operation outside of Intel specified operating conditions.

While the root cause has not yet been identified, Intel has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards.

Intel has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency, for example:
  • Disabling Current Excursion Protection (CEP)
  • Enabling the IccMax Unlimited bit
  • Disabling Thermal Velocity Boost (TVB) and/or Enhanced Thermal Velocity Boost (eTVB)
Additional settings which may increase the risk of system instability:
  • Disabling C-states
  • Using Windows Ultimate Performance mode
  • Increasing PL1 and PL2 beyond Intel recommended limits
Intel requests system and motherboard manufacturers to provide end users with a default BIOS profile that matches Intel recommended settings.

Intel strongly recommends customer's default BIOS settings should ensure operation within Intel's recommended settings.

In addition, Intel strongly recommends motherboard manufacturers to implement warnings for end users alerting them to any unlocked or overclocking feature usage.

Intel is continuing to actively investigate this issue to determine the root cause and will provide additional updates as relevant information becomes available.

Intel will be publishing a public statement regarding issue status and Intel recommended BIOS setting recommendations targeted for May 2024.
Source: Igor's Lab
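As a rough illustration of the checklist in Intel's statement, the sketch below flags board defaults that match the items Intel lists. It is a minimal sketch under stated assumptions: the `BiosDefaults` fields and the `find_deviations` helper are hypothetical names invented for this example, not a real BIOS or vendor API, and the recommended PL1/PL2 limits are passed in by the caller because the statement does not give per-SKU values. The Windows Ultimate Performance item is left out because it is an operating system setting rather than a BIOS default.

```python
from dataclasses import dataclass


@dataclass
class BiosDefaults:
    # Hypothetical field names; real BIOS menus label these differently per vendor.
    cep_enabled: bool        # Current Excursion Protection
    iccmax_unlimited: bool   # IccMax Unlimited bit
    tvb_enabled: bool        # Thermal Velocity Boost
    etvb_enabled: bool       # Enhanced Thermal Velocity Boost
    c_states_enabled: bool
    pl1_watts: float
    pl2_watts: float


def find_deviations(bios: BiosDefaults, max_pl1_watts: float, max_pl2_watts: float) -> list[str]:
    """Return one message per default that matches an item in Intel's list."""
    findings = []
    if not bios.cep_enabled:
        findings.append("Current Excursion Protection (CEP) is disabled")
    if bios.iccmax_unlimited:
        findings.append("IccMax Unlimited bit is enabled")
    if not bios.tvb_enabled:
        findings.append("Thermal Velocity Boost (TVB) is disabled")
    if not bios.etvb_enabled:
        findings.append("Enhanced Thermal Velocity Boost (eTVB) is disabled")
    if not bios.c_states_enabled:
        findings.append("C-states are disabled")
    if bios.pl1_watts > max_pl1_watts:
        findings.append(f"PL1 of {bios.pl1_watts} W exceeds {max_pl1_watts} W")
    if bios.pl2_watts > max_pl2_watts:
        findings.append(f"PL2 of {bios.pl2_watts} W exceeds {max_pl2_watts} W")
    return findings


if __name__ == "__main__":
    # Placeholder limits chosen for illustration only; look up the real
    # values for a given CPU in Intel's documentation.
    board = BiosDefaults(
        cep_enabled=False,
        iccmax_unlimited=True,
        tvb_enabled=True,
        etvb_enabled=True,
        c_states_enabled=True,
        pl1_watts=4096,
        pl2_watts=4096,
    )
    for message in find_deviations(board, max_pl1_watts=125, max_pl2_watts=253):
        print(message)
```

Keeping the limits as parameters avoids hard-coding numbers that differ between processor models and that the statement itself does not specify.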

272 Comments on Intel Statement on Stability Issues: "Motherboard Makers to Blame"

#151
Tek-Check
dir_d: Clearly there is a problem if there are 6 pages of knowledgeable people bickering about what is the "Baseline"
The problem has existed for years, namely Intel refusing to tighten up the power specification of CPUs. It's convenient for them as this allows for flexibility in interpreting the spec by OEMs, something similar to what HDMI Forum has done with watering down the spec of HDMI 2.1...

They have been able to get away with this for the most part as long as CPUs were stable enough and without rapid degradation over time. This time it's a bit different because people are returning top CPUs to shops. With permission from Intel, motherboard vendors seem to have simply overcooked it with profiles on steroids. And now they get the blame from their CPU supplier, which sounds to me like hypocrisy.

The bottom line is that a new motherboard should NEVER come with unlocked profiles on steroids as the default, out-of-the-box experience. If the maximum Turbo Boost is defined as 253W, that's what a default setting should always be for users to start with. Vendors can, of course, inform and educate the public that they could enable tweaked profiles with extended power range, higher voltage, etc., as an advanced option to be voluntarily enabled by PC users and not as a factory setting when you power a PC and start using it for the first time. It's nonsense what Intel has allowed OEMs to do.
#152
Solaris17
Super Dainty Moderator
Tek-Check: The problem has existed for years, namely Intel refusing to tighten up the power specification of CPUs
This is wild. They literally post these specs on the intel ark pages. lol.
#153
user556
Intel always had control of this. Intel is the one who decided to keep loosening the limits for each new generation. The website specs were getting more and more hazy each generation. Who knows what Intel was feeding the board makers.

Conclusion: Intel owes end users a refund for false advertising.
#154
Darmok N Jalad
user556: Intel always had control of this. Intel is the one who decided to keep loosening the limits for each new generation. The website specs were getting more and more hazy each generation. Who knows what Intel was feeding the board makers.

Conclusion: Intel owes end users a refund for false advertising.
Let's go all the way back to Comet Lake:
Note, when we asked Intel about why it doesn’t make these hard specifications and how we should test CPUs given that we’re somewhat unable to keep any motherboard consistent (it might change between BIOS revisions) for a pure CPU review, the response was to test a good board and a bad board. I think that on some level Intel’s engineers don’t realize how much Intel’s partners abuse the ability to set PL2 and Tau to whatever values they want.
The question was asked 4 years ago, and Intel shrugged.
#156
Crackong
Dr. Dro: The data sheet is concise and complete regarding Tau length, recommended current and wattage for all models and specifications, you just need to know how to correlate the subtype with the marketed name. S-Processor 150 W means i9 KS, S-Processor 125 W means i9 K, etc.
Not really in an 'Exact and precise' way.

Just like what Buildzoid @Actually Hardcore Overclocking found out in this exact document.
Under 'VCCCORE DC Specifications'
Under 'Processor VCCCORE Active and Idle Mode DC Voltage and Current Specifications (S and S-Refresh Processor Line)'

The only thing being clearly specified is the Maximum value.
And in the Notes section,

Regarding 'reliability', Intel, in point No. 7, states that 'reliability are not assured in conditions above or below Maximum/Minimum functional limits'.
And while the maximum value is specified by Intel, the minimum value is usually '--' and not clearly specified.
These reliability claims are basically useless when the so-called 'Minimum functional limits' do not exist in your specification.

Regarding the 'Recommended current', Intel listed nothing, but in point No. 14 instructed the MB manufacturers to measure and set their own values, with the words 'A superior board design with a shallower AC Load Line can improve on power, performance and thermals compared to boards designed for POR impedance.'

Thus, the MB manufacturers had to figure out their own typical values, with Intel itself encouraging a 'make a more powerful VRM design, then set a shallower LLC' design principle.

From Buildzoid's videos testing Gigabyte's and Asus's baseline profiles,
we can see a trend of 'More LLC + lower Power limit' in these 'baseline' profiles.

Since they did solve some of the crashing problems Buildzoid had,
'LLC being too low' must be one of the root causes of these problems.

However,
since Intel does not provide minimum & typical values for these settings,
and they actively encouraged MB manufacturers to make products with shallower LLC,
Intel clearly deserves at least half of the blame.

It is not rational to put all the blame on MB manufacturers when your specification is

'Hey, here is a thing, from 0-100, figure out your own value, and we suggest you go lower'

And in reality, some of their CPUs lost the silicon lottery game, and randomly malfunctioned below 30.
#157
Dr. Dro
Crackong: Not really in an 'Exact and precise' way.
I mean, these are largely disclaimers anyway, but still a nice find. I'm fortunate to have a pretty awesome board with a direct 19-phase with 105A stages, so it's particularly well behaved on my end.
#158
nguyen
Now Intel sell extended warranty on their K/S chips, problem solved :rolleyes:
#159
Dr. Dro
nguyen: Now Intel sell extended warranty on their K/S chips, problem solved :rolleyes:
Until 2021 or so they actually offered a "tuning protection plan", basically a $20 insurance premium for a one-time, no-questions-asked replacement if you burned your CPU overclocking. Supposedly discontinued due to low demand.
#160
nguyen
Dr. Dro: Until 2021 or so they actually offered a "tuning protection plan", basically a $20 insurance premium for a one-time, no-questions-asked replacement if you burned your CPU overclocking. Supposedly discontinued due to low demand.
I bet that plan would sell like hot cakes now
#161
Sabotaged_Enigma
Dr. Dro: The difference is that Bulldozer sucked, and Raptor Lake doesn't.
Yes true, but I'm talking about instability here.
#162
londiste
The interesting bit from der8auer's videos is that the real problem is undervolting. Not power limits or lack of them. CPUs can handle the power, if not they throttle etc. But what triggered the current set of problems is that boards seem to undervolt to a degree that makes CPUs unstable.
#163
Tek-Check
Solaris17: This is wild. They literally post these specs on the intel ark pages. lol.
Posting specs has nothing to do with tightening up, as @Crackong explained in post #156
#164
Zubasa
Darmok N Jalad: Let's go all the way back to Comet Lake:


The question was asked 4 years ago, and Intel shrugged.
Oh it goes even further back than that. The board vendors have been progressively ignoring every limit one by one and Intel turned a blind eye every time until now.
gamersnexus.net/guides/3389-intel-tdp-investigation-9900k-violating-turbo-duration-z390
First it was the Turbo Duration, then the PL and Voltage, and most recently the Current Excursion Protection on 14th gen.
GIGABYTE Releases CEP Disable Option in BIOS Updates to its Intel Z790 and B760 Motherboards | TechPowerUp
Apparently it comes from a microcode released months ago, and Intel is the only one that can release microcodes for their CPUs.
The list goes on, as long as CB R23 score goes up Intel did not care. The CPU just needed to live long enough for the reviews.
londiste: The interesting bit from der8auer's videos is that the real problem is undervolting. Not power limits or lack of them. CPUs can handle the power, if not they throttle etc. But what triggered the current set of problems is that boards seem to undervolt to a degree that makes CPUs unstable.
MSI: Nope
videocardz.com/newz/msi-z790-max-bios-feature-increases-intel-cpu-throttling-temperature-to-115c
Users have to go out of their way to set the "stock" values on DIY boards.
www.techpowerup.com/review/intel-core-14th-gen-unboxing-preview/2.html
The Asus board used was kindly provided by Intel.
Intel Core i9-14900K Review - Reaching for the Performance Crown - Test Setup | TechPowerUp

#165
Outback Bronze
Apparently, this guy has had a fix for a while..


Me personally - Z790 Apex - 14900KS - All limits removed :)
#166
Assimilator
Outback Bronze: Apparently, this guy has had a fix for a while..


Me personally - Z790 Apex - 14900KS - All limits removed :)
The actual fix is for Intel to stop allowing motherboard manufacturers to play fast and loose with power settings to artificially boost Intel's benchmark scores so that its CPUs don't look like the total rubbish they actually are. As opposed to stupid workarounds like these which boil down to making Intel's problem the user's problem.

Unrelated, this is why I hate most "tech" YouTubers.
#168
dgianstefani
TPU Proofreader
Outback Bronze: Apparently, this guy has had a fix for a while..


Me personally - Z790 Apex - 14900KS - All limits removed :)
He's arrogant, but it's true that locking clocks and manually setting voltages fixes this and other issues. I recommend the same thing for Zen RAM tuning. People, for example, often don't know that XMP/EXPO only sets primary timings and the motherboard will try to train secondaries every boot if you don't manually set them.

It's one of the few complaints I have with Zen X3D, very little you can do manually with clocks.

His attitude is a common one that people who manually overclock have towards automatic or "default" motherboard overclocking/tuning. Situations like these somewhat validate that opinion.
Daven: Looking back at the frontpage polls from the last two years, AMD has been steadily increasing its market share of TPU readers.

Nov 2022 - AMD 60% Intel 40%
What CPU architecture do you use? | TechPowerUp Forums

Aug 2023 - AMD 70% Intel 30%
Are you using an AMD Ryzen X3D CPU with 3D V-Cache? | TechPowerUp Forums

I'm wondering if this debacle will continue to push the enthusiast DIY market towards AMD.
My dude, AMD has the same issues. Remember Meltdown?

AMD just has the benefit of underdog favoritism and a bunch of people who will defend them no matter what, disregarding the fact they are a multinational corporation who should and could do better.

It's almost a meme at this point how bad the first ~year of AGESA is for a new platform.
#169
Vya Domus
dgianstefani: It's almost a meme at this point how bad the first ~year of AGESA is for a new platform.
It's a meme to you maybe, I haven't had any significant issues and I jumped on 7000 series pretty early on.
dgianstefani: AMD just has the benefit of underdog favoritism and a bunch of people who will defend them no matter
AMD also has the benefit of typically not lying and that helps a lot, so even if something goes wrong they don't get as much flak for it. Maybe Intel shouldn't have said in the past that it's actually totally within spec to have a gazillion watt power limit on their CPU. They should have kept their mouths shut, and then the narrative that it was the motherboard makers' fault would have been more believable.

Unfortunately they didn't keep their mouths shut and now if you have a brain it's hard to believe they weren't at fault for this.
#170
AleXXX666
Dr. Dro: I mean, "14th Gen" chips don't have as much as a new stepping. It's just a repackage and re-release of their existing chips to satisfy shareholders, because they didn't have Arrow Lake available on time and Meteor Lake isn't suitable to replace Raptor as a high-performance desktop processor (maxing out at around i5 level). Still, Intel did manage to improve yield to the point that the 14900K is now a mass-produced 13900KS, and the 14900KS pushed that even further, even if it's by only a few MHz or so. I didn't believe they'd be able to pull a 14900KS at all, and they indeed haven't with the "6.5 GHz" claims from early rumors, still, 6.2 with 5.9 all-core is not too bad. It's 300 MHz up from the 13900KS's average, which means they're excellent bins.



The difference is that Bulldozer sucked, and Raptor Lake doesn't.



Everyone does this. Remember how Zen 4 launched with completely broken memory training (it'd take minutes to boot), how it had a clock ceiling on the memory that wasn't because of hardware but because AGESA was flat out broken, how the Ryzen chips actually caught fire because the AGESA-level current control wasn't functional, etc.

Basically: if you want a stable platform nowadays, just don't buy latest-generation gear. "Settle" for like, a Zen 3 or Rocket Lake platform with a fully updated BIOS.



The 320 W setting is considered to be an "Extreme Power Profile" that is exclusive to the Core i9-12900KS, 13900KS and 14900KS SKUs, iirc. Otherwise you're correct.
but, 12 gen is rock solid, made pretty lot of them, however, "cheap way" - ddr4, lol, as those were "gaming" rigs, and ddr5 is "must have" in productivity tasks more...
Alder is hell faster than "rocket (lol) lake"
#171
dgianstefani
TPU Proofreader
Vya Domus: AMD also has the benefit of typically not lying
Amazing, thanks :laugh:
AleXXX666: but, 12 gen is rock solid, made pretty lot of them, however, "cheap way" - ddr4, lol, as those were "gaming" rigs, and ddr5 is "must have" in productivity tasks more...
Alder is hell faster than "rocket (lol) lake"
What?
#172
Daven
dgianstefani: My dude, AMD has the same issues. Remember Meltdown?

AMD just has the benefit of underdog favoritism and a bunch of people who will defend them no matter what, disregarding the fact they are a multinational corporation who should and could do better.

It's almost a meme at this point how bad the first ~year of AGESA is for a new platform.
I guess we will get the last of the big three (AMD, Nvidia, Intel) status check with AMD earnings report this evening. Since their last quarterly report, Nvidia is up almost 100%, AMD is flat and Intel is down almost 50%. I feel like this matches how good each company’s products are faring.
#173
dgianstefani
TPU Proofreader
Daven: I guess we will get the last of the big three (AMD, Nvidia, Intel) status check with AMD earnings report this evening. Since their last quarterly report, Nvidia is up almost 100%, AMD is flat and Intel is down almost 50%. I feel like this matches how good each company’s products are faring.
IDK. Intel is in a transition period with first gen GPUs (which are surprisingly good) and revamping their foundry approach plus going disaggregated. Would not be surprised to see them coming out swinging this year.
#174
RJARRRPCGP
dgianstefani: I boot in about 45 seconds. Once you get to Windows it's very fast, it's the BIOS POST that takes ages.
LOL, that's the meme of my socket 775 and socket 1366 systems: the BIOS takes ages, but they perform well in Windows.
#175
Solaris17
Super Dainty Moderator
Tek-Check: Posting specs has nothing to do with tightening up, as @Crackong explained in post #156
You think they would give us, the users, what they do on ARK and NOT give the OEMs the values when they, I don't know, tell them how to use the CPUs in a new chipset? That's a new level of denial. AMD board partners do the same shit. Did AMD not give them the specs when ASUS was burning holes in the core?

Get real.