Friday, June 14th 2024

Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm

Intel has identified the root cause for stability issues being observed with certain high-end 13th- and 14th Gen Core "Raptor Lake" processor models, which were causing games and other compute-intensive applications to randomly crash. When the issues were first identified, Intel recommended a workaround that would reduce core-voltages and restrict the boost headroom of these processors, which would end up with reduced performance. The company has apparently discovered the root cause of the problem, as Igor's Lab learned from confidential documents.

The documents say that Intel isolated the problem to a faulty value in the microcode's end of the eTVB (enhanced thermal velocity boost) algorithm. "Root cause is an incorrect value in a microcode algorithm associated with the eTVB feature. Implication: Increased frequency and corresponding voltage at high temperature may reduce processor reliability. Observed: Found internally," the document says, mentioning "Raptor Lake-S" (13th Gen) and "Raptor Lake Refresh-S" (14th Gen) as the affected products.
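To make the quoted description concrete, here is a minimal sketch of a temperature-gated boost decision and of how a single wrong value could let the extra frequency apply while the chip is hot. It is an illustration only, not Intel's algorithm: the `BoostPolicy` type, the `allowed_max_mhz` function, and every frequency and temperature figure are hypothetical and do not come from the leaked documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoostPolicy:
    """Hypothetical parameters for a temperature-gated boost (not Intel's values)."""
    base_max_mhz: int    # ceiling when the thermal bonus is withheld
    bonus_mhz: int       # extra frequency granted only while the chip is cool
    temp_limit_c: float  # at or above this temperature the bonus is withheld


def allowed_max_mhz(policy: BoostPolicy, temp_c: float) -> int:
    """Return the frequency ceiling for the given core temperature."""
    if temp_c < policy.temp_limit_c:
        return policy.base_max_mhz + policy.bonus_mhz
    return policy.base_max_mhz


# Assumed example: the intended policy withholds the bonus at 70 C and above.
intended = BoostPolicy(base_max_mhz=5800, bonus_mhz=200, temp_limit_c=70.0)
# Assumed example: one incorrect value keeps the bonus available up to 100 C.
faulty = BoostPolicy(base_max_mhz=5800, bonus_mhz=200, temp_limit_c=100.0)

for temp_c in (60.0, 85.0):
    print(temp_c, allowed_max_mhz(intended, temp_c), allowed_max_mhz(faulty, temp_c))
```

With these assumed numbers, both policies agree while the chip is cool, but at 85 °C only the faulty one still grants the higher frequency, and with it the higher voltage that the document says may reduce reliability.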
The company goes on to elaborate on the issue in its Failure Analysis (FA) document:
"Failure Analysis (FA) of 13th and 14th Generation K SKU processors indicates a shift in minimum operating voltage on affected processors resulting from cumulative exposure to elevated core voltages. Intel analysis has determined a confirmed contributing factor for this issue is elevated voltage input to the processor due to previous BIOS settings which allow the processor to operate at turbo frequencies and voltages even while the processor is at a high temperature. Previous generations of Intel K SKU processors were less sensitive to these type of settings due to lower default operating voltage and frequency."
Identifying the root cause of the problem isn't the only good news; Intel also has a new microcode ready for 13th Gen and 14th Gen Core processors (version: 0x125), for motherboard manufacturers and PC OEMs to incorporate into UEFI firmware updates. This new microcode corrects the issue, which should restore the stability of these processors at their normal performance. Be on the lookout for UEFI firmware (BIOS) updates from your motherboard vendor or prebuilt OEM.
Source: Igor's Lab

107 Comments on Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm

#2
trsttte
So it isn't motherboards after all?
#3
Event Horizon
No relief for CPUs that are already degraded from high voltage at high temps. Better late than never though.
#4
Darmok N Jalad
"Previous generations of Intel K SKU processors were less sensitive to these type of settings due to lower default operating voltage and frequency."
Terminal Velocity Boost.
#5
SRB151
TPU left out a part reported on a few other sites, which is that the problem will cause processor degradation depending on how long the chip was exposed to it. No one knows if it will fall under warranty or not.
#6
N/A
Selecting XMP profile raises CPU OC flag and voids warranty. But other than that
Restore stability at their nominal performance, sure. Lower tolerances and you get same performance. Make it make sense.
#7
Nanochip
So a microcode issue from the vendor was the cause of the stability woes. At least the power of the internet and social media put pressure on Intel, and it worked to find a solution…

But the key point is Raptor Lake was factory-pushed further to the limit (of stability) than previous Lakes.
#10
ty_ger
Intel states, "While this issue is potentially contributing to instability, it is not the root cause."

And Intel states that it is "still investigating" looking for a root cause.
#12
user556
ty_ger: Intel states, "While this issue is potentially contributing to instability, it is not the root cause."
Oops, I didn't read to the end. Fair call. :)
#13
Tomorrow
Will Intel honor warranty for 13th gen owners whose CPU got degraded because of this?
#14
mb194dc
SRB151: TPU left out a part reported on a few other sites, which is that the problem will cause processor degradation depending on how long the chip was exposed to it. No one knows if it will fall under warranty or not.
If it's a manufacturing fault it should do, or a class action lawsuit from anyone with a damaged processor will be incoming.

The bottom line is Intel pushed these chips too hard due to the fundamental design not matching AMD chips.
#16
InVasMani
While it's Intel's fault, it certainly wasn't intentional to write faulty microcode. That's just an innocent **** up by someone that went unnoticed.
#17
john_
This is the value in the microcode that was creating the problem

"Win_Benchmarks_At_Any_Cost=YES"

And this is the fix

"Win_Benchmarks_At_Any_Cost=NO"
#18
AusWolf
InVasMani: While it's Intel's fault, it certainly wasn't intentional to write faulty microcode. That's just an innocent **** up by someone that went unnoticed.
...and was blamed on motherboard makers before they finally noticed it.
#19
InVasMani
If they weren't doing dubious, questionable stuff in regards to MB defaults, the finger wouldn't have been pointed in their direction in the first place. In any case, it was a problem regardless. It's an issue for AMD and Intel, and MB makers shouldn't be doing that type of thing with defaults. Nearly everyone who has even a shred of sense and integrity is in agreement with that. It's a bad judgement call by MB makers to eke out more performance to win benchmarks at any cost, as john_ satirically puts it on the microcode. MB makers pushing new BIOS versions with proper defaults that weren't dodgy AF probably saved a number of chips from potentially dying from degradation.
#20
Solaris17
Super Dainty Moderator
Darmok N Jalad: Terminal Velocity Boost.
Listen bro, you can push 40psi of boost at least once. Don't let "big piston" tell you what you can or can't do.
#21
chrcoluk
Interesting interpretation in the news article. My interpretation is that running the BIOS at spec isn't a workaround but a fix, but this microcode fix allows the out-of-spec configuration to run stable again (or at least makes it more likely). I also disagree with the article that the old performance levels are "normal" for the product. Not only were they out of spec, but they were using a faulty microcode that enabled TVB too frequently. I would like to see new reviews on this new microcode. :)

I expect the push for better BIOS defaults will now suddenly vanish, as Intel will want this out of the news ASAP, and the board vendors will want to continue pushing BIOSes that default to out of spec. The solution for both of those is to let the BIOS situation drop.
AusWolf: ...and was blamed on motherboard makers before they finally noticed it.
They are still not innocent; running at spec does stabilise chips. It just looks like there are 2 triggers to the problem.
#22
FoulOnWhite
john_: This is the value in the microcode that was creating the problem

"Win_Benchmarks_At_Any_Cost=YES"

And this is the fix

"Win_Benchmarks_At_Any_Cost=NO"
Nice non-biased comment there, well done.
#23
Denver
Let's get into conspiracy theory mode. I find it quite convenient to release a fix that degrades RPL performance now, months before the release of Arrow Lake. Now I understand the +/- 10% margin of error in Intel's slides.

Pat is a criminal genius. lol
#24
Daven
And some are still saying it’s not Intel’s fault. Sigh…