
Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm

It would take something radical from Intel for me to go with them for my next build. My current Ryzen build is almost 4 years old and due for an upgrade. It runs cool and quiet; I had a couple of issues in the first 6 months after the Windows 11 release, but who didn't at that time? After that was sorted, everything is back to running flawlessly. Considering a Ryzen 9700X for my next build.

Intel needs a 3-5x performance-per-watt increase for me to go back (for real, just look at the 7800X3D benchmarks right here at TPU; Intel is appalling in efficiency). A new architecture and moving away from their ancient lithography to Intel 20A might do it.


What that chart doesn't include, though, is E cores disabled. The 7800X3D's position is indicative of the stalled progress on core and thread scaling that we're stuck with because consoles are still limited to 8 cores and 16 threads. Disable the E cores and suddenly the 14700K is a better-binned 14600K with two more P cores. The 7800X3D and other 8c/16t CPUs are in the sweet spot of what developers are targeting right now with the current consoles on the market. Expecting that to simply remain the same indefinitely is fool's gold, though. It also has less need for better-quality DDR5 memory thanks to that slab of stacked cache, but it can still benefit from it, just not as greatly as Intel chips will with their smaller cache and stronger IMC.

I don't really get why w1zzard tested that at 1080p, though, while in other cases 720p is used to better represent a CPU bottleneck. I don't think it would change things dramatically, but it probably would push CPU core and thread usage higher in some scenarios. Anyway, we need to transition away from 8c/16t consoles before we see forward progress beyond that become standard. You can find examples where developers have targeted better hardware resources, but it won't become common until we see a shift in the largest audience developers target, which is the console market.

This really isn't about which is better for which use case under which testing scenario, however. This is about Intel making a bad decision or blunder, and "yes and/or maybe" is kind of what we've gathered on the matter to this point.
 
Intel was competing, though; AMD had to resort to slapping some cache on top to compete. Without the 3D V-Cache, it's AMD who would be behind. In a straight non-V-Cache contest, regardless of power, Intel is better in everything.

What is AMD better at? AVX-512, no E-cores or problems shifting loads, price on some parts, power consumption (which means less heat and a smaller electricity bill), and of course it actually does have a V-Cache part that is more performant in gaming. Zen 5 also has more PCIe lanes than Alder/Raptor Lake.

It's not really clear how Intel is "better in everything"; what were you referring to specifically?

Yeah, but at what cost? In a lot of cases, our electricity bills and potential future silicon degradation.
Yeah, it's a serious issue that cannot be discounted; power consumption and heat on these things are out of control.
 
How is it not their own tech? They are the ones who had the idea to add cache on top of the CPU and designed a working model of that idea, then used TSMC fabrication technology to put it into practice. Just like Intel is doing with Foveros and EMIB, except Intel is vertically integrated with their own fabs, so they have to design both parts of the solution. If AMD did nothing and just used someone else's tech, how come they're the only ones doing it?

If you want to use that stupid argument, well, neither of them does anything; they're all just using what ASML makes possible with their machines. It's a ridiculous idea.

Yeah, it was AMD's idea, sure.

https://www.techinsights.com/blog/amd-ships-3d-v-cache-processors

The company used two TSMC innovations to create it.


https://www.techpowerup.com/review/amd-ryzen-7-5800x3d/2.html

Without TSMC it would not exist.
 
Without TSMC it would not exist.
That's the same BS argument as "Zen would not be successful without TSMC." Then I need to remind people that Zen actually started on GlobalFoundries' 14nm process before transitioning to TSMC with Zen 2 (the 3000 series). Sure, the move made it better, first and foremost because it was 7nm vs 14nm, but the groundwork had already been laid.

Also, slapping a heap of cache on top of the die is not a guaranteed success. HUB has videos exploring various Intel CPUs with varying amounts of cache, and while bigger=better helps, it's not as universal a win for Intel's architecture as higher clock speeds are.

Also, 3D V-Cache is not an AMD-exclusive technology. Other TSMC customers can also use it, including Intel.
Die-thinning and TSVs are also not purely TSMC's innovation, as TSVs had been used in HBM memory before that by Korean memory makers.

Both AMD and Nvidia (I believe Intel too) are also using another TSMC technology that's in the news: CoWoS.
I don't see you downplaying them for some reason, just AMD.
 

But Zen would NOT be successful without TSMC. GlobalFoundries does not have a modern manufacturing process suitable to build these processors on, and more cache does not necessarily mean better; in fact, there are several scenarios where the Ryzen X3D chips regress in comparison to the standard models. This occurs because 3D V-Cache incurs a cycle penalty and data takes longer to be processed, which means the standard model is better if the data set fits within its capacity. Also, 3D V-Cache is an AMD technology; TSMC is just a foundry and builds chips to the specifications of its customers.
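To make that tradeoff concrete, here's a back-of-the-envelope average memory access time comparison; every hit rate and latency figure below is invented for illustration, not measured on any real CPU:

```python
# Illustrative AMAT (average memory access time) sketch. All numbers are
# assumptions, not measurements of real silicon.
def amat(hit_rate, cache_ns, dram_ns=80.0):
    """Hits pay the cache latency; misses pay the DRAM round trip."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns

# Standard die: smaller but faster L3. X3D-style: bigger L3 with a small
# added cycle penalty (modeled here as 10 ns vs 12 ns).
print("data set fits either cache:   "
      f"{amat(0.98, 10.0):.1f} ns vs {amat(0.98, 12.0):.1f} ns")
print("data set spills the small L3: "
      f"{amat(0.70, 10.0):.1f} ns vs {amat(0.95, 12.0):.1f} ns")
```

With these made-up numbers, the faster cache wins when the working set fits (11.4 ns vs 13.4 ns) and loses badly when it doesn't (31.0 ns vs 15.4 ns), which is exactly the regress-or-win pattern described above.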

Intel's 3D technology is called Foveros, which was first seen in the Lakefield processor. It can be used to integrate every component in an SoC. Lakefield was very much a proof-of-concept that made it to market (released as a mobile Core i5 in very limited quantities for one particular Samsung laptop) and, as an example, featured one P-core and four E-cores (both of the first-generation kind, similar to those seen in Rocket Lake), plus GPU and DRAM fully integrated on-die. It was a sort of Alder Lake prototype, in a certain way.


CoWoS stands for Chip on Wafer on Substrate, and it's got nothing to do with 3D stacking technology; it's similar to Intel's EMIB, a 2.5D system.




The breakthrough will be combining this 2.5D packaging with 3D stacked dies to maximize density.

According to a report over at Techspot.com, Intel still doesn't know what's going on with the Core i9. My thoughts are that this is simply a symptom of Intel pushing a 15-year-old microarchitecture way past its breaking point.

At this point, I think Intel needs to recall every last Core i9 ever sold and issue refunds for selling what is a defective product.

Intel still doesn't know what is causing its i9 desktop chips to crash | TechSpot

Raptor Lake is Nehalem rehashed 15 times over, every year, in the same way Zen 4 is a direct descendant of the K5, yes. :kookoo:

I wasn't affected, but I can easily see where it's all going wrong: bad motherboards, bad real-world operating conditions, and underlying microcode bugs... no wonder it's the i9s that have a problem while the i7s, with more down-to-earth clocks and no fancy thermal boost, are largely immune.
 
But Zen would NOT be successful without TSMC.
Zen was successful already on 14nm. 7nm by TSMC just made it better.
GlobalFoundries does not have a modern manufacturing process suitable to build these processors on,
We don't know if GF would be competitive had they not axed their sub-10nm plans.
and more cache does not necessarily mean better, in fact, there are several scenarios where the Ryzen X3D chips regress in comparison to the standard models.
Mostly clock speeds.
This occurs because 3D V-Cache incurs a cycle penalty and data takes longer to be processed, which means the standard model is better if the data set fits within its capacity.
This penalty is very small. Standard models excel in tasks that benefit from raw clock speed.
CoWoS stands for Chip on Wafer on Substrate, and it's got nothing to do with 3D stacking technology, it's similar to Intel's EMIB, it's a 2.5D system.
I was not comparing the two. I was giving an example of another technology that all three companies use.
 
But Zen would NOT be successful without TSMC.
Does it matter, though?

Nvidia wouldn't be successful without TSMC and Samsung, either. So what?
 
Does it matter, though?

Nvidia wouldn't be successful without TSMC and Samsung, either. So what?

A is true because B is true; so that means B is true because A is true :kookoo:

I do not see the correlation between other customers' portfolios and the fact that... you couldn't build a modern Zen CPU on GlobalFoundries' latest node
 
A is true because B is true; so that means B is true because A is true :kookoo:

I do not see the correlation between other customers' portfolios and the fact that... you couldn't build a modern Zen CPU on GlobalFoundries' latest node
AMD relies on TSMC for their CPUs, which is bad. Nvidia relies on TSMC for their GPUs, which is good. Am I the only one seeing a massive gaping contradiction here? :kookoo:
 
This is the topic:

Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm​


Please stick to it and stop the pointless tribal bickering.
 
I would like to finally see an example of somebody getting instability issues after having everything set correctly from the start. Maybe not even something as hardcore as a 125 W PL1, but with all, or at least the majority, of the settings from Intel's blue table applied, and with motherboard inventions like Multi-Core Enhancement turned off. Boards have long been known for stupid "default" ideas, to the point that you can't trust them; checking CPU behaviour is one of the first things to do after building a computer.
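On the "checking CPU behaviour" point: if you want to verify what limits the board is actually enforcing rather than what the BIOS screen claims, here's a minimal sketch for Linux, assuming the msr driver is loaded and you have root; the register layout is from Intel's SDM, so treat the output as something to sanity-check against your BIOS:

```python
# Minimal sketch, Linux only: read the package power limits in force.
# Needs root and the msr driver (`modprobe msr`). Register addresses and
# bit fields per Intel's Software Developer's Manual.
import struct

MSR_RAPL_POWER_UNIT = 0x606   # bits 3:0 encode the power unit
MSR_PKG_POWER_LIMIT = 0x610   # bits 14:0 = PL1, bits 46:32 = PL2

def read_msr(reg, cpu=0):
    with open(f"/dev/cpu/{cpu}/msr", "rb") as f:
        f.seek(reg)
        return struct.unpack("<Q", f.read(8))[0]

watts_per_lsb = 1.0 / (1 << (read_msr(MSR_RAPL_POWER_UNIT) & 0xF))
raw = read_msr(MSR_PKG_POWER_LIMIT)
pl1 = (raw & 0x7FFF) * watts_per_lsb
pl2 = ((raw >> 32) & 0x7FFF) * watts_per_lsb
print(f"PL1 = {pl1:.0f} W, PL2 = {pl2:.0f} W")
```

On boards that "unlock" power, these fields often read back at their maximum (~4095 W), i.e., effectively unlimited, which is exactly the kind of default the post above is complaining about.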
 
It's not just the power or the temperature; no CPU should ever be allowed to boost at 1.45 volts. My comfort limit for 7 nm would be 1.35 V, and 1.25 V for 2 nm and onwards.
 
For once, my old-school overclocking of a fixed frequency at a fixed voltage is better :). I never have to worry about these boosting problems.
 
I'm getting the impression that ICCMAX defaults and/or recommendations are one of the bigger instability factors. Intel really should have put ICCMAX in an easy-to-find location on the product page for each chip SKU instead of burying it in an obscure PDF somewhere that you can maybe find in the dark-web region of its website if you're an internet archive archeologist. Intel should know better than that. It's a huge oversight on their behalf, and it will probably be argued against them in any class-action lawsuits that come out of this whole broken-chip fiasco.
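As a back-of-the-envelope illustration of why a permissive ICCMAX matters (the voltage and current figures here are assumptions for the arithmetic, not Intel's published limits):

```python
# P = V * I: a generous board-level current cap at boost voltage implies
# a huge transient package power. Both figures below are assumed, not spec.
vcore_volts = 1.45    # a boost voltage of the kind users have reported
iccmax_amps = 400     # a permissive motherboard ICCMAX setting
print(f"worst-case transient power: {vcore_volts * iccmax_amps:.0f} W")  # ~580 W
```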

If they can figure it out and come up with a real solution, without arbitrarily impacting performance in a meaningful way, that would be ideal, but I have my reservations about that actually happening. It seems a lot like another Spectre/Meltdown situation, though they got away with that mostly unscathed. I could still cope with it, honestly, but I got a great deal on my CPU; if I'd paid through the nose for a 14900K, I wouldn't be too thrilled, even if it's just a minor scaling back of performance that's already very abundant.
 
I'm starting to get the feeling that buyers of high-end CPUs or GPUs need to be prepared for disaster these days. RTX 3090s burning down with that Amazon game I can't remember, cooler issues with AMD-made 7900 XTXes, ASUS motherboards frying X3D CPUs, and now this malarkey with i9 stability... This is what you get in a world where every single soul and every company wants to be 1% ahead in everything all the time, I guess.
 
That whole new PSU connector fiasco as well. One of my M.2 drives also mysteriously cooked itself, and the label looked melted. Either that M.2 heat-spreader label was conductive and shorted itself, or something else went wrong, perhaps to do with the PCIe 5.0 slot, though my older Gen 3 M.2 in that slot has been just fine. I think the label was dodgy when I bought it and I installed it anyway; it worked fine initially, then fried after a month or two of heating and cooling cycles. I could've sworn it looked a bit funky and I almost returned it immediately, but I decided to just try it anyway. I certainly won't be taking that chance again. It could be worse, though; at least it wasn't a catastrophic PSU failure.
 
Question for you all: so the 14th gen Core i9 is not affected by this mess-up?
It's in the title:

Raptor Lake Stability Issues​

14th gen is Raptor Lake (as well as 13th gen).
 
Yes, but what I read on other sites is that it's the 13th and 14th Gen i5 and i7. Nowhere do they bring up or mention the i9. Sorry, but I am old.
 

Wendell has an interesting analysis using telemetry data from two game studios and feedback from data center companies and system integrators. We see an increased number of failures for 13900K and 14900K systems not only on the consumer side but also on the server side, where they're often used on W680 boards at stock settings for hosting game servers that rely on high single-core performance.

It reaches the point where game server hosting companies will charge you an extra $1,000 in support fees if you opt for Intel.
 
Yes, but what I read on other sites is that it's the 13th and 14th Gen i5 and i7. Nowhere do they bring up or mention the i9. Sorry, but I am old.
Other way round, TVB is on i9 chips only.
 
Other way round, TVB is on i9 chips only.

They may have meant that those have stability issues too, which TVB would probably just exacerbate further on the i9. Right now we haven't gotten a clear indication of what the root of the problem is. One thing I've speculated is that the socket bending issue may be part of it. That would be a much bigger factor for data center service providers buying pre-builts, since they wouldn't normally install anti-bending brackets. In fact, Wendell could probably run a cross-comparison between what data center service providers are seeing and what Steam or a large game studio sees.

In the gaming case, at least, I would think you'd see a stronger likelihood of some users running anti-bending brackets than in the data center. So if, digging further, the incidence of problems is actually higher on the data center side, it might be a good indicator that socket bending is an underlying culprit. Especially since gamers are more likely to overclock and push memory clocks higher, you'd expect their instability to be inherently worse by a decent amount on that fact alone.

On the other hand, if the data is the opposite, with much higher failure rates in the gaming telemetry, it might point more towards memory and/or the ring bus, perhaps even the cache and the IMC in general, pushed far beyond Intel's official memory support, which most gamers are pretty guilty of doing.
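If someone did get both data sets, the comparison itself is simple; here's a minimal sketch of a two-proportion z-test with entirely hypothetical failure counts:

```python
# Compare failure rates between two fleets (e.g., data center fleet vs.
# gamer telemetry). All counts below are invented placeholders.
from math import sqrt

def two_proportion_z(fail_a, total_a, fail_b, total_b):
    """Difference in failure rates between fleets A and B, plus z-score."""
    p_a, p_b = fail_a / total_a, fail_b / total_b
    pooled = (fail_a + fail_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return p_a - p_b, (p_a - p_b) / se

diff, z = two_proportion_z(120, 2000,   # fleet A: no bending brackets
                           45, 1500)    # fleet B: brackets more common
print(f"rate difference = {diff:.1%}, z = {z:.2f}")  # ~3.0%, z ≈ 4.1
```

A large z-score would at least say the two populations really do fail at different rates; it wouldn't by itself say whether brackets, overclocking, or something else is the cause.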

The fact that we still don't have a legitimate answer is crazy, though. This issue has impacted people since 13th gen; how have they not pinpointed a cause by now? It's understandable that some finger-pointing at motherboard makers over questionable BIOS decisions has happened, and they fully deserve that criticism in a situation like this; it's a wake-up call not to do stupid, questionable things with default settings. Anyway, it is what it is, but it's insane that we still have no answers, even though we've got some insight into the widespread severity of the problem.
 