
Intel CEO Confirms SMT To Return to Future CPUs

Status
Not open for further replies.
What I don't get is that when Lion Cove came out, Intel said hyperthreading was out for client use but would still be present in servers. So when Intel says that future server CPUs will have hyperthreading, what is the news?
I thought they said it was going to be coming back to client in the future also. They just had other, better, things to spend the transistor budget on at this point in time.
 
Removing AVX to add in more cores is like saying you want to remove your car's turbo so you can fit a larger intake. You'd be cutting yourself off at the knees; AVX makes a huge difference in any workload built for it.

You can even test this yourself: if you have a Coffee Lake machine, go find a Coffee Lake Pentium. They did not have AVX. Then compare it to a Core i-whatever with its core count and clock rate adjusted to match. The performance difference is shocking. Even back in high school it was noticeable, and that was 14+ years ago on Ivy Bridge.
Well, I've seen 13700Ks beat the 9700X easily in workloads that are AVX accelerated. If we assume that in general 8 P-cores are equal to the 8 Zen 5 cores, then it turns out that 8 E-cores are faster than the AVX acceleration, right?
 
Meanwhile, AMD is about to release CPUs with 4-way SMT enabled.

 
No, that is not correct current information. That could be right in the time when CPUs had 4 cores. That is long gone.

"AMD engineers estimate that implementing SMT takes less than 5% of the core area in the latest AMD “Zen 4” and “Zen 5” cores. This includes all the necessary logic to allow two threads to share the core’s resources."

<5% is correct to this day even with AMD's more advanced SMT implementation as AMD themselves have stated: https://www.amd.com/content/dam/amd...hite-papers/amd-epyc-smt-technology-brief.pdf

Here is the Intel source (wikipedia): https://en.wikipedia.org/wiki/Hyper-threading

Please share your source that states otherwise. Surely you didn't rush to call someone else wrong without having any basis for such a statement.

For example, the 14900K has 8 P-cores, 16 E-cores, and then the second threads of the 8 P-cores. It does not want to use the second threads of the P-cores, because they perform the worst. For any load with 24 or fewer threads, HT is not utilised at all, because it would DECREASE the performance of the CPU. HT only helps in the specific case of extremely multithreaded applications. Normal consumers do not use such applications, and if they do, 24 threads on 24 cores offer very high performance for any normal person.

NORMAL PEOPLE DO NOT NEED HYPERTHREADING ANYMORE, PERIOD.
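
The placement order claimed in the quoted post can be written out as a small sketch. This models only the ordering the post describes (first threads of the P-cores, then the E-cores, then the P-cores' second threads). It is not the actual behaviour of any OS scheduler or of Intel Thread Director, and the function names and slot labels are hypothetical.

```python
def placement_order(p_cores=8, e_cores=16):
    """Slot order as described in the post (an assumption, not a real scheduler)."""
    order = [f"P{i}.t0" for i in range(p_cores)]   # first thread of each P-core
    order += [f"E{i}" for i in range(e_cores)]     # E-cores have one thread each
    order += [f"P{i}.t1" for i in range(p_cores)]  # HT siblings come last
    return order


def place(n_threads, p_cores=8, e_cores=16):
    """Return the slots that n_threads busy threads would occupy under this ordering."""
    return placement_order(p_cores, e_cores)[:n_threads]


def uses_ht(n_threads, p_cores=8, e_cores=16):
    """True if any thread lands on a P-core's second hardware thread."""
    return any(slot.endswith(".t1") for slot in place(n_threads, p_cores, e_cores))


if __name__ == "__main__":
    for n in (8, 24, 25, 32):
        print(n, "threads -> HT siblings used:", uses_ht(n))
```

Under this ordering, the second hardware threads only come into play from the 25th busy thread onward, which is the point the post is making about a 14900K-style layout.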

I don't see how this is an example of the die space usage of HT. You seem to have jumped from one topic to another, or perhaps you are replying to the wrong person or quoting the wrong thing (as you only quoted my die space figure). Mind you, this isn't an example anyway; it's you explaining how you think the 14900K works.

I should point out, a CPU designed around replacing HT with e-cores isn't a good example that we don't need HT / SMT. That's what it was designed for.

Clearly it hasn't been an effective approach either, as Intel's been getting its rear whooped in server, and regular customers aren't attracted to the latency penalty of the e-cores and the scheduling issues they bring. It'd be an entirely different conversation if the design were actually successful, but it isn't.

Sure regular customers could do just fine without hyper-threading if they choose an Intel product but the problem is they aren't comparing well to AMD options that do have it. That Intel is switching back says all you need to know about that. It's akin to fabricating a nail clipper that does something other than press two wedges together just to prove that it could be done other ways.

Could there ever be a day when HT is no longer needed for customers? Maybe, although if HT is a better approach for Desktop / Laptop than having big and little cores, I don't see why you wouldn't just give the customer fewer cores with HT / SMT.
 
regular customers aren't attracted to the latency penalty of the e-cores and the scheduling issues they bring.
What the actual hell are you talking about? You do realize that among the fastest non-3D chips in gaming (the foremost latency-sensitive workload), CPUs with E-cores actually dominate? Actually, the fastest gaming chip, not counting 3D cache chips, is the CPU with the most E-cores: the 14900K. What the heck, man? Seriously.... :roll:

EG1. I just realized that if you actually exclude 3d chips there is not a single non E core cpu in the top 5 gaming chips. That's coming from TPU's 720p benchmark.
 
I'm afraid you've got it a bit confused... HT is present on Intel processors up to the 14th generation. It's not present on the 15th (ULTRA)...
And this is being shown on the front page. Yikes. Perhaps he was thinking of... Meteor Lake? Or maybe he was considering RL and the RL refresh the same gen, which is kinda fair I guess, since they are the same silicon. But if we are to do that, then we might have to consider removing some other Intel gens too, and that would get messy and really confusing.

But yeah, it's probably just a mistake.


Actually, the author probably meant to say this. In the 10th and 11th gen processors, all cores had SMT/HT. Then from the 12th thru 14th gens, only the P cores had it and not the E cores. Then no cores had it afterwards. Hence the "shift away" from SMT/HT.

The only problem is, historically speaking, the 10th and 11th gens having SMT/HT throughout the stack was in fact the outlier. In 9th gen and prior, only the i7s got it, along with dual core i3/i5s (the office CPUs). Gaming i5s were SOL until the 10th gen, but at least it was nice to have full cores. Perhaps the author saw Intel's new approach as only giving SMT/HT to some cores on the CPU from 12th-14th gen as opposed to all/none in prior gens as the "shift away".



Also, to the author, the following line
It's unclear whether the change to reintroduce SMT, aka as "Hyper-Threading"

the "aka" means "also known as" so the "as" after the acronym is redundant...
 
What the actual hell are you talking about? You do realize that among the fastest non-3D chips in gaming (the foremost latency-sensitive workload), CPUs with E-cores actually dominate? Actually, the fastest gaming chip, not counting 3D cache chips, is the CPU with the most E-cores: the 14900K. What the heck, man? Seriously.... :roll:

EG1. I just realized that if you actually exclude 3d chips there is not a single non E core cpu in the top 5 gaming chips. That's coming from TPU's 720p benchmark.

That is because of the ring bus and Intel's micro-architecture and software optimisation, and mostly because AMD's CPUs suck big time - very slow DDR access, high latencies, etc.
 
That is because of the ring bus and Intel's micro-architecture and software optimisation, and mostly because AMD's CPUs suck big time - very slow DDR access, high latencies, etc.
Which goes against the argument that "ecores have high latency yadayada".
 
What the actual hell are you talking about? You do realize that among the fastest non-3D chips in gaming (the foremost latency-sensitive workload), CPUs with E-cores actually dominate? Actually, the fastest gaming chip, not counting 3D cache chips, is the CPU with the most E-cores: the 14900K. What the heck, man? Seriously.... :roll:

EG1. I just realized that if you actually exclude 3d chips there is not a single non E core cpu in the top 5 gaming chips. That's coming from TPU's 720p benchmark.
E cores are nice, but TPU's charts disagree with your statement. TPU's last CPU review was of the 9950X3D. Take a look at the 720p gaming charts.

[Attachment 1753477318203.png: TPU 720p gaming chart from the 9950X3D review]


As for SMT, a lot of the commenters are conflating it with coarse-grained multithreading. David Kanter's deep dive into Poulson, the last Itanium microarchitecture, has a nice diagram showing the difference between various forms of hardware multithreading. A processor implementing SMT doesn't rely on L3 misses or other stalls to issue instructions from the other thread; both threads share the CPU in all cycles.


[Attachment 1753475411737.png: diagram comparing forms of hardware multithreading]
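
A toy model can make the distinction above concrete. This is a minimal sketch, not a description of any real core: it assumes a fixed issue width, a fixed number of ready instructions per thread per cycle (0 meaning the thread is stalled), and a zero-cost thread switch. All names and numbers are illustrative.

```python
def coarse_grained(ready, width):
    """One thread owns the issue slots; switch only when the active thread stalls."""
    active, issued = 0, 0
    for cycle in range(len(ready[0])):
        if ready[active][cycle] == 0:   # active thread stalled, switch to the other
            active = 1 - active
        issued += min(width, ready[active][cycle])
    return issued


def smt(ready, width):
    """Both threads may fill issue slots in the same cycle, up to the issue width."""
    issued = 0
    for per_cycle in zip(*ready):
        issued += min(width, sum(per_cycle))
    return issued


if __name__ == "__main__":
    # Ready instructions per cycle for two hypothetical threads (0 = stalled).
    ready = [
        [2, 2, 0, 0, 2, 2],
        [1, 3, 3, 1, 0, 2],
    ]
    print("coarse-grained:", coarse_grained(ready, width=4))
    print("SMT:           ", smt(ready, width=4))
```

In this model the coarse-grained core only touches the second thread when the first one stalls, whereas the SMT core fills otherwise idle slots from the second thread in every cycle, stall or not.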
 
E cores are nice, but TPU's charts disagree with your statement. TPU's last CPU review was of the 9950X3D. Take a look at the 720p gaming charts.

View attachment 409315
I was using the chart from the original 9950x review

[Attachment relative-performance-games-1280-720.png: TPU 720p relative gaming performance chart from the original 9950X review]

But the point still stands, CPUs with ecores are sitting at the top while cpus without them are sitting below - the 9950x is the only exception in your chart.
 
What I don't get is that when Lion Cove came out, Intel said hyperthreading was out for client use but would still be present in servers. So when Intel says that future server CPUs will have hyperthreading, what is the news?
There were a lot of rumblings that, due to the SMT vulnerabilities that were showing up almost weekly at one point, Intel was going to remove it from their whole product stack, starting with Client and then rolling into Server a generation or two after.

As for SMT, a lot of the commenters are conflating it with coarse-grained multithreading. David Kanter's deep dive into Poulson, the last Itanium microarchitecture, has a nice diagram showing the difference between various forms of hardware multithreading. A processor implementing SMT doesn't rely on L3 misses or other stalls to issue instructions from the other thread; both threads share the CPU in all cycles.
And I believe every Intel CPU since Itanium has used SMT, and so has AMD since Zen. Bulldozer from AMD used a different design, a pseudo-CMP that doubled the "commonly" used parts and shared the core's uncommonly used parts. A gamble that blew up massively for AMD from both a performance AND a marketing perspective.

Intel Atom used a really watered down version of SMT back in the day to get to even basic performance levels but it was a cursed design in so many ways.
 
I was using the chart from the original 9950x review

[Attachment relative-performance-games-1280-720.png: TPU 720p relative gaming performance chart from the original 9950X review]

But the point still stands, CPUs with ecores are sitting at the top while cpus without them are sitting below - the 9950x is the only exception in your chart.
I don't think there's sufficient information in the newest chart for that rather strong claim. In the 285K review, the 9950X was tied with the 9700X in the 720p gaming charts.
 
No, that is not correct current information. That could be right in the time when CPUs had 4 cores. That is long gone.

For example, the 14900K has 8 P-cores, 16 E-cores, and then the second threads of the 8 P-cores. It does not want to use the second threads of the P-cores, because they perform the worst. For any load with 24 or fewer threads, HT is not utilised at all, because it would DECREASE the performance of the CPU. HT only helps in the specific case of extremely multithreaded applications. Normal consumers do not use such applications, and if they do, 24 threads on 24 cores offer very high performance for any normal person.

NORMAL PEOPLE DO NOT NEED HYPERTHREADING ANYMORE, PERIOD.
This!

************
As for SMT, I wouldn't be surprised if in the next generation AMD also removes SMT (for consumer market*).
 
No, that is not correct current information. That could be right in the time when CPUs had 4 cores. That is long gone.

For example, the 14900K has 8 P-cores, 16 E-cores, and then the second threads of the 8 P-cores. It does not want to use the second threads of the P-cores, because they perform the worst. For any load with 24 or fewer threads, HT is not utilised at all, because it would DECREASE the performance of the CPU. HT only helps in the specific case of extremely multithreaded applications. Normal consumers do not use such applications, and if they do, 24 threads on 24 cores offer very high performance for any normal person.

NORMAL PEOPLE DO NOT NEED HYPERTHREADING ANYMORE, PERIOD.
Depends entirely on what your core strategy is, as numerous people tried to point out. If your baseline is a 24-thread CPU with E-cores, then HT won't be adding much other than an increased die size for situational performance; but if you base the idea of HT/SMT on a cache-heavy, lower-core-count CPU, like the X3Ds that run most efficiently on a single CCX, you really do want HT, and you end up with a much leaner, smaller and still performant die that still has not 24 but 16-thread capability, with full instruction set access for them all. Whether that is a better idea or not is use-case dependent only to a small degree; on MSDT, though, you want great all-rounders at a low price, IMHO.

And as we can see from the perf/W of those CPUs relative to their absolute performance... that has already proven to be a winning strategy for a vast number of loads, especially sequential loads, gaming loads and most other tasks that are not heavily parallelized. A consumer CPU that is cost-effective AND runs at low power while doing so is definitely helped by HT. Even Intel has proven this with decades' worth of highly performant quad-cores in the desktop and laptop segments. That reality has not changed and will never change. Quads are still performant CPUs. Core count is like RAM. You need enough, and having more won't do jack shit other than create a nice marketing line and inflate die size/cost.
 
Actually, the author probably meant to say this. In the 10th and 11th gen processors, all cores had SMT/HT. Then from the 12th thru 14th gens, only the P cores had it and not the E cores. Then no cores had it afterwards. Hence the "shift away" from SMT/HT.

The only problem is, historically speaking, the 10th and 11th gens having SMT/HT throughout the stack was in fact the outlier. In 9th gen and prior, only the i7s got it, along with dual core i3/i5s (the office CPUs). Gaming i5s were SOL until the 10th gen, but at least it was nice to have full cores. Perhaps the author saw Intel's new approach as only giving SMT/HT to some cores on the CPU from 12th-14th gen as opposed to all/none in prior gens as the "shift away".
It was just a mistake. I PMed him and it's now fixed.

Also, to the author, the following line
It's unclear whether the change to reintroduce SMT, aka as "Hyper-Threading"

the "aka" means "also known as" so the "as" after the acronym is redundant...

Good point.
 
What the actual hell are you talking about? You do realize that among the fastest non-3D chips in gaming (the foremost latency-sensitive workload), CPUs with E-cores actually dominate? Actually, the fastest gaming chip, not counting 3D cache chips, is the CPU with the most E-cores: the 14900K. What the heck, man? Seriously.... :roll:

EG1. I just realized that if you actually exclude 3d chips there is not a single non E core cpu in the top 5 gaming chips. That's coming from TPU's 720p benchmark.
The question is though what the point is of running your games at 300 fps. The 2% of consumers that want that are not gonna save Intel's ass ;)
 
And as we can see from the perf/W of those CPUs relative to their absolute performance... that has already proven to be a winning strategy for a vast number of loads, especially sequential loads, gaming loads and most other tasks that are not heavily parallelized. A consumer CPU that is cost-effective AND runs at low power while doing so is definitely helped by HT. Even Intel has proven this with decades' worth of highly performant quad-cores in the desktop and laptop segments. That reality has not changed and will never change. Quads are still performant CPUs. Core count is like RAM. You need enough, and having more won't do jack shit other than create a nice marketing line and inflate die size/cost.
Yes, HT helps perf/W, but it's important for server or mobile segments. On desktops 5-10 watts is next to nothing. Transferring data from one core to another also needs power and adds latency, and we have more and more cores/threads and that adds more complexity and software that needs fast data suffers.

So HT/SMT needs to get out of the consumer segment, and we see that happening.
 
Yes, HT helps perf/W, but it's important for server or mobile segments. On desktops 5-10 watts is next to nothing. Transferring data from one core to another also needs power and adds latency, and we have more and more cores/threads and that adds more complexity and software that needs fast data suffers.

So HT/SMT needs to get out of the consumer segment, and we see that happening.
Intel has a major power draw issue they combat with E-cores, though. The main reason HT was dropped was the vulnerability issues. Not latency and not a core count deficit. Consumers do not generally run a lot of stuff concurrently that lower-core-count CPUs have trouble handling.
 
Intel has a major power draw issue they combat with E-cores, though. The main reason HT was dropped was the vulnerability issues. Not latency and not a core count deficit.
Yes, and with Thread Director it shows that they save more power by simply not moving data from one core to another.
 
Yes, and with Thread Director it shows that they save more power by simply not moving data from one core to another.
Which is again thinking from the baseline of actually pushing 24 real cores forward. See previous posts. In the end it's an economic choice, not a performance issue.
 
Which is again thinking from the baseline of actually pushing 24 real cores forward. See previous posts.
HT is just an old solution for old problems.
 
HT is just an old solution for old problems.
Because Intel is problem-free now without it? You are missing a key aspect here. 'It's the economy, stupid'...
 
Intel has a major power draw issue they combat with E cores though.
No, they really don't have a major power draw issue. Not even a minor one. I don't get this misinformation stuff, really. For the past decade or more, Intel has consistently had the CPUs with the lowest power draw on the market: it's called the T lineup.

You, among other people, are taking K and KS CPUs that are MEANT to run balls to the wall and then somehow concluding that Intel has a power draw issue, lol. Which isn't even true for K chips: the 285K draws the same amount of power as the 9950X in MT workloads and much less in non-MT workloads...
 