Hyper-threading (SMT)

Shrek · 2025-07-28T17:51:51+0100

Little confused about Hyper-threading (SMT)

If I understand it, Intel introduced hyper-threading, discontinued it and are now bringing it back.

Macro Device · 2025-07-28T17:56:11+0100

Yes?

They found a major security problem in their former HT implementation, so they shut it down until they figure out how to engineer an immune HT version. Probably a more efficient one, too.

dragontamer5788 · 2025-07-28T19:03:15+0100

Simultaneous multithreading (SMT) existed before Hyperthreads. Hyperthreads is Intel's version of SMT.

So the real question is: why the flip-flop? Well, before E-cores, Intel thought it was more efficient to make big cores and then split them up. Note that 'bigger cores', such as those in Apple's M4 chip, can perform 8 instructions per clock tick. Bigger cores allow more parallelism, so it's a huge architectural question of 'how big' vs 'how small'. (Are 8 cores better than 4x bigger cores??)

Hyperthreads and SMT let a big core work well in both cases. When there's only 1 thread per core, it acts like a big core. When there are 2 threads per core, the cores detect that and automatically act like two smaller cores!! Best of both worlds.
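
A rough way to picture the "one big core or two smaller ones" idea is the toy model below. It is only a sketch, not how any real core divides its resources: it assumes the core can issue up to `issue_width` instructions per cycle and that each thread can only offer a limited number of independent instructions per cycle. The function name and all numbers are made up for illustration.

```python
def instructions_per_cycle(issue_width, ilp_per_thread):
    """Toy model of a wide core shared by SMT threads.

    issue_width:    most instructions the core can issue in one cycle (assumed).
    ilp_per_thread: for each hardware thread, how many independent
                    instructions it can offer per cycle (assumed).
    """
    # The core is limited either by its own width or by what the threads can feed it.
    return min(issue_width, sum(ilp_per_thread))


# One thread that can only keep 3 slots busy on an 8-wide core.
print(instructions_per_cycle(8, [3]))      # 3
# Two such threads sharing the same core fill more of it.
print(instructions_per_cycle(8, [3, 3]))   # 6
# One thread with lots of parallelism gets the whole big core.
print(instructions_per_cycle(8, [8]))      # 8
```

In this simplified picture the second thread only helps when the first one leaves issue slots empty, which is the "acts like a big core or like two smaller cores" behaviour described above.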

But... then E-cores were made, and Intel can fit 4x E-cores in the space of 1x P-core (which only handled 2 threads). So Intel concluded that E-cores were better, because 4x threads (even if each thread is slower) is better than 2x Hyperthreads on one core.

Now we are back to 2 threads per P-core. Maybe Intel discovered something new that makes Hyperthreads on P-cores a better option vs their E-core technology. We will have to see the new arguments, but with all the firings going on at Intel I doubt we will get much info.

agent_x007 · 2025-07-28T20:08:51+0100

HT was introduced in Pentium 4 CPUs to get the most out of their back-end (avoiding "wasted" cycles due to pipeline stalls: with two threads, both have to stall before the execution units go idle).
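
The stall-hiding effect can be sketched with a tiny simulation. This is a toy under stated assumptions, not a model of the Pentium 4: one execution unit, at most one instruction issued per cycle, and a fixed 3-cycle stall after every "miss" instruction. The `simulate` function and the `'x'`/`'m'` encoding are hypothetical names chosen for this example.

```python
def simulate(threads, stall_cycles=3):
    """Toy single-execution-unit core shared by one or more hardware threads.

    Each thread is a string of instructions:
      'x' = executes in one cycle
      'm' = executes in one cycle, then stalls its own thread for
            `stall_cycles` cycles (think: cache miss)
    Returns (total_cycles, busy_cycles).
    """
    pcs = [0] * len(threads)        # next instruction index per thread
    ready_at = [0] * len(threads)   # first cycle each thread may issue again
    cycle = busy = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        # Fixed priority: the first ready thread gets the execution unit.
        for i, t in enumerate(threads):
            if pcs[i] < len(t) and ready_at[i] <= cycle:
                op = t[pcs[i]]
                pcs[i] += 1
                busy += 1
                if op == "m":
                    ready_at[i] = cycle + 1 + stall_cycles
                break  # only one instruction issues per cycle
        cycle += 1
    return cycle, busy


work = "xmxmxmxm"
print(simulate([work + work]))   # both jobs back to back on one thread: (37, 16)
print(simulate([work, work]))    # same jobs as two SMT threads:        (19, 16)
```

With these made-up numbers the execution unit is busy 16 of 37 cycles when the work runs on one thread, and 16 of 19 cycles when a second thread can fill the stall gaps. Real gains depend heavily on the workload and are usually much smaller.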

There was speculation about making a reverse hyperthreading design at one point, but this was never realized (due to the sheer pointlessness of it - why make a "fusion of smaller cores" and confuse ALL programs and OSes running on them, when you can just make a bigger core and make it more efficient, avoiding all software issues along the way).
Note: A "wider" CPU (more execution units, memory fetch etc.) does NOT = faster at executing x86 code.
It CAN be faster, BUT only if the actual program code allows it (not all programs scale with core/thread count, and not all programs scale with wider cores - ILP (instruction-level parallelism) vs. TLP (thread-level parallelism)).
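
One standard way to put numbers on "not all programs scale with core/thread count" is Amdahl's law, brought in here purely as an illustration and not something the post itself cites. The sketch assumes a program where a fixed fraction of the work can be spread over threads and the rest stays serial; `parallel_fraction` and `n_threads` are illustrative parameter names.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Ideal speedup from Amdahl's law.

    parallel_fraction: share of the runtime that can be split across threads (0..1).
    n_threads:         number of threads the parallel part is split over.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


# Speedup at 2, 8 and 32 threads for programs that are 50%, 90% and 99% parallel.
for p in (0.5, 0.9, 0.99):
    print(p, [round(amdahl_speedup(p, n), 2) for n in (2, 8, 32)])
```

A program that is only half parallel can never get more than 2x faster in this model, no matter how many threads the CPU exposes, which is the software limit the post is pointing at.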

In the end: x86 hardware is always limited by the software you run on it at this point.

Intel "dropped" HT due to their own hubris.
My opinion on it is this:
Windows fails at keeping certain workloads on the "preferred" threads/cores Intel designated, so Intel makes fewer threads for the OS to choose from (easier to select the "correct" core to run a certain program on), and each thread is a lot closer to the next one capability-wise.
It's an attempt at trying to "fix" a problem they themselves created with the P-core/E-core split (and the lopsided E-core/P-core ratio).
Personally, I think the P-core/E-core strategy on desktop was a dead end from the beginning, but Intel needed to make "a lot corez" CPUs faster (time scale wise), due to Ryzen - so they went with this "brilliant idea".

They want to bring it back, because it's really f*** good at extracting performance from each core without massive transistor and power budget increases.

A Computer Guy · 2025-07-28T20:23:19+0100

Shrek said:
Little confused about Hyper-threading

If I understand it, Intel introduced hyper-threading, discontinued it and are now bringing it back.

I can explain in simple terms... Hyperthreading is like a two person bicycle going down a narrow side walk while each rider is trying to deliver ice cream cones from opposite sides at 40km/h while one side faces the street putting a patron at risk of getting struck by a car while on the other side a patron might step in some dog poo somebody forgot to pick up on the neighbors lawn.

In the new HT I'm not sure if AI is replacing the riders or the patrons but I guess we will have to wait and see.

dragontamer5788 · 2025-07-28T20:54:11+0100

A Computer Guy said:
I can explain in simple terms... Hyperthreading is like a two person bicycle going down a narrow side walk while each rider is trying to deliver ice cream cones from opposite sides at 40km/h while one side faces the street putting a patron at risk of getting struck by a car while on the other side a patron might step in some dog poo somebody forgot to pick up on the neighbors lawn.

In the new HT I'm not sure if AI is replacing the riders or the patrons but I guess we will have to wait and see.

Wtf is this example lol

Hyperthreading (and SMT) is the 2nd lane at the McDonalds drive through. The 2nd lane doesn't get you more cooks or do anything faster, it mostly just gives more work to the cashier (aka: the frontend decoder for a chip).

This is worthwhile because the cashier / frontend decoder is capable of handling more traffic than what one drive through lane can fill. So the decoder is working on a 2nd, parallel lane (which is a 'pretend core' in Hyperthreading / SMT land).

That's it. Hyperthreads/SMT is just noticing that you have plenty of cooks (aka: execution pipelines), and that adding cashiers (ie frontend decoders) is inefficient. It's better (ie: cheaper) to double-work your cashiers than to actually get more cashiers.

OutOfOrder execution is simply serving the guy who asked for a #1 BigMac first, despite this guy coming afterwards. There's a big request ahead of him (ie: a kids' party asking for 100 Chicken Nuggets), and your frontend has decided the slow order shouldn't hold up the line, so the quick one gets served OutOfOrder.

Spectre and Meltdown were critical security bugs relating to speculative OutOfOrder execution (ie: cooking Big Macs ahead of time before anyone ordered them). As it turns out, looking at the BigMacs is enough to steal people's AES keys because fucking math.

I'd describe Spectre as a guy who visits McDonalds 1000 times with a stopwatch, counting how quickly the BigMac order was served. When BigMacs are slower, he knows no one else is ordering BigMacs. Furthermore, he makes a bunch of BigMac orders based off of other bits of info he found (ex: Make a BigMac if this other memory address is an odd number). It takes thousands, or millions, of these stopwatch / timed events to obtain data, but it turns out that 4GHz means 4 billion cycles a second, so a million checks can happen in a fraction of a second. So it's one of those things that's easier to pull off in practice than it looks.
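
To put rough numbers on the stopwatch argument, here is a back-of-the-envelope sketch. The cycles-per-check figures are assumptions picked for illustration (a real timing probe costs a varying number of cycles), and `seconds_for_checks` is a made-up helper name.

```python
def seconds_for_checks(n_checks, cycles_per_check, clock_hz=4e9):
    """Wall-clock time for n_checks timing probes at a given clock speed.

    cycles_per_check is an assumed cost per probe, not a measured value.
    """
    return n_checks * cycles_per_check / clock_hz


# A million probes at an (unrealistically cheap) 4 cycles each: about 0.001 s.
print(seconds_for_checks(1_000_000, 4))
# A million probes at an assumed 400 cycles each: about 0.1 s.
print(seconds_for_checks(1_000_000, 400))
```

Either way, a million timed events fit comfortably inside a second at 4GHz, which is why the attack is more practical than it sounds.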

So now we have a bunch of Internet people who don't know what these performance optimizations are and are pretending unrelated things are the problem, or are worried that the 2nd lane at the drive through (aka Hyperthreading) is related to that mysterious security vulnerability about cooking BigMacs ahead of time.

I mean, maybe. Computers are complicated. But a lot of these things are very unrelated to each other, and it's best to leave the speculation of security problems to the professionals plz.

anonuser57 · 2025-07-28T20:57:54+0100

Intel technically never got rid of HT. It's still in their server processors, just not in their latest desktop and laptop processors. Because HT is just a marketing term, its inclusion doesn't tell you much beyond running two threads on the same core at once. Instructions from 2 threads can be interleaved, but you can also have micro-ops from instructions from two different threads executing on the same cycle. It is up to Intel how far they go.

dirtyferret · 2025-07-28T21:05:59+0100

A Computer Guy said:
Hyperthreading is like a two person bicycle going down a narrow side walk while each rider is trying to deliver ice cream cones

dragontamer5788 said:
Hyperthreading (and SMT) is the 2nd lane at the McDonalds drive through

I thought it was more like a three legged dog with worms in a triathlon that featured a unicycle.

A Computer Guy · 2025-07-28T21:08:06+0100

dirtyferret said:
I thought it was more like a three legged dog with worms in a triathlon that featured a unicycle.

You must be confusing that with whatever the FX CPUs were doing. :laugh:

Shrek · 2025-07-29T04:36:21+0100

Missed this article from a few days back

Intel CEO Confirms SMT To Return to Future CPUs

Intel today announced its Q2 results, and it was a bit of a mixed bag, with the earnings largely down and projections showing little overall growth for the foreseeable future. Ahead of this announcement, though, Intel's CEO, Lip Bu Tan, sent an internal memorandum to employees, which has since...

www.techpowerup.com
