
Hyperthreading in i7 vs P4?

Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
The main reason Intel insists on HT is that it takes only a very small chunk of physical die and delivers a boost ranging from 20-50% in various applications, so very little effort buys some major performance gains. Plus it uses more of the existing CPU instead of leaving execution resources half empty, so to speak.
 
Joined
May 3, 2014
Messages
965 (0.26/day)
System Name Sham Pc
Processor i5-2500k @ 4.33
Motherboard INTEL DZ77SL 50K
Cooling 2 bay res. "2L of fluid in loop" 1x480 2x360
Memory 16gb 4x4 kingstone 1600 hyper x fury black
Video Card(s) hfa2 gtx 780 @ 1306/1768 (xspc bloc)
Storage 1tb wd red 120gb kingston on the way os, 1.5Tb wd black, 3tb random WD rebrand
Display(s) cibox something or other 23" 1080p " 23 inch downstairs. 52 inch plasma downstairs 15" tft kitchen
Case 900D
Audio Device(s) on board
Power Supply xion gaming seriese 1000W (non modular) 80+ bronze
Software windows 10 pro x64
I just think of HT as threads, not cores, lol. I'd go off on a rant if I ever thought about them being cores.
It's like how I have to call AMD's "modules". To me a core is a core, a module is a core with extra bits that let it do 2x as many of some operations in the same clock, and HT is just squeezing an extra thread into the core so it can use whatever execution resources would otherwise sit idle in that clock. I'd prefer a Pentium D dual core to a Pentium 4 with Hyper-Threading. I'd prefer an i5 quad to an i3 with Hyper-Threading. And I'd prefer a 6-core i7 with HT turned off to a 4-core i7 with HT turned on.
This is because I count the actual number of cores, not the number of threads it can sometimes run in a perfect situation. Whether that's because of the lack of FP and other components in a module, or because it's just squeezing work in when it's able to thanks to scheduling, it's not really the same as having actual cores.
Still, like I said, I can't call them cores because they just aren't. It's like buying 2 TVs on eBay, and one of them is fine but the other one will only come on whilst the adverts are on. I'd be pissed.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
The main reason Intel insists on HT is that it takes only a very small chunk of physical die and delivers a boost ranging from 20-50% in various applications, so very little effort buys some major performance gains. Plus it uses more of the existing CPU instead of leaving execution resources half empty, so to speak.

Where do you get those numbers from? Even in highly parallel workloads like 7zip, I've seen this to not be the case, 30% is about the max you're going to get, even now. I've done a little bit of testing by disabling/enabling cores and HT to get some numbers for 7zip. What's even more interesting is how having HT disabled with 4 cores enabled makes 4-thread 7zip workloads perform better than with HT enabled, probably because of how the threads are being scheduled. In this case, enabling HT on 4c/8t only provides a 20% boost over 4c/4t in performance when running 7zip with 8 threads. HOWEVER, notice how 4c/8t is almost exactly twice as fast as 2c/4t, which would indicate that more cores means better scaling. If you were to do the same test with a 4-module CPU, you would see better scaling than with HT. HT uses extra resources when they're available, whereas modules have dedicated extra resources for running a second thread. I think you can guess which results in better integer performance scaling. :p
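
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of a harness around 7-Zip's built-in benchmark (`7z b`, with `-mmtN` setting the thread count). It assumes a `7z` binary is on the PATH, the function names are made up for illustration, and cores or HT still have to be toggled in the BIOS between runs. It is not the script behind the numbers above.

```python
import shutil
import subprocess


def benchmark_command(threads, exe="7z"):
    """Build the command line for 7-Zip's built-in benchmark with a fixed thread count."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return [exe, "b", f"-mmt{threads}"]


def run_benchmark(threads, exe="7z"):
    """Run the benchmark and return its raw text output, or None if 7z is not installed."""
    if shutil.which(exe) is None:
        return None
    result = subprocess.run(
        benchmark_command(threads, exe),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show what would be run on a 4c/8t part: one thread per core, then one per logical CPU.
for n in (4, 8):
    print(" ".join(benchmark_command(n)))
```

The output layout of `7z b` differs between versions, so the sketch returns the raw text rather than trying to parse a rating out of it.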

HT isn't what makes your i7 fast, man. ;)
 
Joined
Apr 30, 2006
Messages
1,181 (0.18/day)
Processor 7900
Motherboard Rampage Apex
Cooling H115i
Memory 64GB TridentZ 3200 14-14-14-34-1T
Video Card(s) Fury X
Case Corsair 740
Audio Device(s) 8ch LPCM via HDMI to Yamaha Z7 Receiver
Power Supply Corsair AX860
Mouse G903
Keyboard G810
Software 8.1 x64
Even in highly parallel workloads like 7zip, I've seen this to not be the case, 30% is about the max you're going to get, even now.

Hyperthreading isn't for parallel workloads in which most of the threads are processing similar instructions that use the same execution units in each core.

The purpose of hyperthreading is to minimize idle execution units when processing different threads. Like, for example, playing Call of Duty while recording it with Fraps and also listening to music in the background. Or watching some Netflix while compressing a backup with 7zip running in the background.

 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.63/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
HT isn't what makes your i7 fast, man. ;)
24% increase with 8 threads.
11% decrease with 4 threads.
33% increase when looking at home-field advantage (8 on 8, 4 on 4). This is what programmers developing multithreaded software make as the default behavior.

I think the point that is easy to miss is that in the scenario where performance suffered, I think Windows can be safely blamed for allocating the threads improperly (e.g. it had two threads running on one physical core). Regardless of underlying reason why it was slower, the processor also was ~50% idle where the other two instances were 0% idle (presumably). Just because this one task took a small hit covers up the fact that the processor was ready and able to do other, unrelated work simultaneously.

Hyper-threading is typically responsible for anywhere from about a 5% loss to a 50% gain in performance, depending on workload.

The fact the processor handled 200% threads slightly better than 100% threads speaks volumes to both Windows and the processor's scheduler.
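
To make the arithmetic behind those three percentages explicit, here is a small sketch. The scores are NOT the measured results from the test above; they are made-up normalized placeholders (4c/4t with 4 threads set to 100) chosen only so that the formula reproduces the quoted 24%, -11% and 33%.

```python
def percent_change(new, old):
    """Relative change of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Hypothetical normalized scores keyed by (configuration, thread count).
scores = {
    ("4c/4t", 4): 100.0,
    ("4c/8t", 4): 89.0,
    ("4c/4t", 8): 107.3,
    ("4c/8t", 8): 133.0,
}

# HT on vs. off with 8 threads.
with_8_threads = percent_change(scores[("4c/8t", 8)], scores[("4c/4t", 8)])
# HT on vs. off with 4 threads.
with_4_threads = percent_change(scores[("4c/8t", 4)], scores[("4c/4t", 4)])
# "Home-field advantage": each configuration running its native thread count.
home_field = percent_change(scores[("4c/8t", 8)], scores[("4c/4t", 4)])

print(round(with_8_threads), round(with_4_threads), round(home_field))
```

The point of the third figure is that it compares each configuration at the thread count a program would normally pick for it, which is why it comes out larger than the like-for-like 8-thread comparison.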
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
24% increase with 8 threads.
11% decrease with 4 threads.
33% increase when looking at home-field advantage (8 on 8, 4 on 4). This is what programmers developing multithreaded software make as the default behavior.

I think the point that is easy to miss is that in the scenario where performance suffered, I think Windows can be safely blamed for allocating the threads improperly (e.g. it had two threads running on one physical core). Regardless of underlying reason why it was slower, the processor also was ~50% idle where the other two instances were 0% idle (presumably). Just because this one task took a small hit covers up the fact that the processor was ready and able to do other, unrelated work simultaneously.

Hyper-threading is typically responsible for anywhere from about a 5% loss to a 50% gain in performance, depending on workload.

The fact the processor handled 200% threads slightly better than 100% threads speaks volumes to both Windows and the processor's scheduler.

I would agree with you if the improvement from 100% to 200% wasn't something along the lines of a 2% gain from 8 threads to 16 threads (4c/8t case) and ~6% from 4 threads to 8 threads (2c/4t case). 2% is within the margin of error, so I would consider that unchanged from the previous value. The values for "percent increase" are the improvement in performance over the previous number of threads, so the improvement for 8 threads would be over 4 threads. The part that makes me think it's the scheduler is not the improvements over the max number of threads, but rather the difference in improvement for 4c/8t @ 8 threads versus 2c/4t @ 4 threads where the instance with no HT threads excelled. Also the other big thing that makes me think the scheduler is at fault is that 4c/4t running 4 threads was faster than the 4c/8t configuration running 4 threads.

Also the table here is confusing. I should redo the percent increases, which really show the amount of performance per core over 1 thread, not the improvement from 4 threads to 8 threads, which is really what I want here.

Hyperthreading isn't for parallel workloads in which most of the threads are processing similar instructions that use the same execution units in each core.

Actually it is, because no workload purely uses one part of the CPU. All HT does is take advantage of the fact that superscalar architectures gave Intel the opportunity to cram more work into the pipeline while it would otherwise be waiting for some reason. It sees an opening, so it puts something in. Instructions always get put into the pipeline at the beginning, so it's not really just grabbing an unused part; it waits for unused parts.

Also, saying that HT isn't for parallel workloads is really funny, since the only purpose of more than 1 core or 1 thread is parallel, thread-like workloads. I don't mean GPGPU-like tasks where all you're really doing is transforming matrices, just multiple threads in general, because anything happening at the same time is "parallel processing", whether the tasks are the same or different. Stuff running in parallel doesn't imply that it's all the same instructions either, or that parts of the CPU can't be shared. A lot of the reason for more threads is to do a task alone and asynchronously, which would benefit the most from HT. Threads might share less of the CPU because of the similar kinds of instructions used, but it's certainly not all-or-nothing.

HT suffers when you have a lot of context switching, or when a pipeline stall occurs, because the pipeline has to be flushed to recover if the stall was due to a branch misprediction. HT suffers even more when you have a ton of locking going on, which can slow everything else down depending on how much locking there is.

Also keep in mind what you just said
The purpose of hyperthreading is to minimize idle execution units when processing different threads. Like, for example, playing Call of Duty while recording it with Fraps and also listening to music in the background. Or watching some Netflix while compressing a backup with 7zip running in the background.
A lot of applications (games in particular) have a really hard time making code run in tandem (in the sense that two threads are trying to do parts of the same task), versus having, say, a game logic thread, a network communications thread, threads that do network access asynchronously alongside the network comm. thread, and a thread that does GPU dispatch for rendering. So one application can have a lot of different threads. To say that HT isn't designed for parallel workloads is funny, because you're lumping "parallel workloads" in with tasks you would give GPGPU devices, which CPUs suck at doing in the first place.
 

brandonwh64

Addicted to Bacon and StarCrunches!!!
Joined
Sep 6, 2009
Messages
19,542 (3.66/day)
Whatever the case, HAT, you are LONG overdue for a full system upgrade, but I guess you already know that ;)

HAT did have an i7 setup, but I now have the CPU and he went back to the Q6600.
 
Joined
Apr 30, 2006
Messages
1,181 (0.18/day)
I would agree with you if

I was just trying to point out that using 7zip is a terrible way to measure hyperthreading performance.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
I was just trying to point out that using 7zip is a terrible way to measure hyperthreading performance.
...and I'm telling you that it's not as terrible as you think, there are just limits to what it can do.
 
Joined
Jun 27, 2011
Messages
6,687 (1.43/day)
Processor 7800x3d
Motherboard Gigabyte B650 Auros Elite AX
Cooling Custom Water
Memory GSKILL 2x16gb 6000mhz Cas 30 with custom timings
Video Card(s) MSI RX 6750 XT MECH 2X 12G OC
Storage Adata SX8200 1tb with Windows, Samsung 990 Pro 2tb with games
Display(s) HP Omen 27q QHD 165hz
Case ThermalTake P3
Power Supply SuperFlower Leadex Titanium
Software Windows 11 64 Bit
Benchmark Scores CB23: 1811 / 19424 CB24: 1136 / 7687
In my usage HT is hit or miss. Most of the programs I use don't seem to make use of it so it is just not something I feel I should pay for.
 
Joined
Oct 21, 2005
Messages
6,880 (1.02/day)
Location
USA
System Name Computer of Theseus
Processor Intel i9-12900KS: 50x Pcore multi @ 1.18Vcore (target 1.275V -100mv offset)
Motherboard EVGA Z690 Classified
Cooling Noctua NH-D15S, 2xThermalRight TY-143, 4xNoctua NF-A12x25,3xNF-A12x15, 2xAquacomputer Splitty9Active
Memory G-Skill Trident Z5 (32GB) DDR5-6000 C36 F5-6000J3636F16GX2-TZ5RK
Video Card(s) EVGA Geforce 3060 XC Black Gaming 12GB
Storage 1x Samsung 970 Pro 512GB NVMe (OS), 2x Samsung 970 Evo Plus 2TB (data 1 and 2), ASUS BW-16D1HT
Display(s) Dell S3220DGF 32" 2560x1440 165Hz Primary, Dell P2017H 19.5" 1600x900 Secondary, Ergotron LX arms.
Case Lian Li O11 Air Mini
Audio Device(s) Audiotechnica ATR2100X-USB, El Gato Wave XLR Mic Preamp, ATH M50X Headphones, Behringer 302USB Mixer
Power Supply Super Flower Leadex Platinum SE 1000W 80+ Platinum White
Mouse Zowie EC3-C
Keyboard Vortex Multix 87 Winter TKL (Gateron G Pro Yellow)
Software Win 10 LTSC 21H2
Running a full MSE virus scan while playing SC2 is possible with HT; I don't notice any performance hit. With an i5 at the same clock it seemed to have more latency. Can't really tell a big difference otherwise, i7 is overkill for my use.
 
Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
I was just trying to point out that using 7zip is a terrible way to measure hyperthreading performance.

I disagree with that, as 7zip with 8 threads is by far the fastest compressor and also produces the smallest archives. I'm using a custom Ultra profile for LZMA2 with a larger dictionary and word size, eating up around 12-14GB of RAM when working. But it goes through gigabytes of data like it's nothing.
 

OneMoar

There is Always Moar
Joined
Apr 9, 2010
Messages
8,746 (1.70/day)
Location
Rochester area
System Name RPC MK2.5
Processor Ryzen 5800x
Motherboard Gigabyte Aorus Pro V2
Cooling Enermax ETX-T50RGB
Memory CL16 BL2K16G36C16U4RL 3600 1:1 micron e-die
Video Card(s) GIGABYTE RTX 3070 Ti GAMING OC
Storage ADATA SX8200PRO NVME 512GB, Intel 545s 500GBSSD, ADATA SU800 SSD, 3TB Spinner
Display(s) LG Ultra Gear 32 1440p 165hz Dell 1440p 75hz
Case Phanteks P300 /w 300A front panel conversion
Audio Device(s) onboard
Power Supply SeaSonic Focus+ Platinum 750W
Mouse Kone burst Pro
Keyboard EVGA Z15
Software Windows 11 +startisallback
Somebody already said this, but you people aren't getting the point:
HYPER-THREADING IS NOT FOR PARALLEL TASKING
It is for utilizing CPU resources that would normally sit unused during a given clock cycle. They aren't "cores", so please stop thinking of them as such.
Given the above, 7-Zip is INDEED a terrible benchmark, and so is gaming.
In fact, any type of workload that requires PARALLEL processing of related data is EXACTLY WHAT YOU DON'T WANT TO USE HYPER-THREADING FOR

a crude analogy would be using two separate kitchen mixers to make cake batter with one for mixing flour and sugar and baking powder and one for mixing eggs and milk (parallel)
what would be a good idea would be to use one mixer for the cake and another for the frosting (hyper threading)
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.63/day)
Location
IA, USA
Also the other big thing that makes me think the scheduler is at fault is that 4c/4t running 4 threads was faster than the 4c/8t configuration running 4 threads.
As I pointed out, that's because the processor is ready to accept more work. That is not necessarily a bad thing. The fact of the matter is no programmer, with a task like 7-zip, is going to only use half of the cores available unless the user explicitly tells it to use less. It would use 8 on 4c/8t and 4 on 4c/4t which is a 33% increase--not too shabby. Apples to apples, we'd have to compare 4 threads on 4c/8t to 2 threads on 4c/4t. The former would thoroughly trounce the latter.


a crude analogy would be using two separate kitchen mixers to make cake batter with one for mixing flour and sugar and baking powder and one for mixing eggs and milk (parallel)
what would be a good idea would be to use one mixer for the cake and another for the frosting (hyper threading)
The analogy would be needing to make four batches of cake dough with one mixer. HT enabled would add 33% more to each batch and pull it off when it is prepared for cooking. A normal mixer would have to be run four times. With HT enabled, it would only have to run three times to get four batches mixed.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
Somebody already said this, but you people aren't getting the point:
HYPER-THREADING IS NOT FOR PARALLEL TASKING
It is for utilizing CPU resources that would normally sit unused during a given clock cycle. They aren't "cores", so please stop thinking of them as such.
Given the above, 7-Zip is INDEED a terrible benchmark, and so is gaming.
In fact, any type of workload that requires PARALLEL processing of related data is EXACTLY WHAT YOU DON'T WANT TO USE HYPER-THREADING FOR

a crude analogy would be using two separate kitchen mixers to make cake batter with one for mixing flour and sugar and baking powder and one for mixing eggs and milk (parallel)
what would be a good idea would be to use one mixer for the cake and another for the frosting (hyper threading)

You fail to understand what the CPU is doing under the hood. Hyperthreading is simply pipeline-level parallelism, and it depends on the particular opcodes being executed as well as the opcodes that were issued before them and are still in the pipeline. HT just tries to fill the gaps in the pipeline to squeeze a little more juice out of it, that is all. Depending on the code running and what kinds of instructions are being executed, parallel tasks that are sufficiently complex will benefit from HT. But if all you're doing is adding, or strictly using the ALU, or any task that a GPU would excel at, you're not going to realize any benefit from it because a single component in the CPU becomes your bottleneck. Using the blanket statement that HT isn't for parallel applications is ludicrous, though, and I think we can see that 7-zip takes advantage of it pretty well. A better statement is that hyper-threading is not an optimal solution for highly parallel tasks, but it improves efficiency by getting more done without adding more cores or altering too much of the core itself.

The simple point is that you can only squeeze so much power out of a CPU for any given task and that some tasks benefit from HT more than others, but saying "it isn't for parallel processing" is nuts because it's extra performance you might not otherwise have. It would be stupid to reject it.
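
The "filling the gaps" idea can be illustrated with a toy model. This is a deliberately crude sketch, not how a real core works: it assumes one issue slot per cycle, describes each thread as a list of cycles that are either ready to issue (True) or stalled, e.g. waiting on memory (False), and gives fixed priority to the first thread. All names are made up for illustration.

```python
def sequential_cycles(threads):
    """Cycles needed when the threads run one after another on the core."""
    return sum(len(t) for t in threads)


def smt_cycles(threads):
    """Cycles needed when ready threads share a single issue slot each cycle."""
    pos = [0] * len(threads)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        slot_free = True
        for k, t in enumerate(threads):
            if pos[k] >= len(t):
                continue          # this thread has finished
            if not t[pos[k]]:
                pos[k] += 1       # a stall cycle elapses regardless of the slot
            elif slot_free:
                pos[k] += 1       # issue one instruction this cycle
                slot_free = False
            # otherwise the thread is ready but the slot is taken, so it waits
    return cycles


# Two threads with stalls: the second thread can issue while the first waits.
stally = [[True, False, False, True], [True, True, False, True]]
# Two threads that never stall: both fight over the same slot every cycle.
busy = [[True] * 4, [True] * 4]

print(sequential_cycles(stally), smt_cycles(stally))
print(sequential_cycles(busy), smt_cycles(busy))
```

In this model the threads with stalls finish in fewer cycles when interleaved, while the two threads that never stall gain nothing, which is the same distinction as between a sufficiently mixed workload and one bottlenecked on a single unit.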

Also, do you find that using all caps makes you more correct or does it just make you feel better?
 

OneMoar

There is Always Moar
Joined
Apr 9, 2010
Messages
8,746 (1.70/day)
Location
Rochester area
As I pointed out, that's because the processor is ready to accept more work. That is not necessarily a bad thing. The fact of the matter is no programmer, with a task like 7-zip, is going to only use half of the cores available unless the user explicitly tells it to use less. It would use 8 on 4c/8t and 4 on 4c/4t which is a 33% increase--not too shabby. Apples to apples, we'd have to compare 4 threads on 4c/8t to 2 threads on 4c/4t. The former would thoroughly trounce the latter.



The analogy would be needing to make four batches of cake dough with one mixer. HT enabled would add 33% more to each batch and pull it off when it is prepared for cooking. A normal mixer would have to be run four times. With HT enabled, it would only have to run three times to get four batches mixed.
Assume you have TWO mixers but need to bake 3 cakes: 1 chocolate, 1 vanilla and 1 carrot. Now each mixing bowl only holds so much and each mixer can only do so much work.

Now a carrot cake can be made by starting with a vanilla cake mix and adding other ingredients to the vanilla cake batter. This is where hyper-threading comes in: I can put both vanilla cake mixes in one mixer. Conversely, I can make a vanilla cake by taking a carrot cake recipe, omitting the carrots, and adding extra vanilla and other stuff.

Now a chocolate cake is a totally different beast. While it shares some base components with carrot and vanilla, it's best left in its own mixer. If I had 3 mixers, I wouldn't need to fuss with reserving some of the vanilla batter for my carrot cake and I could just dedicate one mixer to each cake (physical cores). Of course you can share things like measuring cups, spoons, bowls and pans between the 3 (hyper-threading), but if I need to measure 1/2 cup of sugar for the carrot cake and 1 & 3/4 cups of flour for the carrot cake, well, I only have one 2-cup measuring cup, so one is going to have to wait until I am done with the other.
 

Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
Why on Earth would 4c/8t be running with only 2 threads for comparison with 4c/4t? That's like comparing apples and oranges.

Use same thread count for both and the HT enabled will triumph pretty much every time easily.
 
Joined
Nov 18, 2010
Messages
7,125 (1.45/day)
Location
Rīga, Latvia
System Name HELLSTAR
Processor AMD RYZEN 9 5950X
Motherboard ASUS Strix X570-E
Cooling 2x 360 + 280 rads. 3x Gentle Typhoons, 3x Phanteks T30, 2x TT T140 . EK-Quantum Momentum Monoblock.
Memory 4x8GB G.SKILL Trident Z RGB F4-4133C19D-16GTZR 14-16-12-30-44
Video Card(s) Sapphire Pulse RX 7900XTX + under waterblock.
Storage Optane 900P[W11] + WD BLACK SN850X 4TB + 750 EVO 500GB + 1TB 980PRO[FEDORA]
Display(s) Philips PHL BDM3270 + Acer XV242Y
Case Lian Li O11 Dynamic EVO
Audio Device(s) Sound Blaster ZxR
Power Supply Fractal Design Newton R3 1000W
Mouse Razer Basilisk
Keyboard Razer BlackWidow V3 - Yellow Switch
Software FEDORA 39 / Windows 11 insider
You guys went into oblivion :D.

Anyways... HT and data prefetch logic differ greatly between generations... and HT does work now (the only exceptions are creaky old game engines and programs made by lazy and uneducated coders). As an old dual-socket mobo user, I can assure you that during the ancient days of dual PIII, then Socket A, then 940, then Socket F, barely anything touched the second CPU in daily tasks, and so it was with the first HT-enabled P4 crap... NetBurst was so unsuccessful that HT didn't help much either... K8 killed it. So HT gained a bit of a bad reputation at its debut, as it even made things run slower. But not anymore. If the program was compiled recently, the compiler is aware of many CPU features, so the compilation flags are enabled and optimizations are made... (yeah, the slow flag upon detecting an AMD CPU lol :D)

The last true AMD core was K10.5; then they split it into smaller, shorter-pipeline cores, and I would say they are comparable with HT. The design win in this scenario is power consumption, as you can power off more parts of the CPU when it's idle. You know they just can't use HT due to legal reasons; the idea is fine. And if it runs better without adding much cost? A 5-10% gain is about the same as we gain going to the next generation of CPUs. These ain't the old +100% more speed days...
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
Why on Earth would 4c/8t be running with only 2 threads for comparison with 4c/4t? That's like comparing apples and oranges.

Use same thread count for both and the HT enabled will triumph pretty much every time easily.

Then compare just the 8 threads with 4c/8t with 4 threads at 4c/4t and 2c/4t. It gets the same point across...

Is it an improvement: YES!
The debate is how much of an improvement it is in certain situations, not whether it's an improvement at all.

I think most complaints about HT are more the result of a CPU scheduler not taking advantage of certain resources in the right manner than of HT making matters worse.
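
One way to take the scheduler out of the picture when testing is to pin a 4-thread run to one logical CPU per physical core. The sketch below is only an illustration under stated assumptions: the sibling layout in `TOPOLOGY` is hypothetical (real numbering depends on the OS and board), the helper names are invented, and `os.sched_setaffinity` is only available on Linux, so on other systems the pinning helper simply reports that it did nothing.

```python
import os


def one_cpu_per_core(core_of):
    """Pick the lowest-numbered logical CPU from each physical core."""
    chosen = {}
    for cpu in sorted(core_of):
        chosen.setdefault(core_of[cpu], cpu)
    return sorted(chosen.values())


def pin_current_process(cpus):
    """Restrict this process to the given logical CPUs where the OS supports it."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))
        return True
    return False


# Hypothetical 4c/8t layout: logical CPUs 0-1 share core 0, 2-3 share core 1, and so on.
TOPOLOGY = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
cpus = one_cpu_per_core(TOPOLOGY)
print(cpus)
```

If a 4-thread run pinned this way matches the 4c/4t result, that would point at thread placement rather than HT itself as the cause of the slowdown.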
 