
Intel "Tiger Lake" Based Pentium and Celeron to Feature AVX2, an Instruction the Entry-Level Brands were Deprived Of

Joined
Jun 10, 2014
Messages
2,890 (0.81/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I am honestly not sure if I use the benefits of AVX2 -- I always thought that AVX always referred to 1.0 but your post made me do a bit of research, and now I can't really tell if it automatically uses 2. When I had Skylake-X and had AVX-512 it was NEVER used (which was a shame) so I had lumped AVX 2.0 into that category as well.
Most of the mentioned software probably uses AVX2.

"AVX" can be confusing, since it can refer to AVX(1) or the whole family (AVX(1), AVX2, AVX-512).
In a way, you can view AVX2 as the first full 256-bit SIMD instruction set, while AVX(1) was mostly a partial 256-bit extension of SSE4 (floating point only) with a new instruction encoding. AVX2 added more operations and flexibility over AVX(1), along with full 256-bit integer support. From a software development perspective, there is little reason to target AVX(1) alone, since AVX2 is more flexible and complete; the only cost is dropping support for Sandy Bridge and Ivy Bridge (and Sandy Bridge didn't have good AVX performance anyway). CPUs with AVX2 usually have FMA as well, which can really accelerate some algorithms.
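
For anyone unsure what their own CPU reports: on Linux the `flags` line in `/proc/cpuinfo` lists `avx`, `avx2`, `fma` and `avx512f` as separate entries. Below is a minimal sketch, assuming a Linux x86 system; the function names are made up for illustration. It only shows what the CPU supports, not whether a given program actually uses it.

```python
from pathlib import Path

# Flag names as they appear in the Linux /proc/cpuinfo "flags" line.
FAMILY = ("avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx_report(flags: set[str]) -> dict[str, bool]:
    """Map each AVX-family flag of interest to whether the CPU reports it."""
    return {name: name in flags for name in FAMILY}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; other systems need another source
    text = path.read_text() if path.exists() else ""
    print(avx_report(parse_cpu_flags(text)))
```

Note that FMA shows up as its own flag, which is why it is "usually" rather than "always" a bonus of AVX2.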

AVX-512 is unfortunately (to my knowledge) not yet used in consumer software. This is the usual chicken-and-egg problem, but it's important for these features to become widespread so that software can start to use them. AVX-512 will be massively powerful when we see real applications use it; there will be no going back.
 
Joined
Mar 23, 2016
Messages
4,839 (1.65/day)
Processor Ryzen 9 5900X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory VENGEANCE LPX 2 x 16GB DDR4-3600 C18 OCed 3800
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500 GB, 870 QVO 1 TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v22H2
AVX-512 will be massively powerful when we see real applications use it; there will be no going back.
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5 GHz overclock on the Intel side, which ends up impossible with AVX-512.
 
Joined
Oct 2, 2015
Messages
2,986 (0.96/day)
Location
Argentina
System Name Ciel
Processor AMD Ryzen R5 5600X
Motherboard Asus Tuf Gaming B550 Plus
Cooling ID-Cooling 224-XT Basic
Memory 2x 16GB Kingston Fury 3600MHz@3933MHz
Video Card(s) Gainward Ghost 3060 Ti 8GB + Sapphire Pulse RX 6600 8GB
Storage NVMe Kingston KC3000 2TB + NVMe Toshiba KBG40ZNT256G + HDD WD 4TB
Display(s) Gigabyte G27Q + AOC 19'
Case Cougar MX410 Mesh-G
Audio Device(s) Kingston HyperX Cloud Stinger Core 7.1 Wireless PC
Power Supply Aerocool KCAS-500W
Mouse Logitech G203
Keyboard VSG Alnilam
Software Windows 11 x64
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5 GHz overclock on the Intel side, which ends up impossible with AVX-512.
Even with that, the performance benefit will offset the lower clocks.
FMA alone can produce some very nice 40% uplifts, for example.
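
To put rough numbers on the clocks-versus-width trade-off: theoretical peak floating-point throughput per core scales as vector lanes x 2 FLOPs per FMA x number of FMA units x clock. The sketch below is only back-of-the-envelope arithmetic. The function name, the clock speeds and the two-FMA-unit default are illustrative assumptions, not measurements of any particular CPU.

```python
def peak_gflops(vector_bits: int, clock_ghz: float,
                fma_units: int = 2, elem_bits: int = 32) -> float:
    """Theoretical peak GFLOP/s per core.

    lanes x 2 FLOPs per fused multiply-add x FMA units x clock (GHz).
    """
    lanes = vector_bits // elem_bits
    return lanes * 2 * fma_units * clock_ghz


# Hypothetical clocks: 5.0 GHz for AVX2 code, 4.0 GHz under an AVX-512 offset.
avx2 = peak_gflops(256, 5.0)
avx512 = peak_gflops(512, 4.0)
print(f"AVX2 peak: {avx2:.0f} GFLOP/s, AVX-512 peak: {avx512:.0f} GFLOP/s")
print(f"ratio: {avx512 / avx2:.2f}x")
```

With these made-up clocks the wider vectors still come out about 1.6x ahead on paper. Real code is often limited by memory bandwidth, and some AVX-512 parts have only one 512-bit FMA unit, so actual gains can be much smaller.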
 
Joined
Dec 16, 2017
Messages
2,722 (1.19/day)
Location
Buenos Aires, Argentina
System Name System V
Processor AMD Ryzen 5 3600
Motherboard Asus Prime X570-P
Cooling Cooler Master Hyper 212 // a bunch of 120 mm Xigmatek 1500 RPM fans (2 ins, 3 outs)
Memory 2x8GB Ballistix Sport LT 3200 MHz (BLS8G4D32AESCK.M8FE) (CL16-18-18-36)
Video Card(s) Gigabyte AORUS Radeon RX 580 8 GB
Storage SHFS37A240G / DT01ACA200 / WD20EZRX / MKNSSDTR256GB-3DL / LG BH16NS40 / ST10000VN0008
Display(s) LG 22MP55 IPS Display
Case NZXT Source 210
Audio Device(s) Logitech G430 Headset
Power Supply Corsair CX650M
Mouse Microsoft Trackball Optical 1.0
Keyboard HP Vectra VE keyboard (Part # D4950-63004)
Software Whatever build of Windows 11 is being served in Dev channel at the time.
Benchmark Scores Corona 1.3: 3120620 r/s Cinebench R20: 3355 FireStrike: 12490 TimeSpy: 4624
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5 GHz overclock on the Intel side, which ends up impossible with AVX-512.

I think Intel's engineers will find some way to solve that problem while losing little or none of the speedup that AVX-512 may provide, though it may take a while.
 
Funny that this took a decade. Looks like Intel is angry at the R3 3100 and any incoming Zen 3 replacement.
 
Joined
May 31, 2016
Messages
4,323 (1.51/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 16GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
How come?
AMD have in some ways been better at adopting AVX2, featuring it even in their entry-level products, and have supported it since Excavator (2015).
If anything, AVX-512 should be Intel's stronghold, if only they featured it across the entire lineup.
What I mean is that Intel doesn't have much to counter AMD with, so AVX is the way to go. Performance in that area is not bad, so Intel is adding it to every processor available, or at least most of them.
 
Joined
Aug 20, 2007
Messages
20,714 (3.41/day)
System Name Pioneer
Processor Ryzen R9 7950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage 2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches
Software Windows 11 Enterprise (legit), Gentoo Linux x64
Just a question - why pfSense (built on BSD) in this day and age when there is IPFire built on Linux (SMP support) with comparable features? Granted, there's no ARM64 version but that is more or less the only drawback in my book.

There are certainly ARM64 Linux versions.
 
I think Intel's engineers will find some way to solve that problem while losing little or none of the speedup that AVX-512 may provide, though it may take a while.
They already have, in Ice Lake-SP.
 
Reminder that AVX-512 is a complete mess:

[Attached image: 1603211176580.png]
 
Joined
Aug 12, 2020
Messages
1,117 (0.84/day)
Okay, unless a consistent set is implemented across the board for several generations at minimum, it won't see anything close to widespread adoption.
 
Reminder that AVX-512 is a complete mess:

View attachment 172581
Yeah, but it seems like Ice Lake and Tiger Lake are going to bring a bit of order... That aside, I think some of the AVX-512 instructions are in reality "512" versions of instructions that were previously implemented in the SSE/AVX/AVX2 instruction sets, so if you really don't need the wider version, you could probably use the older ones.
Okay, unless a consistent set is implemented across the board for several generations at minimum, it won't see anything close to widespread adoption.
Again, Ice Lake and Tiger Lake seem to bring a little order here, implementing a rather respectable number of AVX-512 subsets.

And just for the record, your username made this very hilarious :laugh:
[Attached image: 20201020-144955.png]
 
I think some of the AVX-512 instructions are in reality "512" versions of instructions that were previously implemented in the SSE/AVX/AVX2 instruction sets, so if you really don't need the wider version, you could probably use the older ones.
AVX-512 certainly extends the feature set of AVX2 to 512-bit, and existing code can be ported very easily.
But at the instruction level it also changes the encoding to allow much more advanced operations on data sets (per-lane masking, for example), which is where the true power of AVX-512 lies, beyond just being a "double AVX2". AVX-512 is getting close to being a "sub instruction set" of x86.

The challenge with all versions of AVX is that they are difficult to use: substantial performance gains require expert-level programmers. The good news is that just enabling automatic optimizations usually gives ~10-30% performance gains "for free" (probably >50% with some minor effort), since the compiler can auto-vectorize and unroll some things. Getting a >10x performance gain, however, still requires handcrafted low-level code. I believe compilers have some room to improve here, but ultimately they can only work with the code the programmer wrote.
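
One way to see how "~10-30% for free" and ">10x" can both be plausible is Amdahl's law: the whole-program gain depends on how much of the runtime the vectorized code covers. This is only an illustration with made-up fractions and kernel speedups, not figures from any benchmark; `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(vector_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the runtime gets faster."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / kernel_speedup)


# Hypothetical case 1: the compiler vectorizes loops covering 25% of the runtime, 8x faster.
print(f"auto-vectorized: {overall_speedup(0.25, 8.0):.2f}x")
# Hypothetical case 2: hand-tuned kernels cover 95% of the runtime, 16x faster.
print(f"hand-tuned:      {overall_speedup(0.95, 16.0):.2f}x")
```

Under these assumptions the first case lands at roughly 1.3x overall and the second at roughly 9x, so the big numbers only appear when nearly all of the hot path is vectorized.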

Again, Ice Lake and Tiger Lake seem to bring a little order here, implementing a rather respectable number of AVX-512 subsets.
Some of these feature sets are mostly relevant to enterprise users, like those "AI" features.
The good thing about having feature sets is that it makes it easier for e.g. AMD to implement the relevant features for consumers.
 
Joined
Apr 24, 2020
Messages
2,520 (1.75/day)
AVX-512 certainly extends the feature set of AVX2 to 512-bit, and existing code can be ported very easily.
But at the instruction level it also changes the encoding to allow much more advanced operations on data sets (per-lane masking, for example), which is where the true power of AVX-512 lies, beyond just being a "double AVX2". AVX-512 is getting close to being a "sub instruction set" of x86.

The challenge with all versions of AVX is that they are difficult to use: substantial performance gains require expert-level programmers. The good news is that just enabling automatic optimizations usually gives ~10-30% performance gains "for free" (probably >50% with some minor effort), since the compiler can auto-vectorize and unroll some things. Getting a >10x performance gain, however, still requires handcrafted low-level code. I believe compilers have some room to improve here, but ultimately they can only work with the code the programmer wrote.

ISPC and OpenCL target AVX-512, and are probably what I'd recommend to anyone who is writing any serious amount of CPU-SIMD code these days. That is, if you need SIMD, but for some reason GPU-SIMD is too high latency or something, so you need to fall back to CPU-SIMD.

Intrinsics are still useful for a few applications, but it's far harder to use intrinsics than to use a dedicated language like ISPC: https://ispc.github.io/

If that's still too much to ask for, then "#pragma omp simd" is the next recommendation. Works in C, C++, and Fortran on a variety of compilers (like GCC and LLVM). A shame about Microsoft Visual Studio... you can't win them all.
 
Joined
Mar 1, 2008
Messages
281 (0.05/day)
Location
Antwerp, Belgium
They needed AVX2? I thought they only needed plain AVX? (Missing on the pre-Bulldozer Phenom IIs; lots of people didn't upgrade to Bulldozer because it was worse than those, until a few titles used AVX in their DRM...)

That's a bit irrelevant, isn't it, considering even the newest Pentiums and Celerons don't support AVX1?

Reminder that AVX-512 is a complete mess:

View attachment 172581

That only looks like a mess because it's full of obsolete things. You can remove Knights Landing, Knights Mill and Cannon Lake.
Skylake-SP, Cascade Lake and Cooper Lake are all niche workstation and 8-way multiprocessing server products. Not really relevant for desktop software.
That leaves Skylake-X, Ice Lake and Tiger Lake. Again, Skylake-X was a niche product.
Whatever is left for actual desktop usage is not a mess.

Edit:
I forgot to add that, in any case, all AVX-512 CPUs support AVX-512F (AVX-512 Foundation). If you program for AVX-512, you can always rely on AVX-512F instructions and check for the other subsets at runtime (a check you need anyway because of fall-back code).
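
The "rely on F, check for more, keep a fall-back" idea can be sketched as a small dispatch function. This is a model of the decision logic only, assuming flag names as spelled in the Linux `/proc/cpuinfo` flags line; the path labels, `WANTED_EXTRAS` and `pick_code_path` are hypothetical names for illustration.

```python
# Subset names as spelled in the Linux /proc/cpuinfo "flags" line.
WANTED_EXTRAS = ("avx512bw", "avx512vl")  # hypothetical needs of one kernel


def pick_code_path(flags: set[str], extras: tuple[str, ...] = WANTED_EXTRAS) -> str:
    """Choose a code path label from CPU feature flags, widest first."""
    if "avx512f" in flags:
        # Foundation is present; use the richer path only if every extra is too.
        if all(name in flags for name in extras):
            return "avx512+extras"
        return "avx512f"
    if "avx2" in flags:
        return "avx2"
    return "scalar"  # fall-back that runs everywhere


print(pick_code_path({"avx2", "fma", "avx512f", "avx512bw", "avx512vl"}))
print(pick_code_path({"avx2", "fma", "avx512f"}))
print(pick_code_path({"avx", "avx2", "fma"}))
```

In a real program the same decision would be made once at startup and would select among separately compiled kernels.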
 