
Intel "Tiger Lake" Based Pentium and Celeron to Feature AVX2, an Instruction the Entry-Level Brands were Deprived Of

Joined
Jun 10, 2014
Messages
2,248 (0.95/day)
I am honestly not sure if I use the benefits of AVX2 -- I always thought that AVX referred to 1.0, but your post made me do a bit of research, and now I can't really tell if it automatically uses 2. When I had Skylake-X and had AVX-512 it was NEVER used (which was a shame), so I had lumped AVX 2.0 into that category as well.
Most of the mentioned software probably uses AVX2.

"AVX" can be confusing, since it can refer to AVX(1) or the whole family (AVX(1), AVX2, AVX-512).
In a way, you can view AVX2 as the first full 256-bit SIMD instruction set, while AVX(1) was mostly a partial 256-bit extension of SSE4 with a new syntax. AVX2 added more operations and flexibility over AVX(1), along with full 256-bit integer support. From a software development perspective, there is little reason to target AVX(1): AVX2 is more flexible and complete, and the only cost is dropping support for Sandy Bridge and Ivy Bridge (Sandy Bridge didn't have good AVX performance anyway). By using AVX2 you usually get the bonus feature of FMA too, which can really accelerate some algorithms.
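To make the "full 256-bit integer support" point concrete, here is a minimal sketch of a 32-bit integer sum that adds eight lanes per instruction with AVX2 and falls back to scalar code elsewhere. It assumes GCC or Clang on x86-64 (the `target` attribute and `__builtin_cpu_supports` are compiler extensions), and the function names `sum_i32`, `sum_i32_avx2` and `sum_i32_scalar` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>
#define HAVE_AVX2_PATH 1

/* Built with AVX2 enabled for this one function only (GCC/Clang extension),
   so the rest of the program still runs on CPUs without AVX2. */
__attribute__((target("avx2")))
static int32_t sum_i32_avx2(const int32_t *v, size_t n)
{
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(v + i));
        acc = _mm256_add_epi32(acc, chunk); /* eight 32-bit adds at once */
    }
    /* Reduce the eight lanes, then handle the leftover elements. */
    int32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    int32_t total = 0;
    for (int k = 0; k < 8; ++k) total += lanes[k];
    for (; i < n; ++i) total += v[i];
    return total;
}
#endif

/* Portable fallback, also used on non-x86 builds. */
static int32_t sum_i32_scalar(const int32_t *v, size_t n)
{
    int32_t total = 0;
    for (size_t i = 0; i < n; ++i) total += v[i];
    return total;
}

/* Hypothetical entry point: picks the AVX2 path only if the CPU reports it. */
int32_t sum_i32(const int32_t *v, size_t n)
{
#ifdef HAVE_AVX2_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_i32_avx2(v, n);
#endif
    return sum_i32_scalar(v, n);
}
```

The 256-bit integer add used here (`_mm256_add_epi32`) is one of the operations that AVX(1) lacked, which is why the fast path is gated on AVX2 rather than AVX.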

AVX-512 is unfortunately (to my knowledge) not yet used in consumer software. This is the chicken and the egg problem, as usual. But it's important to get these features widespread so software can start to utilize them. AVX-512 will be massively powerful when we see real applications use it; there will be no going back.
 
Joined
Mar 23, 2016
Messages
4,297 (2.50/day)
Processor Ryzen 7 3800X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory PNY Anarchy-X XLR8 Red DDR4-3200 16GB kit & PNY Anarchy-X XLR8 Red DDR4-2666 16GB kit
Video Card(s) MSI GeForce RTX 2060 GAMING Z 6G
Storage Samsung 970 EVO NVMe M.2 500 GB, SanDisk Ultra II 480 GB
Display(s) Samsung SyncMaster C27H711 OC refresh rate 110Hz
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v2004
AVX-512 will be massively powerful when we see real applications use it; there will be no going back.
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5GHz overclock on the Intel side, which ends up impossible with AVX-512.
 
Joined
Oct 2, 2015
Messages
2,692 (1.42/day)
Location
Argentina
System Name Ciel / Yukino
Processor AMD Ryzen R5 3400G / Intel Core i3 5005U
Motherboard Asus Prime B450M-A / HP 240 G5
Cooling AM3 Wraith + Spire v2 fan / Stock
Memory 2x 8GB Colorful 2666@3466MHz / 2x 4GB Hynix + Kingston DDR3L 1600MHz
Video Card(s) AMD Radeon RX Vega 11 / Intel HD 5500
Storage SSD WD Green 240GB M.2 + HDD Toshiba 2TB / SSD Kingston A400 120GB SATA
Display(s) Samsung S22F350 @ 75Hz/ Integrated 1366x768 @ 94Hz
Case Generic / Stock
Audio Device(s) Realtek ALC892 / Realtek ALC282
Power Supply Sentey XPP 525W / Power Brick
Mouse Logitech G203 / Elan Touchpad
Keyboard Generic / Stock
Software Windows 10 x64
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5GHz overclock on the Intel side, which ends up impossible with AVX-512.
Even with that, the performance benefit will offset the lower clocks.
FMA alone can produce some very nice 40% uplifts, for example.
 
Joined
Dec 16, 2017
Messages
1,262 (1.17/day)
Location
Buenos Aires, Argentina
System Name System V
Processor AMD Ryzen 5 3600
Motherboard Asus Prime X570-P
Cooling AMD Wraith Stealth // a bunch of 120 mm Xigmatek 1500 RPM fans (2 ins, 3 outs)
Memory 2x8GB Ballistix Sport LT 3200 MHz (BLS8G4D32AESCK.M8FE) (CL16-18-18-36)
Video Card(s) Gigabyte AORUS Radeon RX 580 8 GB
Storage SHFS37A240G / DT01ACA200 / WD20EZRX / MKNSSDTR256GB-3DL / LG BH16NS40 / ST10000VN0008
Display(s) LG 22MP55 IPS Display
Case NZXT Source 210
Audio Device(s) Logitech G430 Headset
Power Supply Corsair CX650M
Mouse Microsoft Trackball Optical 1.0
Keyboard HP Vectra VE keyboard (Part # D4950-63004)
Software Whatever build of Windows 10 is being served in Dev channel at the time.
Benchmark Scores Corona 1.3: 3120620 r/s Cinebench R20: 3355 FireStrike: 12490 TimeSpy: 4624
Except there's a hit to clock speed and thermals when AVX-512 is in use. Everybody loves their 5GHz overclock on the Intel side, which ends up impossible with AVX-512.
I think Intel's engineers will find some way to solve that problem while losing little or none of the speedup that AVX-512 may provide. Though it may take a while.
 
Joined
Oct 2, 2015
Messages
2,692 (1.42/day)
Location
Argentina
Fun that this took a decade. Looks like Intel is angry at the R3 3100 and any incoming Zen3 replacement.
 
Joined
May 31, 2016
Messages
2,337 (1.42/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 2700X
Motherboard MSI X470 Gaming Carbon
Cooling Corsair h115i pro rgb
Memory G.Skill Flare X 3200 CL14 @ 3600Mhz CL16
Video Card(s) RX Vega 64 Red Devil / Sapphire 5600XT pulse
Storage M.2 Samsung Evo 970 250MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD
Case Fractal Design G
Audio Device(s) realtec 5.1
Power Supply Corsair AXi 760W / Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
How come?
AMD have in some ways been better at adopting AVX2, featuring it even in their entry-level products, and have supported it since Excavator (2015).
If anything, AVX-512 should be Intel's stronghold, if only they featured that across the entire lineup.
What I mean is that Intel doesn't have much to play with to counter AMD, so AVX is the one to go with. In that area the performance is not bad, so Intel is adding it to every processor available, or at least most of them.
 
Joined
Aug 20, 2007
Messages
13,767 (2.84/day)
System Name Pioneer
Processor Intel i9 9900k
Motherboard ASRock Z390 Taichi
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory G.SKILL TridentZ Series 32GB (4 x 8GB) DDR4-3200 @ 14-14-14-34-2T
Video Card(s) EVGA GeForce RTX 2080 SUPER XC ULTRA
Storage Mushkin Pilot-E 2TB NVMe SSD
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) VGA HDMI->Panasonic SC-HTB20/Schiit Modi MB/Asgard 2 DAC/Amp to AKG Pro K7712 Headphones
Power Supply Seasonic Prime Titanium 750W
Mouse ROCCAT Kone EMP
Keyboard WASD CODE 104-Key w/ Cherry MX Green Keyswitches, Doubleshot Vortex PBT White Transluscent Keycaps
Software Windows 10 Enterprise (Product of work, yes it's legit)
Benchmark Scores www.3dmark.com/fs/23478641 www.3dmark.com/spy/13863605 www.3dmark.com/pr/306218
Just a question - why PFSense (built on BSD) in this day and age when there is IP Fire built on Linux (SMP support) with comparable features? Granted, there's no ARM64 version but that is more or less the only drawback in my book.
There are certainly ARM64 Linux versions.
 
Joined
Dec 16, 2017
Messages
1,262 (1.17/day)
Location
Buenos Aires, Argentina
Joined
Oct 2, 2015
Messages
2,692 (1.42/day)
Location
Argentina
Reminder that AVX-512 is a complete mess:

[Attached image: 1603211176580.png]
 
Joined
Aug 12, 2020
Messages
176 (1.56/day)
Okay, unless a consistent set is implemented across the board for several generations at minimum, it won't be adopted in any remotely widespread manner.
 
Joined
Dec 16, 2017
Messages
1,262 (1.17/day)
Location
Buenos Aires, Argentina
Reminder that AVX-512 is a complete mess:

View attachment 172581
Yeah, but it seems like Ice Lake and Tiger Lake are going to bring a bit of order... That aside, I think some AVX-512 instructions are in reality "512" versions of instructions that were previously implemented in the SSE/AVX/AVX2 instruction sets, so if you really don't need the wider version, you could probably use the older ones.
Okay, unless a consistent set is implemented across the board for several generations at minimum, it won't be adopted in any remotely widespread manner.
Again, Ice Lake and Tiger Lake seem to bring a little order here, implementing a rather respectable number of AVX-512 subsets.

And just for the record, your username made this very hilarious :laugh:
[Attached image: 20201020-144955.png]
 
Joined
Jun 10, 2014
Messages
2,248 (0.95/day)
I think some AVX-512 instructions are in reality "512" versions of instructions that were previously implemented in the SSE/AVX/AVX2 instruction sets, so if you really don't need the wider version, you could probably use the older ones.
AVX-512 certainly extends the feature set of AVX2 to 512-bit, and existing code can be ported very easily.
But on the instruction level it also changes the opcodes to allow much more advanced operations on data sets, which is where the true power of AVX-512 is, beyond just being a "double AVX2". AVX-512 is getting close to being a "sub instruction set" of x86.

The challenge with all versions of AVX is that they are difficult to use; it takes expert-level programmers to get substantial performance gains. The good news is that just enabling automatic optimizations usually gives ~10-30% performance gains "for free" (probably >50% with some minor effort), since the compiler can auto-vectorize and unroll some things. Getting that >10x performance gain, however, still requires handcrafted low-level code. I believe compilers have some potential to improve here, but ultimately they can only deal with the code written by the programmer.

Again, Ice Lake and Tiger Lake seem to bring a little order here, implementing a rather respectable number of AVX-512 subsets.
Some of these feature sets are mostly relevant to enterprise users, like those "AI" features.
The good thing about having feature sets is that it makes it easier for e.g. AMD to implement the relevant features for consumers.
 
Joined
Apr 24, 2020
Messages
562 (2.51/day)
AVX-512 certainly extends the feature set of AVX2 to 512-bit, and existing code can be ported very easily.
But on the instruction level it also changes the opcodes to allow much more advanced operations on data sets, which is where the true power of AVX-512 is, beyond just being a "double AVX2". AVX-512 is getting close to being a "sub instruction set" of x86.

The challenge with all versions of AVX is that they are difficult to use; it takes expert-level programmers to get substantial performance gains. The good news is that just enabling automatic optimizations usually gives ~10-30% performance gains "for free" (probably >50% with some minor effort), since the compiler can auto-vectorize and unroll some things. Getting that >10x performance gain, however, still requires handcrafted low-level code. I believe compilers have some potential to improve here, but ultimately they can only deal with the code written by the programmer.
ISPC and OpenCL target AVX-512, and are probably what I'd recommend to anyone who is writing any serious amount of CPU-SIMD code these days. That is, if you need SIMD but GPU-SIMD is too high latency or something, so you need to fall back to CPU-SIMD.

Intrinsics are still useful for a few applications, but it's far harder to use intrinsics than to use a dedicated language like ISPC: https://ispc.github.io/

If that's still too much to ask for, then "#pragma omp simd" is the next recommendation. Works in C, C++, and Fortran on a variety of compilers (like GCC and LLVM). A shame about Microsoft Visual Studio... you can't win them all.
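For anyone who hasn't seen it, a minimal sketch of the `#pragma omp simd` approach looks roughly like this. The `saxpy` name and signature are just an illustration, and it assumes a C99 compiler; with GCC or Clang the pragma only takes effect when OpenMP SIMD support is switched on (for example with `-fopenmp-simd`), otherwise it is ignored and the loop still runs as ordinary scalar code.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
   'restrict' promises the compiler that x and y do not overlap, which,
   together with the pragma, lets it vectorize without runtime alias checks. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The same source can then be built for AVX2 or AVX-512 just by changing the compiler's target flags, which is most of the appeal compared to hand-written intrinsics.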
 
Joined
Mar 1, 2008
Messages
261 (0.06/day)
Location
Antwerp, Belgium
They needed AVX2? I thought they only needed plain AVX? (Missing on the pre-Bulldozer Phenom IIs; lots of people didn't upgrade to Bulldozer because it was worse than those, until a few titles used AVX in their DRM...)
That's a bit irrelevant, isn't it, considering even the newest Pentiums and Celerons don't support AVX1?

Reminder that AVX-512 is a complete mess:

View attachment 172581
That only looks like a mess because it's full of obsolete things. You can remove Knights Landing, Knights Mill and Cannon Lake.
Skylake-SP, Cascade Lake & Cooper Lake are all niche workstation and 8-way multiprocessing server products. Not really relevant for desktop software.
That leaves Skylake-X, Ice Lake and Tiger Lake. Again Skylake-X was a niche product.
Whatever's left for actual desktop usage is not a mess.

Edit:
I forgot to add that in any case all of these CPUs support AVX-512F (AVX-512 Foundation). If you program for AVX-512, you can always rely on AVX-512F instructions and check for more (required anyway because of fall-back code).
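As a rough sketch of what that check-plus-fallback looks like in practice, the snippet below picks a code path at runtime. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available as a compiler builtin; the `simd_tier` enum and the `pick_tier` / `detect_tier` names are invented for this example.

```c
#include <stdbool.h>

/* Hypothetical tiers, ordered from least to most capable. */
enum simd_tier { TIER_SCALAR, TIER_AVX2, TIER_AVX512F };

/* Pure selection logic: choose the widest tier the CPU reports. */
enum simd_tier pick_tier(bool has_avx2, bool has_avx512f)
{
    if (has_avx512f) return TIER_AVX512F;
    if (has_avx2)    return TIER_AVX2;
    return TIER_SCALAR; /* fall-back code path */
}

/* Ask the CPU what it supports; anything that is not GCC/Clang on x86
   simply reports the scalar tier. */
enum simd_tier detect_tier(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return pick_tier(__builtin_cpu_supports("avx2") != 0,
                     __builtin_cpu_supports("avx512f") != 0);
#else
    return TIER_SCALAR;
#endif
}
```

A program would call `detect_tier()` once at startup and route its hot loops accordingly. Additional AVX-512 subsets could be tested the same way before using instructions outside the Foundation set.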
 