Intel Previews AVX10 ISA, Next-Gen E-Cores to get AVX-512 Capabilities

T0@st · Jul 25, 2023

Intel has published a preview article covering its new AVX10 ISA (Instruction Set Architecture)—the announcement reveals that both P-Cores & E-Cores (on next-gen processors) will be getting support for AVX-512. Team Blue stated: "Intel AVX10 represents a major shift to supporting a high-performance vector ISA across future Intel processors. It allows the developer to maintain a single code-path that achieves high performance across all Intel platforms with the minimum of overhead checking for feature support. Future development of the Intel AVX10 ISA will continue to provide a rich, flexible, and consistent environment that optimally supports both Server and Client products."

Due to technical issues (E-core related), Intel decided to disable AVX-512 for Alder Lake and Raptor Lake client-oriented CPU lineups. AMD has recently adopted the fairly new instruction set for its Ryzen 7040 mobile series, so it is no wonder that Team Blue is attempting to reintroduce it in the near future—AVX-512 was last seen working properly on Rocket and Tiger Lake chips. AVX10 implementation is expected to debut with Granite Rapids (according to Longhorn), and VideoCardz reckons that Intel will get advanced instructions for Efficiency cores working with its Clearwater Forest CPU architecture.

enb141 · Jul 25, 2023

So now blu ray will work again?

Six_Times · Jul 25, 2023

Does this mean AVX512 will be implemented again for Meteor Lake Ultra?

tabascosauz · Jul 25, 2023

enb141 said:
So now blu ray will work again?

BluRay had nothing to do with AVX. It was SGX that was full of security holes and Intel got rid of it.

ncrs · Jul 25, 2023

Six_Times said:
Does this mean AVX512 will be implemented again for Meteor Lake Ultra?

Doesn't look like it, at least not the full AVX-512. According to Intel's enablement in the GCC compiler, Meteor Lake, Arrow Lake and Lunar Lake are all without AVX-512 capabilities (indicated by the lack of P_PROC_AVX512F).

Assimilator · Jul 25, 2023

Six_Times said:
Does this mean AVX512 will be implemented again for Meteor Lake Ultra?

Nope, scuttlebutt is that this will only make an appearance in consumer CPUs with Lunar Lake, which is supposed to be their next great 15-generation hope for mobile in 2025. Considering 12th gen was meh and 13th and 14th "generations" have no new features, I'd be very surprised if this big a change actually happens on that timeline - especially considering Meteor Lake aka gen14 is a node shrink with chiplets and Intel hasn't said squat about it since January, except for trying to distract with their stupid branding change.

If MTL goes particularly poorly it will never make it to desktop (see also: Cannon Lake, Ice Lake) and that would set these projections back even further. Considering the only possible MTL benchmarks spotted in the wild have been for mobile parts I'm strongly suspecting this is the case, and Intel will just refresh gen13 on desktop and call it gen14, and these "gen14" desktop chips will desperately cram even more cores into an architecture that is long overdue for replacement. So if you thought Intel's CPU power consumption was bad before, hoo boy.

AnotherReader · Jul 25, 2023

Real World Technologies has an interesting discussion regarding this. One of the posts summarizes the changes:

The announcement from today about AVX10 contains only two facts:

1. AVX-512 has been renamed as AVX10

2. A subset of AVX-512 a.k.a. AVX10 has been defined, with 256-bit vector registers and 32-bit mask registers. This subset will be implemented in all future Intel CPUs after some date. The Intel documents describe how the support for this subset will be identified and what implications it will have for operating systems and for the porting of older applications.

In other words, next gen E-cores won't implement AVX-512; rather they will implement AVX10 which keeps the features of AVX-512 but decreases the vector width to 256 bits which is the same as regular AVX.
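
For anyone wondering how software would tell which vector width it actually gets, here is a minimal Python sketch of decoding the AVX10 enumeration. The bit positions are what I recall from Intel's AVX10 documents (a feature bit in CPUID leaf 07H sub-leaf 1, then a version number and per-width bits in leaf 24H), so treat them as assumptions and check the spec before relying on them. The function name, the `Avx10Info` type and the raw register arguments are all hypothetical; nothing here queries a real CPU.

```python
from dataclasses import dataclass

# ASSUMPTION: bit positions recalled from Intel's AVX10 specification;
# verify against the official document before using them for real.
AVX10_FEATURE_BIT = 19          # CPUID.(EAX=07H, ECX=01H):EDX[19]
VERSION_MASK = 0xFF             # CPUID.(EAX=24H, ECX=00H):EBX[7:0]
VECTOR_LENGTH_BITS = ((16, 128), (17, 256), (18, 512))  # EBX bit -> width


@dataclass(frozen=True)
class Avx10Info:
    supported: bool
    version: int
    max_vector_bits: int


def decode_avx10(leaf7_sub1_edx: int, leaf24_ebx: int) -> Avx10Info:
    """Decode raw CPUID register values (hypothetical inputs) into AVX10 info."""
    if not (leaf7_sub1_edx >> AVX10_FEATURE_BIT) & 1:
        # No AVX10 at all: leaf 24H contents are not meaningful.
        return Avx10Info(False, 0, 0)
    version = leaf24_ebx & VERSION_MASK
    max_bits = 0
    for bit, width in VECTOR_LENGTH_BITS:
        if (leaf24_ebx >> bit) & 1:
            max_bits = max(max_bits, width)
    return Avx10Info(True, version, max_bits)


# Made-up register values for a 256-bit-only part (e.g. an E-core design).
e_core = decode_avx10(1 << 19, 1 | (1 << 16) | (1 << 17))
print(e_core)
```

The point of the version-plus-width scheme is that a program checks one version number and one maximum width instead of a long list of individual AVX-512 feature flags.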

Assimilator · Jul 25, 2023

AnotherReader said:
Real World Technologies has an interesting discussion regarding this. One of the posts summarizes the changes:

In other words, next gen E-cores won't implement AVX-512; rather they will implement AVX10 which keeps the features of AVX-512 but decreases the vector width to 256 bits which is the same as regular AVX.

That's the entire reason for this rename/rebrand (although I don't know where the "10" comes from): 512-bit instructions are just too large for efficient processing, so Intel wants you to forget they ever introduced AVX512. And honestly given that 256 bits is more than enough for most use cases, AVX10 aka AVX256 should be fine for quite a while.

AnotherReader · Jul 25, 2023

Surprisingly, TPU hasn't covered the biggest change to x86 since Hammer 20 years ago. Some of the more important changes are below:

Intel® APX doubles the number of general-purpose registers (GPRs) from 16 to 32.

Intel® APX adds conditional forms of load, store, and compare/test instructions, and it also adds an option for the compiler to suppress the status flags writes of common instructions

TumbleGeorge · Jul 25, 2023

AnotherReader said:
Surprisingly, TPU hasn't covered the biggest change to x86 since Hammer 20 years ago. Some of the more important changes are below:

Mmm. The article has a link to the same page. When Intel decides to publish a press release on TPU, I think it will all be shown here.

AnotherReader · Jul 25, 2023

TumbleGeorge said:
Mmm. The article has a link to the same page. When Intel decides to publish a press release on TPU, I think it will all be shown here.

My mistake. I read the content about the E cores; the summary missed the APX announcement.

chrcoluk · Jul 25, 2023

Intel is all over the place: they add something, they remove it, then they add it again.

Also, it looks like it's AVX10.2 specifically that they will (re)add to consumer chips.

ncrs · Jul 25, 2023

Assimilator said:
Nope, scuttlebutt is that this will only make an appearance in consumer CPUs with Lunar Lake, which is supposed to be their next great 15-generation hope for mobile in 2025.

Do you have a reliable source for that claim? It's in direct contradiction to what Intel submitted to GCC with Lunar Lake still being at AVX2 level, and not AVX-512F.

Assimilator · Jul 25, 2023

ncrs said:
Do you have a reliable source for that claim? It's in direct contradiction to what Intel submitted to GCC with Lunar Lake still being at AVX2 level, and not AVX-512F.

Like I said, it's rumour... unfortunately I can't find where I read it ATM. If I do I'll post it, but until then feel free to assume I'm making s**t up.

dragontamer5788 · Jul 25, 2023

ncrs said:
Do you have a reliable source for that claim? It's in direct contradiction to what Intel submitted to GCC with Lunar Lake still being at AVX2 level, and not AVX-512F.

"Scuttlebutt" is some weird English-slang for rumor. Speak American dang it! But in any case, "scuttlebutt" is an open admission that there's no reliable source. But it's interesting to think about nonetheless.

------------

All in all, this AVX10 and APX all looks like a good plan. But heck, AVX512 was a good plan and good idea overall IMO, just Intel screwed it up royally and somehow AMD's Zen4 implementation is superior.

Assimilator said:
That's the entire reason for this rename/rebrand (although I don't know where the "10" comes from): 512-bit instructions are just too large for efficient processing, so Intel wants you to forget they ever introduced AVX512. And honestly given that 256 bits is more than enough for most use cases, AVX10 aka AVX256 should be fine for quite a while.

Just because the ISA is 512-bit doesn't mean that you have to lay out the fundamental circuit as 512-bit. AMD's 256-bit wide vector cores are executing AVX512 perfectly fine with high performance benefits.

In another example: AMD GCN is 2048-bit ultra-wide 64x32-bit GPU ISA, but was physically implemented with only 16x ALUs (aka; 512-bit wide physical implementation that executed 2048-bit code across 4-clock cycles). Etc. etc.
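
To make the "wide ISA, narrow hardware" idea concrete, here is a toy Python sketch. It is only a model of the concept under my own assumptions: `execute_wide_op` and its parameters are hypothetical names, and the pass count is not a claim about real cycle counts on Zen 4 or GCN.

```python
from typing import Callable, List, Tuple


def execute_wide_op(
    a: List[int],
    b: List[int],
    op: Callable[[int, int], int],
    lane_bits: int,
    datapath_bits: int,
) -> Tuple[List[int], int]:
    """Apply op lane by lane, handling only one datapath's worth of lanes per pass.

    Returns the full-width result and the number of passes it took.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    lanes_per_pass = datapath_bits // lane_bits
    if lanes_per_pass < 1:
        raise ValueError("datapath is narrower than a single lane")

    result: List[int] = []
    passes = 0
    for start in range(0, len(a), lanes_per_pass):
        end = start + lanes_per_pass
        # One pass: the physical ALUs process this slice of the logical vector.
        result.extend(op(x, y) for x, y in zip(a[start:end], b[start:end]))
        passes += 1
    return result, passes


def add(x: int, y: int) -> int:
    return x + y


# 512-bit logical vector (16 x 32-bit lanes) on a 256-bit datapath.
avx512_result, avx512_passes = execute_wide_op(
    list(range(16)), [1] * 16, add, lane_bits=32, datapath_bits=256
)
# 2048-bit logical vector (64 x 32-bit lanes) on a 512-bit datapath.
gcn_result, gcn_passes = execute_wide_op(
    list(range(64)), [0] * 64, add, lane_bits=32, datapath_bits=512
)
print(avx512_passes, gcn_passes)
```

Software sees the same result either way; only the number of passes through the physical ALUs changes, which is why the ISA width and the circuit width can be chosen separately.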

----------

My expectation is that this APX / AVX10 whatever stuff is good Intel Engineering, and that Intel's horrible management / business side hasn't figured out how to screw it up yet. But in practice, they will screw up this plan somehow.

Minus Infinity · Jul 26, 2023

AMD of course won't have this problem with their dense cores. Scheduling won't be an issue either.

DemonicRyzen666 · Jul 26, 2023

dragontamer5788 said:
"Scuttlebutt" is some weird English-slang for rumor. Speak American dang it! But in any case, "scuttlebutt" is an open admission that there's no reliable source. But it's interesting to think about nonetheless.

------------

All in all, this AVX10 and APX all looks like a good plan. But heck, AVX512 was a good plan and good idea overall IMO, just Intel screwed it up royally and somehow AMD's Zen4 implementation is superior.

Just because the ISA is 512-bit doesn't mean that you have to lay out the fundamental circuit as 512-bit. AMD's 256-bit wide vector cores are executing AVX512 perfectly fine with high performance benefits.

In another example: AMD GCN is 2048-bit ultra-wide 64x32-bit GPU ISA, but was physically implemented with only 16x ALUs (aka; 512-bit wide physical implementation that executed 2048-bit code across 4-clock cycles). Etc. etc.

----------

My expectation is that this APX / AVX10 whatever stuff is good Intel Engineering, and that Intel's horrible management / business side hasn't figured out how to screw it up yet. But in practice, they will screw up this plan somehow.

AMD AVX512 is SLOW!

mahirzukic2 · Jul 26, 2023

How come? Any sources?
