
Intel Itanium Reaches End of the Road with Linux Kernel Stopping Updates

Joined
Jan 2, 2019
Messages
61 (0.03/day)
Location
Calgary, Canada
And did that automatically do the SIMD scheduling that was the only thing that could have made Itanium fast?

Yes. The Intel compiler for IA-64 had its default options set to produce the fastest possible binaries. For example:
...
/O2 optimize for maximum speed (DEFAULT)
...
/Qvec[-] enables(DEFAULT)/disables vectorization
...

OpenMP was the easiest way to enable parallelization (in our code, almost all for-loops have #pragma omp ... directives):
...
/Qopenmp enable the compiler to generate multi-threaded code based on the OpenMP* directives
...
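To illustrate the kind of loop this refers to, here is a minimal sketch. The function name `saxpy` and its signature are made up for illustration and are not taken from the poster's code. When OpenMP is enabled, the iterations are divided among threads; when it is not, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption, not from the original post):
// computes y[i] = a * x[i] + y[i] over the common length of x and y.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    // A signed loop index keeps the loop valid for older OpenMP versions too.
    const long n = static_cast<long>(x.size() < y.size() ? x.size() : y.size());

    // Each iteration touches only element i, so iterations are independent
    // and can safely be split across threads.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop qualifies because no iteration reads a value written by another one. A loop with a running total would need a `reduction` clause instead.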

Auto-Parallelization was also available (!):
...
/Qparallel enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel
...

SIMD-like features for explicit vectorization were also available from fvec.h and dvec.h:
...
const union
{
int i[4];
__m128d m;
} __f64vec2_abs_mask_cheat = {0xffffffff, 0x7fffffff, 0xffffffff, 0x7fffffff};

#define _f64vec2_abs_mask ((F64vec2)__f64vec2_abs_mask_cheat.m)

/* EMM Functionality Intrinsics */

class I8vec16; /* 16 elements, each element a signed or unsigned char data type */
class Is8vec16; /* 16 elements, each element a signed char data type */
class Iu8vec16; /* 16 elements, each element an unsigned char data type */
class I16vec8; /* 8 elements, each element a signed or unsigned short */
class Is16vec8; /* 8 elements, each element a signed short */
class Iu16vec8; /* 8 elements, each element an unsigned short */
class I32vec4; /* 4 elements, each element a signed or unsigned long */
class Is32vec4; /* 4 elements, each element a signed long */
class Iu32vec4; /* 4 elements, each element a unsigned long */
class I64vec2; /* 2 element, each a __m64 data type */
class I128vec1; /* 1 element, a __m128i data type */
...
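The abs mask in that excerpt deserves a note. Assuming a little-endian layout and 64-bit IEEE 754 doubles, each pair of ints {0xffffffff, 0x7fffffff} forms the 64-bit pattern 0x7fffffffffffffff, so ANDing a lane with it clears only the sign bit. A scalar sketch of that per-lane operation, using a hypothetical function name and plain C++ rather than the dvec.h classes:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar stand-in for one lane of the F64vec2 abs mask.
// Assumes double is a 64-bit IEEE 754 value (sign bit is bit 63).
double abs_via_mask(double value)
{
    // Same bit pattern as the int pair {0xffffffff, 0x7fffffff} on a
    // little-endian machine: every bit set except the sign bit.
    const std::uint64_t mask = 0x7fffffffffffffffULL;

    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // reinterpret without aliasing issues
    bits &= mask;                              // clear the sign bit
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

The vector version does the same AND on both lanes at once, which is why the header stores the constant as a union of ints and a __m128d.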
 
Joined
Nov 11, 2020
Messages
427 (0.33/day)
Location
Earth, Solar System
Processor AMD Ryzen 7 5700X
Motherboard Asus TUF Gaming B550M-Plus (Wi-Fi)
Cooling Thermalright PA120 SE; Arctic P12, F12
Memory Crucial BL8G32C16U4W.M8FE1 ×2
Video Card(s) Sapphire Nitro+ RX 6600 XT
Storage Kingston SKC3000D/2048G; Samsung MZVLB1T0HBLR-000L2; Seagate ST1000DM010-2EP102
Display(s) AOC 24G2W1G4
Case Sama MiCube
Audio Device(s) Somic G923
Power Supply EVGA 650 GD
Mouse Logitech G102
Keyboard Logitech K845 TTC Brown
Software Windows 10 Pro 1903, Dism++, CCleaner
Benchmark Scores CPU-Z 17.01.64: 3700X @ 4.6 GHz 1.3375 V scoring 557/6206; 760K @ 5 GHz 1.5 V scoring 292/964
Ah, I thought ia64 has been dead for a long time...
 
Joined
Mar 18, 2023
Messages
613 (1.43/day)
System Name Never trust a socket with less than 2000 pins
Yes. The Intel compiler for IA-64 had its default options set to produce the fastest possible binaries. For example:
...
/O2 optimize for maximum speed (DEFAULT)
...
/Qvec[-] enables(DEFAULT)/disables vectorization
...

OpenMP was the easiest way to enable parallelization (in our code, almost all for-loops have #pragma omp ... directives):
...
/Qopenmp enable the compiler to generate multi-threaded code based on the OpenMP* directives
...

Auto-Parallelization was also available (!):
...
/Qparallel enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel
...

SIMD-like features for explicit vectorization were also available from fvec.h and dvec.h:
...

Yeah, but again, it doesn't introduce SIMD instructions when compiling code that is not using explicit vectorization. So any random program you compile on there is slow unless you put serious work into it.

And automatic threading is not specific to Itanium.
 
Joined
Aug 20, 2007
Messages
20,827 (3.40/day)
System Name Pioneer
Processor Ryzen R9 7950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage 2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11
It's not just a different architecture, they literally made it closed, so no backward compatibility at all. The market was like yea ok bro.
That's literally just a different architecture.

Anyways, an emulation layer existed but sucked IIRC.

Did Linux ever drop support for an entire processor architecture before?
Yep. Alpha comes to mind. PA-RISC as well.
 
Joined
Nov 3, 2011
Messages
690 (0.15/day)
Location
Australia
System Name Eula
Processor AMD Ryzen 9 7900X PBO
Motherboard ASUS TUF Gaming X670E Plus Wifi
Cooling Corsair H115i Elite Capellix XT
Memory Trident Z5 Neo RGB DDR5-6000 64GB (4x16GB F5-6000J3038F16GX2-TZ5NR) EXPO II, OCCT Tested
Video Card(s) Gigabyte GeForce RTX 4080 GAMING OC
Storage Corsair MP600 XT NVMe 2TB, Samsung 980 Pro NVMe 2TB, Toshiba N300 10TB HDD, Seagate Ironwolf 4T HDD
Display(s) Acer Predator X32FP 32in 160Hz 4K IPS FreeSync/GSync DP, LG 27UL600 27in 4K HDR FreeSync/G-Sync DP
Case Phanteks Eclipse P500A D-RGB White
Audio Device(s) Creative Sound Blaster Z
Power Supply Corsair HX1000 Platinum 1000W
Mouse SteelSeries Prime Pro Gaming Mouse
Keyboard SteelSeries Apex 5
Software MS Windows 11 Pro
Can you imagine what the computing landscape would have looked like if Intel and HP got away with it?
HP as a secondary IA-64 CPU source is not competitive, just like their PA-RISC.

Itanium supported big-endian mode for HP's 68K and PA-RISC Unix product lines. Itanium supports both x86's little endian and 68K/PA-RISC's big endian modes.

ARMv8 can run in big-endian mode, and this has been proven in practice with the Amiga-related PiStorm Emu68 (a hypervisor-level 68K-to-ARM JIT translator).
 
Joined
Jan 3, 2021
Messages
2,769 (2.25/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Itanium supported big-endian mode for HP's 68K and PA-RISC Unix product lines. Itanium supports both x86's little endian and 68K/PA-RISC's big endian modes.

ARMv8 can run in big-endian mode, and this has been proven in practice with the Amiga-related PiStorm Emu68 (a hypervisor-level 68K-to-ARM JIT translator).
Is that important? I mean, is there a significant overhead if a big-endian CPU emulates a little-endian CPU (or vice versa) and has to reorder the bytes?
 
Joined
Aug 20, 2007
Messages
20,827 (3.40/day)
Is that important? I mean, is there a significant overhead if a big-endian CPU emulates a little-endian CPU (or vice versa) and has to reorder the bytes?
I'm pretty sure that's a big emulation penalty, yes.
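For context on where the cost comes from: when the guest and host disagree on byte order, every multi-byte guest memory access has to be reassembled or swapped. A minimal sketch of an endian-neutral 32-bit little-endian load follows. The function name is hypothetical, and how large the penalty is in a real emulator depends on whether the host has a cheap byte-swap or byte-reversed load instruction.

```cpp
#include <cstdint>

// Hypothetical helper: reads a 32-bit little-endian value from guest memory.
// Works the same on big-endian and little-endian hosts because it never
// relies on the host's native byte order.
std::uint32_t load_le32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])          // least significant byte first
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);  // most significant byte last
}
```

On a host with the same byte order as the guest this collapses to a plain load. On a mismatched host it is extra work on every access.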
 
Joined
Nov 3, 2011
Messages
690 (0.15/day)
Location
Australia
Is that important? I mean, is there a significant overhead if a big-endian CPU emulates a little-endian CPU (or vice versa) and has to reorder the bytes?

For HP-UX, a native big-endian mode means one less translation overhead.

Most Linux ARM and macOS ARM builds are little-endian, like x86. x86 can convert between big endian and little endian with the 486-era BSWAP instruction.
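As a rough sketch of what that instruction does, here is the same 32-bit byte reversal written in portable C++. The function name is made up for illustration. Many optimizing compilers recognize this pattern and emit a single BSWAP on x86, but that is an optimizer behavior and not something the language guarantees.

```cpp
#include <cstdint>

// Hypothetical helper: reverses the byte order of a 32-bit value,
// which is the operation BSWAP performs on a 32-bit register.
std::uint32_t byte_swap32(std::uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |   // byte 0 moves to byte 3
           ((v & 0x0000ff00u) << 8)  |   // byte 1 moves to byte 2
           ((v & 0x00ff0000u) >> 8)  |   // byte 2 moves to byte 1
           ((v & 0xff000000u) >> 24);    // byte 3 moves to byte 0
}
```

Applying it twice gives back the original value, so the same routine converts in both directions.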
 