
Intel Core-1800 Alder Lake Engineering Sample Spotted with 16C/24T Configuration

Joined
Aug 17, 2017
Messages
274 (0.11/day)
The biggest mistake Intel can make for this release is not releasing Alder Lake with DDR5 from the start. I am curious to see if they do or not. If they do get "gutsy" and release with DDR5 from the start I'm in, otherwise no thanks.

So does the ID "Alder Lake-S 881" mean 8 big, 8 little, 1 GPU?
yes.
 
Joined
Jul 5, 2013
Messages
25,559 (6.52/day)
On top of that, read-ahead on this version of Windows is abysmal at worst, so they might overhaul Windows entirely or just launch a new version of it.
Very doubtful. There is no need for a new version of Windows, as all existing modern versions of Windows have drivers for and know how to use both sets of CPU cores in question.

The biggest mistake Intel can make for this release is not releasing Alder Lake with DDR5 from the start.
Disagreed. DDR4 is perfectly acceptable.
 
Joined
Mar 20, 2019
Messages
556 (0.30/day)
Processor 9600k
Motherboard MSI Z390I Gaming EDGE AC
Cooling Scythe Mugen 5
Memory 32GB of G.Skill Ripjaws V 3600MHz CL16
Video Card(s) MSI 3080 Ventus OC
Storage 2x Intel 660p 1TB
Display(s) Acer CG437KP
Case Streacom BC1 mini
Audio Device(s) Topping MX3
Power Supply Corsair RM750
Mouse R.A.T. DWS
Keyboard HAVIT KB487L / AKKO 3098 / Logitech G19
VR HMD HTC Vive
Benchmark Scores What's a "benchmark"?
Heterogeneous cores?! It's a freeeeak, grab the pitchforks!

The way I see it, it's great for mobile, gives even more flexibility as far as power usage goes. For desktops, well, meh. My old and certainly not the fastest or most power efficient 9700k idles at around 10W (reported "package power") and doesn't really cross 40W during normal work, so the Mugen 5 handles it almost passively. I recently started using the 9600k and it idles even lower, even when running Windows with all the data stealing shenanigans going on in the background.
Now let's just hope that Microsoft can get their lazy asses to work on a reasonable scheduler. Linux hippies figured it out years ago.
 
Joined
Nov 23, 2020
Messages
538 (0.44/day)
Location
Not Chicago, Illinois
System Name Desktop-TJ84TBK
Processor Ryzen 5 3600
Motherboard Asus ROG Strix B350-F Gaming
Cooling ARCTIC Liquid Freezer II 120mm, Noctua NF-F12
Memory B-Die 2x8GB 3200 CL14, Vengeance LPX 2x8GB 3200 CL16, OC'd to 3333 MT/s C16-16-16-32 tRC 48
Video Card(s) PNY GTX 690
Storage Crucial MX500 1TB, MX500 500GB, WD Blue 1TB, WD Black 2TB, WD Caviar Green 3TB, Intel Optane 16GB
Display(s) Sceptre M25 1080p200, ASUS 1080p74, Apple Studio Display M7649 17"
Case Rosewill CRUISER Black Gaming
Audio Device(s) SupremeFX S1220A
Power Supply Seasonic FOCUS GM-750
Mouse Kensington K72369
Keyboard Razer BlackWidow Ultimate 2013
Software Windows 10 Home 64-bit, macOS 11.7.8
Benchmark Scores are good
Now let's just hope that Microsoft can get their lazy asses to work on a reasonable scheduler. Linux hippies figured it out years ago.
Maybe they'll have to do the custom power plans again like they did for Ryzen.

Anyway, for this CPU on desktop.
Idle power consumption isn't a big deal on desktop - 10W vs 15W is basically nothing.

Also, I can't wait for people complaining that "my CPU isn't being used 100%". Gotta add another copy+paste reply to my repository.

Here's my view on big.LITTLE.
It's a great idea for laptops. Less power consumption = less heat = better laptop because you aren't burning people's legs/hands/etc.
Not so much desktops. There's absolutely no point, really.
 
Joined
Jun 10, 2014
Messages
2,889 (0.81/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Anyway, for this CPU on desktop.
Idle power consumption isn't a big deal on desktop - 10W vs 15W is basically nothing.

Here's my view on big.LITTLE.
It's a great idea for laptops. Less power consumption = less heat = better laptop because you aren't burning people's legs/hands/etc.
Not so much desktops. There's absolutely no point, really.
Totally agree.
But you have to realize that these are primarily designed for the OEM market. Dell, HP, Lenovo etc. will love to sell 16c/24t "5 GHz" 65W TDP CPUs in their tiny boxes with undersized cooling and PSU.

Power users should probably be looking at the HEDT segment anyway, and not just for more unleashed CPU cores, but also for I/O, like more SSDs etc.
 
Joined
Aug 14, 2009
Messages
216 (0.04/day)
Location
Denmark
System Name Bongfjaes
Processor AMD 3700x
Motherboard Assus Crosshair VII Hero
Cooling Dark Rock Pro 4
Memory 2x8GB G.Skill FlareX 3200MT/s CL14
Video Card(s) GTX 970
Storage Adata SX8200 Pro 1TB + Lots of spinning rust
Display(s) Viewsonic VX2268wm
Case Fractal Design R6
Audio Device(s) Creative SoundBlaster AE-5
Power Supply Seasonic TTR-1000
Mouse Pro Intellimouse
Keyboard SteelKeys 6G
On the other hand, I'd rather have DDR4 for Alder Lake due to maturity; DDR5 isn't really there yet. Maybe it will be in a year or so.
 
Joined
Feb 12, 2021
Messages
154 (0.14/day)
Even if they have similar IPC to Skylake, they're gonna run at lower clocks and still be pretty slow. No matter how much you mess with the scheduler the small cores will range from worthless (the scheduler never prioritizes them) to detrimental (the scheduler places the wrong threads on them).

big.LITTLE can only work effectively in low-power mobile devices, where you're fine with things running sub-optimally when the device idles and the like. On a desktop you typically want high performance all the time.

Having stuff like maybe the browser running on the low-power cores sounds good, but it almost never works like it should, because how do you know that? You can do stuff like targeting code that only contains 32-bit instructions at the small cores and code that contains SIMD at the big cores, but it's complicated and it's not going to work most of the time because applications mix and match different kinds of workloads.
A lot of your points were talked about by Dr Ian Cutress; the scheduler is going to be a real problem with these new CPUs and will take a while to get ironed out. Ian also talks about various other things as well. Interesting and useful video.
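Until the schedulers catch up, the manual workaround will probably be thread affinity. Here's a minimal sketch of pinning a worker thread to a chosen set of cores with pthread_setaffinity_np, assuming Linux with glibc; treating CPUs 0-7 as the big cores is purely an assumption for illustration.

```c
// Minimal sketch of manually steering a latency-sensitive thread onto a
// chosen set of cores, assuming Linux with glibc. Treating CPUs 0-7 as
// the "big" cores is purely an assumption; real code would query the
// topology instead of hard-coding it.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg)
{
    (void)arg;
    // ... work we don't want landing on the small cores ...
    return NULL;
}

int main(void)
{
    cpu_set_t big_cores;
    CPU_ZERO(&big_cores);
    for (int cpu = 0; cpu < 8; cpu++)   // assumption: CPUs 0-7 are the big cores
        CPU_SET(cpu, &big_cores);

    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    int err = pthread_setaffinity_np(t, sizeof(big_cores), &big_cores);
    if (err != 0)
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));

    pthread_join(t, NULL);
    return 0;
}
```

(Compile with -pthread; the same thing can be done for a whole process with sched_setaffinity.)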

 
Joined
Jan 14, 2021
Messages
241 (0.21/day)
System Name Z590 Epiphenomenal Rocket Bench
Processor Intel 11600K i5 binned by Silicon Lottery at 5.0/5.1 - 5.5Ghz ST Benchmark capable - 5.3Ghz Daily OC
Motherboard Asus ROG Maximus XIII Apex
Cooling Noctua NH-P1 Passive/Active Cooler with Noctua 140mm Industrial PWM fan tuned at 750rpm
Memory F4-4800C17D-16GTRS @ 5066MHz 17 17 37 1T Daily / F4-5333C22D-16GTES @ 5600Mhz 20 32 52 1T Daily
Video Card(s) Nvidia RTX A2000 workstation card (Coming Soon)
Storage Intel Optane PCIe Add In Card 280GB SSD / WD_Black SN850 500GB PCIe 4.0 NVMe SSD / (2) 860 Pro SATA
Display(s) LG 32in 2560x1440 32QN650-B
Case Primochill WetBench SX Pro test bench
Audio Device(s) Creative Pebble V2
Power Supply Seasonic TX-700 fanless titanium
Mouse Logitech MX Master 3 graphite / ROG Gladius II Origin
Keyboard HHKB Hybrid Type-S black
Software Work and statistical apps
Benchmark Scores 5.5Ghz Single Thread Cinebench R15 - 268 CPUZ ST – 704
Wondering how large the Alder Lake DDR5 latency penalty will be?

RKL is about 10ns - I'll be very happy to get an RKL DDR4 AIDA 64 latency timing somewhere within the 40ns - 50ns range. :ohwell:

Right now, I'm really loving the 11600K (moving from an 8086K), just discovering its potential. The IPC increase from 8th gen to 11th gen is extremely apparent - so I can wait for Raptor Lake next year and give DDR5 some time to mature before buying.
 
Joined
Feb 21, 2006
Messages
1,970 (0.30/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Prime X570-Pro BIOS 5003 AM4 AGESA V2 PI 1.2.0.B
Cooling Corsair H150i Pro
Memory 32GB GSkill Trident RGB DDR4-3200 14-14-14-34-1T (B-Die)
Video Card(s) AMD Radeon RX 7900 XTX 24GB (24.3.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 14TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c
Wondering how large the Alder Lake DDR5 latency penalty will be?

RKL is about 10ns - I'll be very happy to get an RKL DDR4 AIDA 64 latency timing somewhere within the 40ns - 50ns range. :ohwell:

Think I'll wait for Raptor Lake next year and give DDR5 some time to mature, before buying.

I would be more concerned about actual performance than AIDA 64 latency numbers.

Comet Lake has better latency than Zen 3 in AIDA, yet it is slower. And as with all first-gen memory, it will most likely be slower than DDR4 at the start. The main point of DDR5 is to bring more bandwidth.
 
Joined
Jan 14, 2021
Messages
241 (0.21/day)
System Name Z590 Epiphenomenal Rocket Bench
Processor Intel 11600K i5 binned by Silicon Lottery at 5.0/5.1 - 5.5Ghz ST Benchmark capable - 5.3Ghz Daily OC
Motherboard Asus ROG Maximus XIII Apex
Cooling Noctua NH-P1 Passive/Active Cooler with Noctua 140mm Industrial PWM fan tuned at 750rpm
Memory F4-4800C17D-16GTRS @ 5066MHz 17 17 37 1T Daily / F4-5333C22D-16GTES @ 5600Mhz 20 32 52 1T Daily
Video Card(s) Nvidia RTX A2000 workstation card (Coming Soon)
Storage Intel Optane PCIe Add In Card 280GB SSD / WD_Black SN850 500GB PCIe 4.0 NVMe SSD / (2) 860 Pro SATA
Display(s) LG 32in 2560x1440 32QN650-B
Case Primochill WetBench SX Pro test bench
Audio Device(s) Creative Pebble V2
Power Supply Seasonic TX-700 fanless titanium
Mouse Logitech MX Master 3 graphite / ROG Gladius II Origin
Keyboard HHKB Hybrid Type-S black
Software Work and statistical apps
Benchmark Scores 5.5Ghz Single Thread Cinebench R15 - 268 CPUZ ST – 704
I would be more concerned about actual performance than AIDA 64 latency numbers.

Comet Lake has better latency than Zen 3 in AIDA, yet it is slower. And as with all first-gen memory, it will most likely be slower than DDR4 at the start. The main point of DDR5 is to bring more bandwidth.

Agree 100%.

I'm kind of a lover of low latency and track-racer responsiveness at low queue depth, hence the Optane SSD in my build, and my work apps never exceed 9 threads maximum (light load). :)

So 6 cores / 12 faster threads offer more for my workflow. "I have no problem buying an i5 when Intel doesn't bring their A-game"

Loving the 11600K Air-Cooled. :love:

------

AMD is doing such an amazing job with IPC, my hat is off to them, this is an amazing and exciting time for CPU development.

I'm all in for both camps red and blue! :)

[Attached images: IMG_2371.JPG, 5.3GHz 11600K.jpg]
 
Joined
Jun 10, 2014
Messages
2,889 (0.81/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I would be more concerned about actual performance than AIDA 64 latency numbers.
Yes, synthetics are for technical discussions.
Buying decisions on the other hand should be dictated by real world performance.

I haven't studied the differences in the signaling protocols between DDR4 and DDR5, and all the various latencies involved, but I believe it doubles the banks. So there might be access patterns which are faster and some that are slower. Time will tell.

I'm actually more concerned about price and availability. What will a lot of you do if DDR5 is scarce when Alder Lake ships?
 
Joined
Feb 21, 2006
Messages
1,970 (0.30/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Prime X570-Pro BIOS 5003 AM4 AGESA V2 PI 1.2.0.B
Cooling Corsair H150i Pro
Memory 32GB GSkill Trident RGB DDR4-3200 14-14-14-34-1T (B-Die)
Video Card(s) AMD Radeon RX 7900 XTX 24GB (24.3.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 14TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c
Yes, synthetics are for technical discussions.
Buying decisions on the other hand should be dictated by real world performance.

I haven't studied the differences in the signaling protocols between DDR4 and DDR5, and all the various latencies involved, but I believe it doubles the banks. So there might be access patterns which are faster and some that are slower. Time will tell.

I'm actually more concerned about price and availability. What will a lot of you do if DDR5 is scarce when Alder Lake ships?
It being scarce will not be a problem for me personally as I don't intend on upgrading my Rig any time soon. When that time arrives I will not be looking at all the first gen products. For someone looking to build a new rig in the coming months it will be a concern I guess.
 
Joined
May 3, 2018
Messages
2,232 (1.03/day)
I'm not sure anyone should be rushing out to get first-gen DDR5 if first-gen DDR4 was anything to go by. Those original DDR4 chips had pretty bad latencies, and it took a good year or so to start getting memory that was noticeably better than the best of the fastest DDR3. Also, there will most likely be a huge premium for early adopters.

I'm sure Alder Lake will be competitive, especially with its promised 100% multithreaded performance uplift (I presume compared to Skylake), and the Gracemont cores are pretty good according to Moore's Law is Dead, about two-thirds the performance of Skylake. The PL2 power state seems rather poor though; I was expecting much better, but let's wait and see.
 
Joined
Feb 1, 2019
Messages
2,515 (1.34/day)
Location
UK, Leicester
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 3080 RTX FE 10G
Storage 1TB 980 PRO (OS, games), 2TB SN850X (games), 2TB DC P4600 (work), 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Asus Xonar D2X
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
They can handle it, I just pointed out that the best they can do is prevent the small cores from tanking performance.


It doesn't take an army of engineers to know that there is no "correct" solution to this. And you're making a wrong assumption here: even if the engineers know better, the end product can still be a failure. I am sure the engineers knew how to build a better processor back in the day when they came up with NetBurst, but the end result was obviously terrible because the upper management wanted a marketable product with more GHz on the box than the competition. See, it's not that simple.

I feel like this is the exact same situation: I suspect that the engineers know that this architecture makes no sense on a desktop, but the management wants a marketable product with many cores because the competition is totally crushing them in that department.
I think you have nailed it; it seems a marketing solution, not an engineering one. Even on phones it doesn't really work well, but it's kind of just accepted, as there is a recognition of the constraints being worked within and the need to stretch out battery life. Usually on my phone I root it and adjust the scheduler to stop using the small cores.

I expect on the PC we will get people trying to disable the small cores as much as possible. It might even be affected by the power profile, so e.g. in the "high performance" profile nothing gets scheduled to the small cores.
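If it comes to that on Windows, the blunt manual equivalent already exists: restricting the process affinity mask. A minimal sketch in C against the Win32 API, assuming, purely for illustration, that logical processors 0-7 are the big cores:

```c
// Minimal sketch of the "just keep me off the small cores" workaround on
// Windows: confine the whole process to a core mask. The mask value
// (bits 0-7, i.e. a hypothetical eight big cores) is an assumption for
// illustration only; real code would query the topology first.
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD_PTR big_core_mask = 0xFF;  // assumption: logical processors 0-7 are the big cores

    if (!SetProcessAffinityMask(GetCurrentProcess(), big_core_mask)) {
        fprintf(stderr, "SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    puts("Process restricted to the first eight logical processors.");
    // ... run the actual workload here ...
    return 0;
}
```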
 
Joined
Sep 28, 2012
Messages
963 (0.23/day)
System Name Poor Man's PC
Processor AMD Ryzen 5 7500F
Motherboard MSI B650M Mortar WiFi
Cooling ID Cooling SE 206 XT
Memory 32GB GSkill Flare X5 DDR5 6000Mhz
Video Card(s) Sapphire Pulse RX 6800 XT
Storage XPG Gammix S70 Blade 2TB + 8 TB WD Ultrastar DC HC320
Display(s) Mi Gaming Curved 3440x1440 144Hz
Case Cougar MG120-G
Audio Device(s) MPow Air Wireless + Mi Soundbar
Power Supply Enermax Revolution DF 650W Gold
Mouse Logitech MX Anywhere 3
Keyboard Logitech Pro X + Kailh box heavy pale blue switch + Durock stabilizers
VR HMD Meta Quest 2
Benchmark Scores Who need bench when everything already fast?
Very doubtful. There is no need for a new version of Windows, as all existing modern versions of Windows have drivers for and know how to use both sets of CPU cores in question.

Given that Intel is closer to Microsoft than AMD is, and also considering that Windows' 5-year cycle is overdue, I still think it's possible. Microsoft also baked their own in-house chip based on ARM, and I don't think they're reckless enough to not provide an OS that natively supports it.
 
Joined
Jan 3, 2021
Messages
2,593 (2.20/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Microsoft also baked their own in-house chip based on ARM, and I don't think they're reckless enough to not provide an OS that natively supports it.
If they are at all clever, they're now using that sur-snap-face-dragon as a great learning tool, and they will figure out what (mostly) proper scheduling looks like by 2022.

By software(?) emulation I presume you mean that the CPU frontend will translate it into different instructions (hardware emulation), which is what modern x86 microarchitectures do already; all FPU, MMX, SSE instructions are converted to AVX. This is also how legacy instructions are implemented.

But there will be challenges when there isn't a binary compatible translation, e.g. FMA operations. Doing these separately will result in rounding errors. There are also various other shuffling etc. operations in AVX which will require a lot of instructions to achieve.
I mean emulation in software. The basic mechanism exists in all modern CPUs: if the decoder encounters an unknown instruction, an interrupt is triggered and the interrupt handler can do calculations instead of that instruction. Obviously, AVX-512 registers have to be replaced by data structures in memory. That's utterly inefficient but would prevent the thread and the process from dying.
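On the FMA point in the quote above, here's a tiny demonstration of why emulating a fused multiply-add with two separate instructions changes the result; plain ISO C, nothing else assumed (compile with -ffp-contract=off and link with -lm so the compiler doesn't fuse a*b+c itself):

```c
// fma() rounds once; a*b + c rounds after the multiply and again after
// the add, so the tiny term survives only in the fused version.
#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    double a = 1.0 + DBL_EPSILON;   // exact: 1 + 2^-52
    double b = 1.0 - DBL_EPSILON;   // exact: 1 - 2^-52
    double c = -1.0;

    double fused    = fma(a, b, c); // exact result: -2^-104
    double separate = a * b + c;    // a*b rounds to 1.0, so this gives 0

    printf("fma(a, b, c) = %.17g\n", fused);
    printf("a*b + c      = %.17g\n", separate);
    return 0;
}
```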
In such cases I do wonder if the CPU will just freeze the thread and ask the scheduler to move it, because this detection has to happen on the hardware side.
Is that possible in today's CPUs?
One additional aspect to consider, is that Linux distributions are moving to shipping versions where the entire software repositories are compiled with e.g. AVX2 optimizations, so virtually nothing can use the weak cores, so clearly Intel made a really foolish move here.
Small cores are supposed to have AVX2 (but it's still a guess).
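As an aside, the way software usually copes with uneven ISA support today is per-process runtime dispatch rather than trusting the scheduler. A minimal sketch, assuming GCC or Clang on x86; the function split is purely illustrative, and note the whole approach assumes every core reports the same feature set:

```c
// Minimal sketch of runtime ISA dispatch. A real library would resolve
// this once at startup behind a function pointer instead of branching
// on every call.
#include <stddef.h>

__attribute__((target("avx2")))
static void add_avx2(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)   // compiled with AVX2 enabled, may auto-vectorize
        dst[i] = a[i] + b[i];
}

static void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)   // baseline path for CPUs without AVX2
        dst[i] = a[i] + b[i];
}

void add(float *dst, const float *a, const float *b, size_t n)
{
    if (__builtin_cpu_supports("avx2"))
        add_avx2(dst, a, b, n);
    else
        add_scalar(dst, a, b, n);
}
```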
Windows and the default Linux kernel have very little x86 specific code, and even less specific to particular microarchitectures. While you certainly can compile your own Linux kernel with a different scheduler, compile time arguments and CPU optimizations, this is something you have to do yourself and keep repeating every time you want kernel patches.

So with a few exceptions, the OS schedulers are running mostly generic code.
They do however as the dragon tamer said in your link, do a lot of heuristics and adjustments in runtime, including moving threads around for distributing heat. Whether these algorithms are "optimal" or not depends on the workload.
I certainly don't understand much of the description of Linux CFS (this one or others), but it seems to be pretty much configurable, with all those "domains" and "groups" and stuff. The code itself can be universal but can still account for specifics by means of parameters, like those that can be obtained from cpuinfo.
We'll see if this changes when Intel and AMD release hybrid designs; you better prepare for a bumpy ride.
Yes sir. I just believe there will be a smooth ride after a bumpy period. (And later, some turbulent flight when MS inevitably issues a Windows update that's been thoroughly alpha-tested.)

I think you have nailed it; it seems a marketing solution, not an engineering one
The engineering (and business) decision that we see in every generation of CPUs is to have as few variants of silicon as possible. It looks like design, validation, photomask production and such things, those that need to be done repeatedly for each variant and each stepping, are horribly expensive. So Intel may decide to bake only two variants, for example 8 big + 8 small + IGP and 4 big + 8 small + IGP, and break these two down into a hundred different laptop and desktop chips.
I expect on the PC we will get people trying to disable the small cores as much as possible. It might even be affected by the power profile, so e.g. in the "high performance" profile nothing gets scheduled to the small cores.
Who knows, we may even get a BIOS option to disable them.
 
Joined
Jun 10, 2014
Messages
2,889 (0.81/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I mean emulation in software. The basic mechanism exists in all modern CPUs: if the decoder encounters an unknown instruction, an interrupt is triggered and the interrupt handler can do calculations instead of that instruction.
I'm skeptical about the feasibility of this.

Obviously, AVX-512 registers have to be replaced by data structures in memory. That's utterly inefficient but would prevent the thread and the process from dying.
Actually, that's the one part that's trivial.
You shouldn't need to change the memory at all. AVX data is just packed floats or ints, so if you can split the AVX operation up into e.g. 16 individual ADD/SUB/MUL/DIV operations, you can just use a pointer with an offset.
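To make the "pointer with an offset" point concrete, a hypothetical sketch: a 512-bit packed-float add done as 16 scalar operations over the same 64 bytes of memory, no data conversion needed:

```c
// Sketch of emulating a hypothetical 512-bit packed-float add with
// scalar operations; the data stays where it is and is simply walked
// with a pointer and an offset.
#include <stddef.h>

// dst, a and b each point at 64 bytes (16 floats) of packed data.
static void emulated_vaddps_512(float *dst, const float *a, const float *b)
{
    for (size_t i = 0; i < 16; i++)   // one scalar ADD per lane
        dst[i] = a[i] + b[i];
}
```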

The real challenge with your approach is to inject the replacement code. Machine code works with pointer addresses, so if you add more instructions in the middle all addresses would have to be offset. Plus there could be side effects from the usage of registers in the injected code. So I'm not convinced about your approach.

Is that possible in today's CPUs?
Perhaps. Currently, if a CPU encounters an invalid opcode, the thread is normally terminated. I haven't studied what happens at the low level, or whether it's possible for the OS to move the thread before the cleanup.
 
Joined
Jan 3, 2021
Messages
2,593 (2.20/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
I'm skeptical about the feasibility of this.

Actually, that's the one part that's trivial.
You shouldn't need to change the memory at all. AVX data is just packed floats or ints, so if you can split the AVX operation up into e.g. 16 individual ADD/SUB/MUL/DIV operations, you can just use a pointer with an offset.

The real challenge with your approach is to inject the replacement code. Machine code works with pointer addresses, so if you add more instructions in the middle all addresses would have to be offset. Plus there could be side effects from the usage of registers in the injected code. So I'm not convinced about your approach.
It has nothing to do with translation or replacement of code. An invalid opcode triggers an exception #6: Invalid opcode, and an exception handler is then run (in the OS kernel, I presume). The exception handler saves the registers, then it reads the offending instruction and its parameters. If it recognizes an AVX-512 instruction, it performs the operations that this instruction should perform. It doesn't operate on AVX registers because there are none but rather on 32 x 512 bits of data stored in memory, which is not part of the user process space. The exception handler then restores the values of registers and returns control to the user process, which then continues instead of being terminated.
I can find very few resources on that (Anand forums, MIT courses) ... it doesn't seem to be very common.
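For what it's worth, the user-space analogue of that mechanism can be demonstrated today with a SIGILL handler. A minimal sketch, assuming x86-64 Linux with glibc; a real #UD handler would live in the kernel and decode and emulate the actual instruction rather than hard-coding a result:

```c
// Trap-and-resume demo: execute an instruction the CPU refuses (UD2),
// catch the resulting SIGILL, supply a result, skip the instruction,
// and let the thread continue instead of dying.
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <ucontext.h>

static void ud_handler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = (ucontext_t *)ctx;
    (void)sig; (void)info;
    // "Emulate" the unsupported instruction: write a result into RAX.
    uc->uc_mcontext.gregs[REG_RAX] = 42;
    // Skip the 2-byte UD2 opcode so execution resumes after it.
    uc->uc_mcontext.gregs[REG_RIP] += 2;
}

int main(void)
{
    struct sigaction sa = { .sa_sigaction = ud_handler,
                            .sa_flags = SA_SIGINFO };
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, NULL);

    long result = 0;
    // UD2 always raises #UD; the kernel delivers it to us as SIGILL.
    __asm__ volatile ("ud2" : "=a"(result));
    printf("survived the invalid opcode; the handler supplied %ld\n", result);
    return 0;
}
```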
 
Joined
Jan 14, 2019
Messages
9,725 (5.12/day)
Location
Midlands, UK
System Name Nebulon-B Mk. 4
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance EXPO DDR5-6000
Video Card(s) Sapphire Pulse Radeon RX 7800 XT
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2, 4 + 8 TB Seagate Barracuda 3.5"
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Windows 10 Pro
Benchmark Scores Cinebench R23 single-core: 1,800, multi-core: 18,000. Superposition 1080p Extreme: 9,900.
I think you have nailed it; it seems a marketing solution, not an engineering one. Even on phones it doesn't really work well, but it's kind of just accepted, as there is a recognition of the constraints being worked within and the need to stretch out battery life. Usually on my phone I root it and adjust the scheduler to stop using the small cores.

I expect on the PC we will get people trying to disable the small cores as much as possible. It might even be affected by the power profile, so e.g. in the "high performance" profile nothing gets scheduled to the small cores.
To be honest, I don't really need large cores in my phone as I'm only using it to check my messages every now and then. I'm really not into the modern smartphone gaming / social media culture.

Similarly, I don't need small cores in my PC. I only need large cores with decent power management to keep temperatures in check.

Big.LITTLE is a waste of die area in all platforms in my opinion.
 