Wednesday, May 5th 2021

Intel Core-1800 Alder Lake Engineering Sample Spotted with 16C/24T Configuration

Intel's upcoming Alder Lake generation of processors is going to be the first iteration of a heterogeneous x86 architecture. That means that Intel will, for the first time, combine smaller low-power cores with big high-performance cores to cover all kinds of workloads. If a task doesn't need much power, such as a background task, the smaller cores are used. And if you need to render something or want to fire up a game, the big cores provide the power needed for the task. Intel has decided to build this architecture on the advanced 10 nm SuperFin node, which represents a major upgrade over the existing 14 nm process.

Today, we got some information from Igor's Lab showing the leaked specifications of an Intel Core-1800 processor engineering sample. While this may not be the final name, the leaked information shows that the processor is a B0 stepping, meaning the CPU will see more changes before the final silicon arrives. The CPU has 16 cores and 24 threads: eight of those cores are big ones with Hyper-Threading, while the remaining eight are smaller Atom cores. They run at a base clock of 1800 MHz, while boost speeds are 4.6 GHz on two cores, 4.4 GHz on four cores, and 4.2 GHz on six cores. When all cores are used, the boost speed is locked at 4.0 GHz. The CPU has a PL1 TDP of 125 W, while the PL2 configuration raises that to 228 W. The CPU was reportedly running at 1.3147 V during the test. You can check out the complete datasheet below.
Sources: Igor's LAB, via VideoCardz
Add your own comment

46 Comments on Intel Core-1800 Alder Lake Engineering Sample Spotted with 16C/24T Configuration

#1
Melvis
Welcome to 2019 Intel!

Atom cores? hmmmm Not sure How I feel about that...
Posted on Reply
#2
tabascosauz
Melvis
Welcome to 2019 Intel!

Atom cores? hmmmm Not sure How I feel about that...
I do remember reading somewhere that in non-AVX tasks the new Atoms should be approaching Skylake IPC, which really isn't bad at all. Obviously, if you need the throughput of a 5950X, Intel still won't be competitive (and I don't think they're aiming to be).

Bigger issue here is that we've only had one prior preview of this big-little setup in Lakefield and it was pathetic, because Intel treated the big core as an addon to the Atoms. Say what you will about the exponential difference in TDP between Lakefield and Alder Lake etc. but regardless, Alder Lake both needs to be much more appreciative of its Golden Cove cores, and the Windows scheduler also needs to accommodate that and avoid a repeat of Lakefield.

Even though it doesn't do big-little, Ryzen has also taken on some of these ideas as its scheduler has improved over time. Dual CCD Ryzen 5000 segregates the two groups of cores pretty clearly in practice. Even though CCD1 always has the 2 CPPC priority cores, the Ryzens seem to offload a surprising amount of light background processing to one or two preferred CCD2 cores when lightly loaded.
Posted on Reply
#4
Vya Domus
tabascosauz
I do remember reading somewhere that in non-AVX tasks the new Atoms should be approaching Skylake IPC, which really isn't bad at all. Obviously, if you need the throughput of a 5950X, Intel still won't be competitive (and I don't think they're aiming to be).

Bigger issue here is that we've only had one prior preview of this big-little setup in Lakefield and it was pathetic, because Intel treated the big core as an addon to the Atoms. Say what you will about the exponential difference in TDP between Lakefield and Alder Lake etc. but regardless, Alder Lake both needs to be much more appreciative of its Golden Cove cores, and the Windows scheduler also needs to accommodate that and avoid a repeat of Lakefield.

Even though it doesn't do big-little, Ryzen has also taken on some of these ideas as its scheduler has improved over time. Dual CCD Ryzen 5000 honestly doesn't behave too differently outside of heavy all-core. Even though CCD1 always has the 2 CPPC priority cores, the Ryzens already significantly discriminate between the 2 CCDs and seem to offload a surprising amount of light background processing to one or two preferred CCD2 cores when lightly loaded.
Even if they have similar IPC to Skylake, they're gonna run at lower clocks and still be pretty slow. No matter how much you mess with the scheduler the small cores will range from worthless (the scheduler never prioritizes them) to detrimental (the scheduler places the wrong threads on them).

big.LITTLE can only work effectively in low power mobile devices where you're fine with things running sub optimally when the device idles or stuff like that. On a desktop you typically want high performance all the time.

Having stuff like maybe the browser running on the low-power cores sounds good, but it almost never works like it should, because how do you know that? You can do stuff like maybe targeting code that only contains 32-bit instructions at the small cores and code that contains SIMD at the big cores, but it's complicated and it's not gonna work most of the time, because applications mix and match different kinds of workloads.
Posted on Reply
#5
lexluthermiester
Intel might be on to something with this. 8 Atom cores would provide solid performance for general computing tasks, and the 8 Core-i cores would kick in for more demanding tasks like gaming, audio/video work and other such demanding workloads. I would love to see @W1zzard's take and review of this...
Melvis
Atom cores? hmmmm Not sure How I feel about that...
It'll be interesting to see what the actual performance is.
Posted on Reply
#6
napata
Vya Domus
Even if they have similar IPC to Skylake, they're gonna run at lower clocks and still be pretty slow. No matter how much you mess with the scheduler the small cores will range from worthless (the scheduler never prioritizes them) to detrimental (the scheduler places the wrong threads on them).

big.LITTLE can only work effectively in low power mobile devices where you're fine with things running sub optimally when the device idles or stuff like that. On a desktop you typically want high performance all the time.

Having stuff like maybe the browser running on the low-power cores sounds good, but it almost never works like it should, because how do you know that? You can do stuff like maybe targeting code that only contains 32-bit instructions at the small cores and code that contains SIMD at the big cores, but it's complicated and it's not gonna work most of the time, because applications mix and match different kinds of workloads.
I assume they'll be able to handle these problems when it comes to scheduling. I mean I doubt you somehow know better than the engineers at Intel & AMD?
Posted on Reply
#7
tabascosauz
napata
I assume they'll be able to handle these problems when it comes to scheduling. I mean I doubt you somehow know better than the engineers at Intel & AMD?
We also "assumed" in 2019 that the Windows scheduler would be properly set up to handle Matisse, which was a novel product at the time. Look how Q2-Q4 2019 went for us.

The scheduler needs time, and the purported release schedule for Alder Lake doesn't allow for a lot of that. Intel is better positioned than AMD to have MS listen to its needs, but the Rocket Lake launch clearly proved that Intel isn't always 100% prepared on microcode either.

Even though Lakefield might technically have served as Intel's first foray into hybrid arch, if Intel simply follows Lakefield's philosophy then like I said Alder Lake is going to be one hell of a ride.
Posted on Reply
#8
Zyll Goliath
lexluthermiester
Intel might be on to something with this. 8 Atom cores would provide solid performance for general computing tasks, and the 8 Core-i cores would kick in for more demanding tasks like gaming, audio/video work and other such demanding workloads. I would love to see @W1zzard's take and review of this...


It'll be interesting to see what the actual performance is.
I agree... this actually may not be a bad idea at all, if they figure out how to share the load properly between those different cores.....
Posted on Reply
#9
Vya Domus
napata
I assume they'll be able to handle these problems when it comes to scheduling.
They can handle it, I just pointed out that the best they can do is prevent the small cores from tanking performance.
napata
I mean I doubt you somehow know better than the engineers at Intel & AMD?
It doesn't take an army of engineers to know that there is no "correct" solution to this. And you're making a wrong assumption here: even if the engineers know better, the end product can still be a failure. I am sure the engineers knew how to build a better processor back in the day when they came up with NetBurst, but the end result was obviously terrible because upper management wanted a marketable product with more GHz on the box than the competition. See, it's not that simple.

I feel like this is the exact same situation, I suspect that the engineers know that this architecture makes no sense on a desktop but the management wants a marketable product with many cores because the competition is totally crushing them in that department.
Posted on Reply
#10
ZoneDymo
is this not old news? We already knew it would max out at 8 (big) cores with hyper-threading and then 8 (little) cores, making a total of 24 threads.

I just wonder how CPU-Z and Task Manager will report on it; really, Task Manager should separate it into 3 sets of graphs for each.
Posted on Reply
#11
Zyll Goliath
ZoneDymo
is this not old news? We already knew it would max out at 8 (big) cores with hyper-threading and then 8 (little) cores, making a total of 24 threads.

I just wonder how CPU-Z and Task Manager will report on it; really, Task Manager should separate it into 3 sets of graphs for each.
I guess we knew they were working on it, but now we know that the work is complete..........
P.S. It will be reported as a Freak :D
Posted on Reply
#12
LemmingOverlord
So does the ID "Alder Lake-S 881" mean 8 big, 8 little, 1 GPU?
Posted on Reply
#13
efikkan
Vya Domus
Even if they have similar IPC to Skylake, they're gonna run at lower clocks and still be pretty slow. No matter how much you mess with the scheduler the small cores will range from worthless (the scheduler never prioritizes them) to detrimental (the scheduler places the wrong threads on them).

big.LITTLE can only work effectively in low power mobile devices where you're fine with things running sub optimally when the device idles or stuff like that. On a desktop you typically want high performance all the time.
Unless Windows (and Linux) chooses to delegate only background services to the slow cores, there are bound to be a lot of cases of sub-optimal performance. And while synthetic workloads are likely to look impressive, there could be a lot of edge cases causing both lag and sustained poor performance.

Besides the point that I think hybrid cores are a nonsense feature on desktops, I think Intel made a huge mistake by having different ISA support across the cores. It would be much better if the slow cores had all the same features, including AVX-512, but implemented them using more clock cycles if needed. Then the scheduler wouldn't have to worry, and could use only heuristics to move threads around. I haven't yet found details on how schedulers will determine whether an application will use AVX or not.
Vya Domus
Having stuff like maybe the browser running on the low-power cores sounds good, but it almost never works like it should, because how do you know that? You can do stuff like maybe targeting code that only contains 32-bit instructions at the small cores and code that contains SIMD at the big cores, but it's complicated and it's not gonna work most of the time, because applications mix and match different kinds of workloads.
Pretty much any user application would have to run on the fast cores, especially the web browser. Otherwise the user experience will be painfully slow.
napata
I assume they'll be able to handle these problems when it comes to scheduling. I mean I doubt you somehow know better than the engineers at Intel & AMD?
The engineers at Intel & AMD don't write the scheduler of Windows.
Also remember that this scheduler has to be flexible enough to work with all the supported microarchitectures.
Posted on Reply
#14
Wirko
efikkan
Besides the point that I think hybrid cores are a nonsense feature on desktops
It's obvious that Alder Lake is designed for notebooks first and foremost, and desktop chips come as an afterthought. I expect desktop CPUs to come with fewer small cores, or none.
efikkan
I think Intel made a huge mistake by having different ISA support on the cores. It would be much better if the slow had all the same features, including AVX-512, but implemented them using more clock cycles if needed. Then the scheduler wouldn't have to worry, and only use heuristics to move around threads. I haven't yet found details on how schedulers will determine whether an application will use AVX or not.
A core can resort to software emulation when it encounters an unknown instruction. So a thread that runs on a small core and uses AVX doesn't need to crash; but the scheduler needs to detect such a situation quickly and move the thread to a large core.
efikkan
Pretty much any user application would have to run on the fast cores, especially the web browser. Otherwise the user experience will be painfully slow.
Browsers are multithreaded too. Of course, someone (developers? compiler? scheduler?) needs to know which tasks may be offloaded to slower cores.
efikkan
The engineers at Intel & AMD don't write the scheduler of Windows.
I'm wondering to what extent this is true. Intel, AMD and Arm engineers certainly have a lot of input in the development process. As Intel is a major contributor to the Linux project, I'm sure they are busy tweaking its scheduler all the time, too.
efikkan
Also remember that this scheduler has to be flexible enough to work with all the supported microarchitectures.
Absolutely, but it also needs to be aware of the peculiarities of each architecture. It also needs to be very fast, which limits its flexibility and the ability to gather runtime statistics, do complex calculations and adapt. We had a similar discussion about Zen 5 recently, here.
Posted on Reply
#15
docnorth
LemmingOverlord
So does the ID "Alder Lake-S 881" mean 8 big, 8 little, 1 GPU?
Makes sense.
Posted on Reply
#16
Turmania
I'm more focused on the power consumption numbers. A 200 W CPU is not a winner in any case.
Posted on Reply
#17
AusWolf
"Intel has decided to provide such an architecture on the advanced 10 nm SuperFin, which represents a major upgrade over the existing 14 nm process."

125 W PL1, while PL2 is up in the sky again. If it was really such a major upgrade, we'd see way over 5 GHz turbo speeds with this power consumption... unless IPC has drastically increased over Rocket Lake, which I highly doubt. Even in that case, I'd much rather see more modest clock speeds with more realistic power targets (and no PL2 - nobody plays games or does any work in 56-second intervals). So far, I'm not impressed.
Posted on Reply
#18
Hardware Geek
I'm reserving judgement on these until they actually release. In theory, it sounds like a good idea, at least when it comes to mobile processors, but it really depends on how the operating system uses the cores.
Posted on Reply
#19
AusWolf
efikkan
Unless Windows (and Linux) chooses to delegate only background services to the slow cores, there are bound to be a lot of cases of sub-optimal performance. And while synthetic workloads are likely to look impressive, there could be a lot of edge cases causing both lag and sustained poor performance.
Having 2-4 small cores dedicated to background tasks might make sense, but this thing has 8 of them. They must be of some use to the end user too, otherwise they're just a waste of die area (which I think they are anyway, but we'll see).
Posted on Reply
#20
DeathtoGnomes
efikkan
Unless Windows (and Linux) chooses to delegate only background services to the slow cores, there are bound to be a lot of cases of sub-optimal performance. And while synthetic workloads are likely to look impressive, there could be a lot of edge cases causing both lag and sustained poor performance.
I highly doubt Windows will be up to the task at first. I fear M$/Windows will come out with profiles to separate Intel and AMD chips/chipsets, and that the adjustments/allocations meant for this glued-together chip only will bleed over into the AMD side of things and end up tanking its performance. There is no cure for incompetence; it took the Windows team how long? to address the performance of Threadripper's chip management.
Posted on Reply
#21
1d10t
So basically this is a 9900K with the addition of eight N6005 cores.
Barely know anything yet, but latency could be atrocious unless Intel pushes Microsoft to do its bidding and make the scheduler aware of the smaller cores. On top of that, read-ahead on this version of Windows is abysmal at worst, so they might overhaul Windows entirely or just launch a new version of it.

-= edited =-
That 10 nm SuperFin looks a lot less appealing when compared to a regular 12c/24t or 16c/32t at 14 nm.

Posted on Reply
#22
TheinsanegamerN
Having small cores for non-demanding tasks like video consumption or web browsing is a great idea for devices sensitive to power draw, like phones and tablets.

But with Windows machines, you're talking about a fraction of the power they use. Even with the Atom cores, these Alder Lake chips are going to suck down power on the desktop, and in mobile AMD's arch is still better, and Zen 4 is going to pull further ahead.

This is a solution to a problem nobody wanted solved. If they could produce Atom cores with Skylake performance, they could have just made Atom processors that performed as well as regular Coffee Lake CPUs without the huge power draw. The fact they didn't makes me very suspicious of how capable these cores really are.
Posted on Reply
#23
R0H1T
Wirko
It's obvious that Alder Lake is designed for notebooks first and foremost, and desktop chips come as an afterthought. I expect desktop CPUs to come with fewer small cores, or none.


A core can resort to software emulation when it encounters an unknown instruction. So a thread that runs on a small core and uses AVX doesn't need to crash; but the scheduler needs to detect such a situation quickly and move the thread to a large core.


Browsers are multithreaded too. Of course, someone (developers? compiler? scheduler?) needs to know which tasks may be offloaded to slower cores.


I'm wondering to what extent this is true. Intel, AMD and Arm engineers certainly have a lot of input in the development process. As Intel is a major contributor to the Linux project, I'm sure they are busy tweaking its scheduler all the time, too.


Absolutely, but it also needs to be aware of the peculiarities of each architecture. It also needs to be very fast, which limits its flexibility and the ability to gather runtime statistics, do complex calculations and adapt. We had a similar discussion about Zen 5 recently, here.
You're just speculating; it was designed for "more" efficiency, but that's about it.

I'm thinking the opposite: 16c/24t is an actual selling point vs the 5900X or 5950X for Intel ~ the chance of this SKU (125 W TDP) making it to notebooks is close to zero!

Why would a thread running on a small core even need to run AVX, if the core doesn't support it?
Posted on Reply
#24
efikkan
Wirko
It's obvious that Alder Lake is designed for notebooks first and foremost, and desktops chips come as an afterthought. I expect desktop CPUs to come with fewer small cores, or none.
Yes. I would probably suggest that most desktop users buy the ones with 6 or 8 fast cores only, unless they want to be beta testers, of course.
I might have to get one of the hybrid ones myself, though, but only because I might have to verify some software.
Wirko
A core can resort to software emulation when it encounters an unknown instruction. So a thread that runs on a small core and uses AVX doesn't need to crash; but the scheduler needs to detect such a situation quickly and move the thread to a large core.
By software(?) emulation I presume you mean that the CPU front end will translate it into different instructions (hardware emulation), which is what modern x86 microarchitectures do already; all FPU, MMX and SSE instructions are converted to AVX. This is also how legacy instructions are implemented.

But there will be challenges when there isn't a binary-compatible translation, e.g. FMA operations. Doing these separately will result in rounding errors. There are also various shuffling etc. operations in AVX which would require a lot of instructions to achieve. In such cases I do wonder if the CPU will just freeze the thread and ask the scheduler to move it, because this detection has to happen on the hardware side.

One additional aspect to consider is that Linux distributions are moving towards shipping versions where the entire software repositories are compiled with e.g. AVX2 optimizations, so virtually nothing could use the weak cores; clearly Intel made a really foolish move here.
Wirko
Browsers are multithreaded too. Of course, someone (developers? compiler? scheduler?) needs to know which tasks may be offloaded to slower cores.
Developers have fairly little control over this in normal Windows or Linux environments. They can control how many threads are spawned, set attributes like affinity and priority, and of course detect CPU features such as core count, SMT, etc. at runtime. But the actual management of this is done by the OS scheduler.

Browsers like Chrome already spawn an incredible number of threads. Normally, very few of these have any major load, but the browser may still need to synchronize them, so this can be an issue if important low-load threads end up on cores which are slow to respond. I know Chrome gets slow due to high thread count long before high CPU load when many tabs are open.
Wirko
I'm wondering to what extent this is true. Intel, AMD and Arm engineers certainly have a lot of input in the development process. As Intel is a major contributor to the Linux project, I'm sure they are busy tweaking its scheduler all the time, too.

Absolutely, but it also needs to be aware of the peculiarities of each architecture. It also needs to be very fast, which limits its flexibility and the ability to gather runtime statistics, do complex calculations and adapt. We had a similar discussion about Zen 5 recently, here.
Windows and the default Linux kernel have very little x86-specific code, and even less specific to particular microarchitectures. While you certainly can compile your own Linux kernel with a different scheduler, compile-time arguments and CPU optimizations, this is something you have to do yourself and keep repeating every time you want kernel patches.

So with a few exceptions, the OS schedulers are running mostly generic code.
They do, however, as the dragon tamer said in your link, apply a lot of heuristics and adjustments at runtime, including moving threads around to distribute heat. Whether these algorithms are "optimal" or not depends on the workload. We'll see if this changes when Intel and AMD release hybrid designs; you'd better prepare for a bumpy ride.
Posted on Reply
#25
voltage
The biggest mistake Intel can make with this release is not launching Alder Lake with DDR5 from the start. I am curious to see if they do or not. If they do get "gutsy" and release with DDR5 from the start, I'm in; otherwise, no thanks.
LemmingOverlord
So does the ID "Alder Lake-S 881" mean 8 big, 8 little, 1 GPU?
yes.
Posted on Reply