Monday, September 11th 2023

Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor

The 4 nm "Phoenix 2" monolithic APU silicon powering the lower end of AMD's Ryzen 7040-series mobile processors could very well be the company's first hybrid-core processor, even though the company doesn't advertise it as such. We first caught wind of "Phoenix 2" back in July, when it was described as a physically smaller chip than the regular "Phoenix." It was known to have just 6 CPU cores and a smaller iGPU with 4 RDNA3 compute units, compared to the 8 CPU cores and 12 compute units of the "Phoenix" silicon. At the time, the loss of 2 CPU cores and 8 CUs was thought to explain the significant reduction in die size from 178 mm² to 137 mm², but it turns out that there's a lot more to "Phoenix 2."
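To put the die-size figures above in proportion, here is a minimal arithmetic sketch. The two areas are the ones quoted in the article; the function name is made up for illustration.

```python
# Die areas quoted in the article, in mm^2.
PHOENIX_MM2 = 178
PHOENIX2_MM2 = 137


def area_reduction(old_mm2, new_mm2):
    """Return (absolute saving in mm^2, saving as a percentage of the old area)."""
    saved = old_mm2 - new_mm2
    return saved, 100 * saved / old_mm2


saved, pct = area_reduction(PHOENIX_MM2, PHOENIX2_MM2)
print(f"{saved} mm^2 smaller, about {pct:.0f}% less area")
```

That works out to 41 mm², or roughly 23% of the original die.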

A die shot of "Phoenix 2" emerged on Chinese social media platform QQ, and it reveals two distinct kinds of CPU cores. There are six cores in all, but two of them appear larger than the other four. The obvious inference is that the larger cores are "Zen 4," and the smaller ones are the compacted "Zen 4c." The "Zen 4c" core has the same core machinery as "Zen 4," but re-arranged to take up less die area. The trade-off is that the "Zen 4c" core operates at lower voltages and lower clock speeds than the regular "Zen 4" core. At the same clock speed, both kinds of cores have identical IPC. The two also have an identical ISA, so software threads migrating between the cores will not encounter runtime errors. Unlike Intel with its Thread Director, AMD can use a less sophisticated software-based solution to ensure that the right kind of workload is allocated to the right kind of core, and to prevent undesirable migration between the two kinds of cores. Unlike the hardware-based Thread Director, AMD's solution can be continually updated.
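As a rough illustration of what a software-based placement policy could look like, here is a minimal sketch. It is not AMD's actual mechanism, which the article does not describe. The 2 + 4 core layout comes from the die shot; the `Core` type, the `pick_core` function, and the "latency-sensitive" flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    kind: str  # "zen4" (larger core) or "zen4c" (compact core)


# Layout suggested by the die shot: two larger cores, four compact ones.
# The core numbering is an assumption made for this sketch.
PHOENIX2 = [Core(0, "zen4"), Core(1, "zen4")] + [Core(i, "zen4c") for i in range(2, 6)]


def pick_core(cores, busy_ids, latency_sensitive):
    """Pick an idle core, preferring the kind that suits the workload.

    Latency-sensitive threads prefer "zen4"; background threads prefer
    "zen4c". If the preferred kind is fully busy, fall back to the other
    kind. Returns None when every core is busy.
    """
    preferred = "zen4" if latency_sensitive else "zen4c"
    idle = [c for c in cores if c.core_id not in busy_ids]
    if not idle:
        return None
    # False sorts before True, so cores of the preferred kind come first.
    idle.sort(key=lambda c: (c.kind != preferred, c.core_id))
    return idle[0].core_id
```

For example, `pick_core(PHOENIX2, {0, 1}, latency_sensitive=True)` falls back to a compact core because both larger cores are taken. Because both kinds share an ISA, such a fallback costs clock speed but cannot break the program.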
Sources: HXL (Twitter), VideoCardz

62 Comments on Die-shot Suggests "Phoenix 2" is AMD's First Hybrid Processor

#51
AnotherReader
unwind-protect said: Well, this will still require the OS scheduler to be aware of which core is potentially fast and which one has a lower maximum speed limit. Imagine that a heavily multicore load suddenly morphs into a single thread that was slacking off. You want that thread on one of the potentially high clocked cores, but it might sit on a low core right now. So you have to make a decision whether to move it, which is a very difficult decision to make for a scheduler since it doesn't know how much longer this particular situation will last.

In other words, some losses from sub-optimal scheduling are unavoidable if you have faster and slower cores.

BTW, this concept is much older than E-cores on 12th gen Intel. They had Xeon chips with cores with different clock limits way before.
Yes, this disparity is nothing new and existed before E-cores as well. Scheduling is an NP-hard problem even in the case of identical parallel processors, so you'll never get an optimal scheduler.
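To make the point about non-optimal scheduling concrete, here is a minimal sketch of a common greedy heuristic for identical processors, longest-processing-time-first (LPT). It is a textbook approximation and not what any particular OS scheduler does; the function name and job durations are illustrative.

```python
import heapq


def lpt_schedule(durations, n_cores):
    """Greedy LPT: give each job, longest first, to the least-loaded core.

    Returns (makespan, assignment), where assignment maps a core index
    to the list of job durations placed on it.
    """
    # Min-heap of (current load, core index); ties go to the lower index.
    loads = [(0, core) for core in range(n_cores)]
    heapq.heapify(loads)
    assignment = {core: [] for core in range(n_cores)}
    for job in sorted(durations, reverse=True):
        load, core = heapq.heappop(loads)
        assignment[core].append(job)
        heapq.heappush(loads, (load + job, core))
    makespan = max(load for load, _ in loads)
    return makespan, assignment


makespan, assignment = lpt_schedule([3, 3, 2, 2, 2], n_cores=2)
print(makespan, assignment)
```

With jobs of length 3, 3, 2, 2, 2 on two cores, the greedy result finishes at time 7, while splitting them as 3+3 and 2+2+2 would finish at 6. Fast heuristics settle for "close enough," and that is before core types with different speeds enter the picture.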
Posted on Reply
#52
lexluthermiester
AnotherReader said: I think there's much ado about nothing here.
YES! Thank You.
fevgatos said: So if a thread is sent to the 4c core, performance will suffer, just like with ecores. Yes?
Depends on the thread, its function, and whether or not the clock speed will make a difference. Core scheduling is defined, but still complicated. If the OS kernel shifts a thread from one core to a slower core, it's either because that thread has a lower priority or is under-utilizing the core it's running on. A thread being shifted to a slower/lower-tier core can also be the OS prioritizing and optimizing in real time.

For Intel's Big/Little design, this can and frequently does result in degraded performance for the thread shifted to an E-core, because the "little" cores are of a different (less efficient) design and thus have much lower IPC. With the AMD version of it in this example, the same dynamic doesn't exist because the "little" core is functionally identical to the "big" core, just clocked slower, which means the IPC is the same but the clock speed is lower.

Put another way, Intel's Big/Little design results in significant degradation of thread performance due to the differences not only in clock speed but in core instruction execution capabilities. AMD's Big/Little seems a much better way of doing it, as the difference is in clock speed alone.
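A minimal sketch of the arithmetic behind that argument, treating single-thread throughput as roughly IPC × clock speed. Every number below is a made-up placeholder chosen only to show the shape of the comparison; none is a measured figure for any AMD or Intel core.

```python
def relative_throughput(ipc, clock_ghz, ref_ipc, ref_clock_ghz):
    """Throughput relative to a reference core, modeled as IPC x clock."""
    return (ipc * clock_ghz) / (ref_ipc * ref_clock_ghz)


# Hypothetical reference "big" core: normalized IPC of 1.0 at 5.0 GHz.
BIG_IPC, BIG_CLOCK = 1.0, 5.0

# Same design, lower clock (the situation described for the compact core).
same_ipc_slower = relative_throughput(1.0, 3.5, BIG_IPC, BIG_CLOCK)

# Different design: both IPC and clock are lower (placeholder values).
lower_ipc_slower = relative_throughput(0.75, 4.0, BIG_IPC, BIG_CLOCK)

print(f"same IPC, lower clock:  {same_ipc_slower:.2f}x")
print(f"lower IPC, lower clock: {lower_ipc_slower:.2f}x")
```

In this toy model the first case loses performance only in proportion to the clock deficit, while the second compounds two deficits. A real comparison would need measured IPC and sustained clocks for the workload in question.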

Does this make sense?
unwind-protect said: They had Xeon chips with cores with different clock limits way before.
True, but they were all the same cores IIRC. The per-core clock limitations were microcode imposed.
Posted on Reply
#53
fevgatos
lexluthermiester said: Depends on the thread, its function, and whether or not the clock speed will make a difference. Core scheduling is defined, but still complicated. If the OS kernel shifts a thread from one core to a slower core, it's either because that thread has a lower priority or is under-utilizing the core it's running on. A thread being shifted to a slower/lower-tier core can also be the OS prioritizing and optimizing in real time.

For Intel's Big/Little design, this can and frequently does result in degraded performance for the thread shifted to an E-core, because the "little" cores are of a different (less efficient) design and thus have much lower IPC. With the AMD version of it in this example, the same dynamic doesn't exist because the "little" core is functionally identical to the "big" core, just clocked slower, which means the IPC is the same but the clock speed is lower.

Put another way, Intel's Big/Little design results in significant degradation of thread performance due to the differences not only in clock speed but in core instruction execution capabilities. AMD's Big/Little seems a much better way of doing it, as the difference is in clock speed alone.

Does this make sense?
Sure, but the end result is, if the scheduler ***cks up, you lose performance.

Thank god it's not happening on Intel, thanks to the Thread Director.
Posted on Reply
#54
lexluthermiester
fevgatos said: Sure, but the end result is, if the scheduler ***cks up, you lose performance.
That happens anyway. In non-big/little systems if the scheduler encounters an error, it dumps the current work and restarts the thread. There is no difference there. And that happens regardless of the type or manufacturer of the CPU. That's not really something to focus on.
Posted on Reply
#55
Redwoodz
fevgatos said: If the Zen 4c core would perform as well as the full fat core then there wouldn't be any full fat cores. Obviously that is not the case, Zen 4c will be slower so it will have the same "issues" ecores do.
No. When all cores are loaded, the mobile platform will reduce the total available max speed anyway, so in essence you have the same total performance as if you had 6 full cores. The ecores problem is software-related, which this has nothing to do with, other than boost algorithms.
Posted on Reply
#56
unwind-protect
fevgatos said: Sure, but the end result is, if the scheduler ***cks up, you lose performance.

Thank god it's not happening on Intel, thanks to the Thread Director.
LOL

Yeah, I am sure Thread Director works perfectly, which is only possible if it can see into the future.

Unless you were being sarcastic, in which case I apologize.
Posted on Reply
#57
chrcoluk
I think hybrid is going to be the future for both companies, but I agree that it's early days and it's not good that everything doesn't just work 100% optimised out of the box. (Although on Win11 it is still reasonably good out of the box due to Thread Director.)

However, my research and investigation into improving things has led me to some exciting discoveries about CPU scheduling in Windows and the hidden power scheme settings. I have started documenting it as well, but the few times I have tried to share some of this stuff on the net, no one has bitten. I seem to be the only one excited by it. :)

W1zzard briefly got interested, but only in the NVMe power saving states.
Posted on Reply
#58
fevgatos
unwind-protect said: LOL

Yeah, I am sure Thread Director works perfectly, which is only possible if it can see into the future.

Unless you were being sarcastic, in which case I apologize.
My experience has been perfect thus far. All the games I've tried work better with ecores on. I've heard Star Citizen doesn't like ecores, but I've never tried it.
Posted on Reply
#59
unwind-protect
fevgatos said: My experience has been perfect thus far. All the games I've tried work better with ecores on. I've heard Star Citizen doesn't like ecores, but I've never tried it.
That doesn't mean that the whole shebang is running optimally. Not slowing down with E-cores is a very low bar, especially if your applications have fewer threads than you have P-cores in the first place.
Posted on Reply
#60
fevgatos
unwind-protect said: That doesn't mean that the whole shebang is running optimally. Not slowing down with E-cores is a very low bar, especially if your applications have fewer threads than you have P-cores in the first place.
I didn't say they don't slow down. I said they actually run better.

But still, what do you mean by "low bar"? What would be the high bar?
Posted on Reply
#61
Chrispy_
dyonoctis said: I just feel like this whole debate will ultimately depend on whether or not AMD decides to limit the clock of Zen 4c vs. classic Zen 4.
How is that different to turbo boost that we already have?

Unlike desktops with 142-230 W package power, mobile chips really do clock all the way down under all-core loads. The 15 W 6800U, for example, really does drop from 4.7 GHz on single-threaded loads to under 3 GHz when rendering. The 6800U's eight full-fat Zen 3+ cores are already operating in a way similar to 2x Zen4 and 6x Zen4C, simply because there are "preferred cores," which are the two marked as the best for high boost clocks. By the time the third core is engaged, clocks have already dropped 500 MHz, and they'll lose another GHz as the rest of the cores are loaded and the laptop approaches its STAPM limits.

The clocks of Zen4C will definitely be limited by their own stability at sensible voltages, but many of the cores in a full-fat Zen4 processor are already limited anyway by power targets. The die area spent on giving them the potential to clock higher is going to waste, since if there's ever power budget to spare, the cores that get boosted to 4.7 GHz are the two preferred cores.
Posted on Reply
#62
ToTTenTranz
dyonoctis said: I just feel like this whole debate will ultimately depend on whether or not AMD decides to limit the clock of Zen 4c vs. classic Zen 4.
Zen4c clocks lower by design. It uses a denser transistor library, so it consumes less power at lower clocks (shorter paths for the current to go through), but heat density is higher at the same clocks, so it can't clock as high.
In practice, this means the Zen4c cores will have different power/frequency curves than Zen4.

It should be a bit like ARM's big.LITTLE frequency curves, though less far apart, because it's still essentially the same core.
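A toy sketch of what "different power/frequency curves" can mean, using the standard CMOS dynamic-power relation P ≈ C·V²·f. The linear voltage model and every parameter value below are assumptions invented for illustration; they are not measurements of Zen 4 or Zen 4c.

```python
def required_voltage(freq_ghz, v_floor, volts_per_ghz):
    """Hypothetical linear model of the voltage needed to hold a frequency."""
    return v_floor + volts_per_ghz * freq_ghz


def dynamic_power(freq_ghz, c_eff, v_floor, volts_per_ghz):
    """Dynamic power in arbitrary units: P = C_eff * V^2 * f."""
    v = required_voltage(freq_ghz, v_floor, volts_per_ghz)
    return c_eff * v ** 2 * freq_ghz


# Placeholder parameters. The compact core is modeled with lower effective
# capacitance but a steeper voltage requirement as frequency rises.
BIG = dict(c_eff=1.0, v_floor=0.6, volts_per_ghz=0.10)
COMPACT = dict(c_eff=0.65, v_floor=0.6, volts_per_ghz=0.17)

for f in (2.0, 3.0, 4.0):
    print(f"{f:.1f} GHz  big={dynamic_power(f, **BIG):.2f}  "
          f"compact={dynamic_power(f, **COMPACT):.2f}")
```

With these invented numbers the compact core draws less power at 2 GHz but more at 4 GHz, so the two curves cross somewhere in between. That crossover is the kind of behavior the comment describes, though where it sits on real silicon is not something this sketch can tell you.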
Posted on Reply