Thursday, November 21st 2019

AMD Admits "Stars" in Ryzen Master Don't Correspond to CPPC2 Preferred Cores

AMD in a blog post earlier today explained that there is no 1:1 correlation between the "best core" grading system displayed in Ryzen Master and the "preferred cores" addressed by the Windows 10 scheduler using CPPC2 (Collaborative Power and Performance Control 2). Deployed through BIOS and AMD chipset drivers, CPPC2 acts as middleware between the OS and the processor, communicating the system's performance demands at a short interval of 1 ms (Microsoft's default interval for reporting performance states to processors is 15 ms). Ryzen Master, on the other hand, reveals the "best" cores in a Ryzen processor by ranking them across the package, on a CCD (die), and within a CCX. The best core in a CCX is typically marked with a "star" symbol in the software's UI. The fastest core on the package gets a gold star. Dots denote the second-fastest cores in a CCX.

Over the past couple of months we've posted several investigative reports by our Ryzen memory overclocking guru Yuri "1usmus" Bubly, and a recurring theme in our articles has been the discrepancy between the highest-performing cores as tested by us and those highlighted in Ryzen Master. Our definition of a "highest performing core" has been one that is able to reach and sustain the highest boost states, and has the best electrical properties. AMD elaborates that CPPC2 works independently from the SMU API Ryzen Master uses, and that the best cores mapped by Ryzen Master need not correspond with the preferred cores CPPC2 reports to the OS scheduler so that it can send more workload to those cores, benefiting from their higher boosting headroom.

The "best cores" as defined by the SMU and reported by Ryzen Master are hence decided on the basis of electrical properties, and hard-coded at the time of die binning in the factory. The "preferred cores" as defined by CPPC2 are those cores to which AMD wants the OS scheduler to send the most traffic, not just on the basis of their superior physical or electrical properties, but also because they are optimal for the Windows scheduler's core rotation policy. The Windows scheduler is programmed not to keep a long-running application thread allocated to a particular core indefinitely, but to periodically rotate it between a pair of cores. The rationale behind this is thermal management (spreading the heat across two cores that are spatially apart).
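The rotation policy described above can be pictured with a toy Python sketch. It is only an illustration: the function name, the core numbers, and the idea of fixed, equal time slices are assumptions, since the actual rotation interval and pairing logic of the Windows scheduler are not described here.

```python
from collections import Counter
from itertools import cycle, islice


def rotation_schedule(pair, slices):
    """Return which core of `pair` hosts the thread in each time slice.

    Hypothetical model: the thread simply alternates between the two
    cores, one slice at a time.
    """
    return list(islice(cycle(pair), slices))


# Hypothetical rotation pair: cores 0 and 2, observed for six slices.
schedule = rotation_schedule(pair=(0, 2), slices=6)
busy = Counter(schedule)  # number of slices each core spent running the thread

print(schedule)    # [0, 2, 0, 2, 0, 2]
print(dict(busy))  # {0: 3, 2: 3}
```

Each core ends up carrying the thread for half of the slices, which is the heat-spreading effect the rotation is meant to achieve.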

On monolithic multi-core chips such as the i9-9900 or i9-9980XE, in which all cores not only sit on the same die, but are also part of the same group (no CCX here), core rotation works as intended, as all cores share the L3 cache, and a relieving core can pick up work from where its rotation pair partner has left off, by pulling data from the L3 cache.

AMD's "Zen" multi-core topology complicates this, as not all cores share the same L3 cache; and in 12-core or 16-core Ryzen processors, or Threadrippers, not all cores sit on the same die. This is where CPPC2 fits in, giving Windows the awareness of the topology it needs, so it can rotate threads among cores without hurting performance by forcing a workload onto a core that uses a separate instance of cache, which forces data reloads from RAM. So how do CPPC2-reported "favored cores" fit into the scheme of things? CPPC2 deliberately misreports "favored cores" to the Windows scheduler so that it builds core rotation pairs within localized groups of cores, rather than picking cores from different CCXs or CCDs.
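The difference between picking the two best cores package-wide and picking a rotation pair inside one CCX can be illustrated with a small Python sketch. Everything in it is hypothetical: the Core type, the score values, and the rule of maximizing the pair's combined score are assumptions made for illustration, not AMD's actual CPPC2 ranking logic, which is not spelled out here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Core:
    core_id: int
    ccx: int    # which CCX (and therefore which L3 cache) the core belongs to
    score: int  # hypothetical quality score, higher is better


def naive_pair(cores):
    """Two highest-scoring cores on the package, ignoring topology."""
    return tuple(sorted(cores, key=lambda c: c.score, reverse=True)[:2])


def local_pair(cores):
    """Highest-scoring pair whose members share a CCX (assumed rule)."""
    same_ccx = [(a, b) for a, b in combinations(cores, 2) if a.ccx == b.ccx]
    return max(same_ccx, key=lambda pair: pair[0].score + pair[1].score)


# Made-up 8-core part: the single best core (id 4) sits on CCX1.
cores = [
    Core(0, 0, 90), Core(1, 0, 85), Core(2, 0, 70), Core(3, 0, 60),
    Core(4, 1, 100), Core(5, 1, 65), Core(6, 1, 55), Core(7, 1, 50),
]

print([c.core_id for c in naive_pair(cores)])  # [4, 0] -> spans two CCXs
print([c.core_id for c in local_pair(cores)])  # [0, 1] -> both on CCX0
```

With these made-up scores, the core that would earn the gold star sits on CCX1, yet the topology-aware pair lands on CCX0, because its two best cores together outrank any pair on CCX1.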

"Ryzen Master, using firmware readings, selects the single best voltage/frequency curve in the entire processor from the perspective of overclocking. When you see the gold star, it means that is the one core with the best overclocking potential. As we explained during the launch of 2nd Gen Ryzen, we thought that this could be useful for people trying for frequency records on Ryzen," reads the AMD blog on the discrepancy between Ryzen Master "best cores" and CPPC2 Preferred Cores. "Overall, it's clear that the OS-Hardware relationship is getting more complex every day. In 2018, we imagined that the starred cores would be useful for extreme overclockers. In 2019, we see that this is simply being conflated with a much more sophisticated set of OS decisions, and there's not enough room for nuance and context to make that clear. That's why we're going to bring Ryzen Master inline with what the OS is doing so everything is visibly in agreement, and the system continues along as-designed with peak performance," it adds. "Best cores" and "preferred cores" are hence both "right." The former refers to a physically high-quality core, while the other is more "circumstantial", for better performance. Sources: Reddit, Anandtech
Add your own comment

67 Comments on AMD Admits "Stars" in Ryzen Master Don't Correspond to CPPC2 Preferred Cores

#51
bug
Khonjel
So basically: Best Core is absolute best core as validated by AMD from the factory.
Preferred Core is relative best core assigned by Windows Scheduler.

AMD gonna update ryzen master in the future to match Windows Scheduler.

Is that everything.
Not really. Best core is what AMD marks from factory.
Preferred core is what AMD picks/exposes through CPPC.

It's really AMD disagreeing with themselves.
#52
Oberon
bug
Not really. Best core is what AMD marks from factory.
Preferred core is what AMD picks/exposes through CPPC.

It's really AMD disagreeing with themselves.
They're not disagreeing with themselves, they made a decision to show something different in Ryzen Master than is reported by other utilities.
#53
bug
Oberon
They're not disagreeing with themselves, they made a decision to show something different in Ryzen Master than is reported by other utilities.
There are no other utilities. CPPC is part of ACPI which is part of the BIOS/UEFI. Utilities will build on top of that, but the ACPI tables still come from AMD.
#54
Chrispy_
john_
Sell it and buy an Intel. You'll feel better. Right?
If not better, at least he'll be warmer.
#55
Oberon
bug
There are no other utilities. CPPC is part of ACPI which is part of the BIOS/UEFI. Utilities will build on top of that, but the ACPI tables still come from AMD.
Right, and those utilities are reporting the CPPC values. Ryzen Master is not. There is no conflict.
#56
Kaotik
bug
There are no other utilities. CPPC is part of ACPI which is part of the BIOS/UEFI. Utilities will build on top of that, but the ACPI tables still come from AMD.
Yes, but like AMD has explained, Ryzen Master doesn't report that. It reports the best core based on its properties at the factory, but due to the nature of how Windows scheduling works, it isn't necessarily the best core in Windows (because really Windows wants 2 best cores, not just one, since it passes the single-threaded load between 2 cores for power management etc.) - CPPC2 reports the best 2 cores within the same CCX, because the 2 best cores overall could be in 2 different CCXs, which would hurt performance when the thread is thrown around. For the same reason, the 3rd and 4th best cores are always marked on the same CCX as the best 2 even if they're actually the worst 2 cores on the CPU; it's better for performance to fill one CCX before spilling on to the other CCX.
#57
bug
Kaotik
Yes, but like AMD has explained, Ryzen Master doesn't report that. It reports the best core based on its properties at the factory, but due to the nature of how Windows scheduling works, it isn't necessarily the best core in Windows (because really Windows wants 2 best cores, not just one, since it passes the single-threaded load between 2 cores for power management etc.) - CPPC2 reports the best 2 cores within the same CCX, because the 2 best cores overall could be in 2 different CCXs, which would hurt performance when the thread is thrown around. For the same reason, the 3rd and 4th best cores are always marked on the same CCX as the best 2 even if they're actually the worst 2 cores on the CPU; it's better for performance to fill one CCX before spilling on to the other CCX.
They can report whatever they want, I was only saying this confusion is entirely on AMD, not on Microsoft. It's really no biggie, they should have paid more attention and used more appropriate terms than "best" and "preferred", but now that boat has sailed.
#58
Kaotik
bug
They can report whatever they want, I was only saying this confusion is entirely on AMD, not on Microsoft. It's really no biggie, they should have paid more attention and use more appropriate terms than "best" and "preferred", but now that boat has sailed.
Actually it's on Microsoft if on anyone. Microsoft is the one who decided the Windows scheduler needs to use 2 cores to optimize power management etc.; AMD, Intel or any other CPU manufacturer doesn't require that, nor do (all of) the schedulers in other OSes. If it used just one core instead of 2 for single-threaded workloads, it would be the same best core Ryzen Master reports at the moment.
#59
bug
Kaotik
Actually it's on Microsoft if on anyone. Microsoft is the one who decided the Windows scheduler needs to use 2 cores to optimize power management etc.; AMD, Intel or any other CPU manufacturer doesn't require that, nor do (all of) the schedulers in other OSes. If it used just one core instead of 2 for single-threaded workloads, it would be the same best core Ryzen Master reports at the moment.
Sure, sure...
#60
Kaotik
bug
Sure, sure...
Could you please elaborate on your logic?
These are undisputed facts which anyone can check:
- AMD lists in Ryzen Master the best cores per CPU and per CCX based on their properties at the factory
- Not all OS schedulers require 2 cores for a single-threaded load
- AMD doesn't require 2 cores for a single-threaded load (nor does Intel or any other CPU manufacturer)
- Microsoft requires 2 cores for a single-threaded load

And somehow, it's AMD's "fault" when they report those best cores per CPU and per CCX based on their properties at the factory instead of reporting the best 2 cores within the same CCX as "best cores" because Windows wants 2 cores for 1 thread?
#61
bug
Kaotik
Could you please elaborate on your logic?
These are undisputed facts which anyone can check:
- AMD lists in Ryzen Master the best cores per CPU and per CCX based on their properties at the factory
- Not all OS schedulers require 2 cores for a single-threaded load
- AMD doesn't require 2 cores for a single-threaded load (nor does Intel or any other CPU manufacturer)
- Microsoft requires 2 cores for a single-threaded load

And somehow, it's AMD's "fault" when they report those best cores per CPU and per CCX based on their properties at the factory instead of reporting the best 2 cores within the same CCX as "best cores" because Windows wants 2 cores for 1 thread?
Ryzen Master shows the two cores per CCX designated as "best" by AMD during manufacturing. That same AMD then goes to expose "preferred cores" through CPPC, but those are not necessarily the same as those designated as "best" (again, by AMD). If that all makes sense to you, you're a better man than I.
#62
Kaotik
bug
Ryzen Master shows the two cores per CCX designated as "best" by AMD during manufacturing. That same AMD then goes to expose "preferred cores" through CPPC, but those are not necessarily the same as those designated as "best" (again, by AMD). If that all makes sense to you, you're a better man than I.
The reason they show different cores via CPPC(2) is the fact that Windows requires 2 cores instead of one, and due to the way AMD built Zen those two need to be in the same CCX for optimal performance (and further, preferably not next to each other physically, to optimize temperatures). Windows is just too big to ignore; in fact you really have to make most of your decisions on these things solely on how Windows works, because it's used by most machines (outside servers and supercomputers).
However, that doesn't change the fact that what Ryzen Master currently shows is correct for what it's saying it's showing; they are the best cores, just like it says. In Windows that's just not relevant for anything but finding the highest possible frequencies, aka HC overclockers looking for record frequencies, which is done on just one core. AMD never claimed they're showing anything but the potentially highest-clocking cores based on the CPU's physical properties.
If you used an OS whose scheduler wants just that one core for one thread, what Ryzen Master currently shows would be optimal (and the next 3 threads would be assigned to the same CCX as the best one even when they're not the 2nd, 3rd and 4th best cores on the CPU, because spilling to another CCX hurts performance).
#63
Naito
bug
So you're saying Microsoft should have had a CCX aware scheduler, before AMD launched the CCX?
No, but having said that, they should have (and probably did) worked together to optimize prior to launch. We can only speculate how the Windows scheduler works/doesn't work, but I would say it'd be naive to think that years of Intel dominance didn't play at least some part in shaping it.

bug
Now that even AMD can't point out the "best" core accurately
Again, no. It seems to be a miscommunication on their behalf. Ryzen Master shows only a rank given based on the quality of the core within the CCX. One could assume metrics used are voltage, maximum clock, heat, etc. CPPC2 considers not only the physical qualities of the core, but also which CCX the core resides, the location of the core within the CCX, cache access, etc.
#64
ORLY
I've just one question: if there are 2 CCX (0 and 1), will using CCX1 over CCX0 cause some kind of performance penalty?
Just wondering, because in my CPU the best core is located on CCX1, but Windows chooses to fully load CCX0 first and only then loads CCX1, so maybe it does so because loading CCX1 first instead of CCX0 would cause some performance penalty?
#65
Kaotik
ORLY
I've just one question: if there are 2 CCX (0 and 1), will using CCX1 over CCX0 cause some kind of performance penalty?
Just wondering, because in my CPU the best core is located on CCX1, but Windows chooses to fully load CCX0 first and only then loads CCX1, so maybe it does so because loading CCX1 first instead of CCX0 would cause some performance penalty?
No, different CCXs are equal to each other. Loading one CCX fully before spilling to the next is better for performance. In your case, the 2 best cores from Windows' perspective are within CCX0 even if the best core based on its properties at the factory is on CCX1.
edit: fixed typos (not going to promise there aren't still some in there)
#66
ORLY
Kaotik
No, different CCXs are equal to each other. Loading one CCX fully before spilling to the next is better for performance. In your case, the 2 best cores from Windows' perspective are within CCX0 even if the best core based on its properties at the factory is on CCX1.
edit: fixed typos (not going to promise there aren't still some in there)
That's messed up then. Windows loaded it this way:
1 thread - the fastest core on CCX0
2 threads - the fastest and the second fastest core on CCX0
3 threads - all the cores on CCX0
4 threads - all the cores on CCX0 and the golden star core which is on CCX1
#67
Kaotik
ORLY
That's messed up then. Windows loaded it this way:
1 thread - the fastest core on CCX0
2 threads - the fastest and the second fastest core on CCX0
3 threads - all the cores on CCX0
4 threads - all the cores on CCX0 and the golden star core which is on CCX1
Curious, it should be the fastest and 2nd fastest core on CCX0 for 1 thread already, and to be honest I'm not sure whether it's 2 cores or 3 for 2 threads (and cba to test, I've got so much stuff running)