
Hygon Prepares 128-Core, 512-Threaded x86 CPU with Four-Way SMT and AVX-512 Support

You are redefining what "IPC" means to suit your argument. I gave you detailed test results which you simply ignore. There's not much more I can do here.
Not true. The definition of IPC has always been the same; instructions per clock for a CPU core. Facts are not subject to your opinion, and yet you keep twisting and diverting when confronted with the truth…

It's primarily the CPU vendors themselves who are at fault for creating confusion and turning "IPC" into a marketing gimmick. (Big tech YouTubers/websites also commonly misuse technical terms, and many, despite having been into tech for years, still lack deep knowledge of CPU architectures, machine code and software design.) IPC and performance per clock may be very different, especially when you have different performance characteristics, or even benchmark with different feature levels or ISAs altogether. Take for instance one CPU running a test with AVX-512 and one with AVX2: the first will execute fewer instructions per clock yet have higher performance than the latter. Or compare Zen 2/3 to the Skylake family: Zen has more execution ports but a weaker front-end, resulting in some workloads performing significantly better on one or the other.

By all indications the same is the case for this Hygon CPU too; it's far easier to gain some performance by adding lots of execution ports first and optimizing how to feed them later. To some extent this applies to Zen 5 as well; increasing ALUs 4->6 didn't have a major impact across the board like "leakers" expected, but it will likely lead to gains when the front-end matures with Zen 6 and later revisions.
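To put rough numbers on the AVX-512 versus AVX2 point, here is a minimal sketch. Every figure in it is hypothetical and chosen only to make the arithmetic clean; none of it is a measurement of any real CPU, and the `Run` class and its field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One hypothetical benchmark run on a single core."""
    name: str
    instructions: int  # instructions retired (assumed value)
    cycles: int        # core clock cycles elapsed (assumed value)
    elements: int      # useful work done, here floats processed

    @property
    def ipc(self) -> float:
        # IPC counts instructions, regardless of how much work each one does.
        return self.instructions / self.cycles

    @property
    def work_per_clock(self) -> float:
        # Performance per clock counts useful work, not instructions.
        return self.elements / self.cycles


N = 1_048_576  # floats to process in both runs

# AVX2: 8 floats per 256-bit instruction, so N / 8 vector instructions.
avx2 = Run("AVX2", instructions=N // 8, cycles=65_536, elements=N)

# AVX-512: 16 floats per 512-bit instruction, so N / 16 vector instructions.
avx512 = Run("AVX-512", instructions=N // 16, cycles=40_960, elements=N)

for run in (avx2, avx512):
    print(f"{run.name}: IPC = {run.ipc:.2f}, "
          f"floats per clock = {run.work_per_clock:.1f}")
```

With these made-up inputs the AVX-512 run retires fewer instructions per clock but processes more floats per clock, which is the distinction between IPC and performance per clock.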
 
Not with music production. A VST runs everything on one thread, and when it pushes it hard... see my post above.
Okay, it doesn't benefit all applications, and there are many that actually get worse performance, but we're talking about the general case and not really solely about music production.

Absolutely not. It's a common misconception that IPC is performance per clock; it's not, it's the number of instructions the CPU is able to churn through. Whether there is one, two or more threads sharing a core's resources, the IPC remains constant. SMT does improve the saturation of the core for some workloads, but the total performance will only converge towards that of a thread fully saturating the core, never above that. This should be basic knowledge about CPUs.
IPC is instructions per clock, I don't see what you're trying to get at with a different definition.
If a core's EUs cannot be saturated by a single thread's instruction stream, and tricks such as SMT let you saturate them so that more instructions are retired, then IPC does go up, period.
As mentioned above, it seems like you're just dismissing factual benchmarks for no apparent reason.
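A minimal sketch of the bookkeeping behind this disagreement, using hypothetical counter values (not taken from any benchmark in this thread). The helper names `thread_ipc` and `core_ipc` are made up for illustration; the idea is only that per-core IPC sums the instructions retired by all hardware threads over the same core cycles.

```python
def thread_ipc(instructions_retired: int, cycles: int) -> float:
    """IPC as seen by one hardware thread."""
    return instructions_retired / cycles


def core_ipc(retired_per_thread: list[int], cycles: int) -> float:
    """IPC of the whole core: all threads' retired instructions per core cycle."""
    return sum(retired_per_thread) / cycles


CYCLES = 1_000_000  # same measurement window for both cases (assumed)

# Hypothetical case 1: one thread alone cannot saturate the core.
single = [1_500_000]

# Hypothetical case 2: two SMT threads share the core; each one runs slower
# than the lone thread did, but together they retire more per cycle.
smt2 = [1_100_000, 1_100_000]

print("1 thread : core IPC =", core_ipc(single, CYCLES))
print("2 threads: core IPC =", core_ipc(smt2, CYCLES),
      "| per-thread IPC =", [thread_ipc(n, CYCLES) for n in smt2])
```

Under these assumed numbers the per-thread IPC drops while the per-core IPC rises, so which way "IPC" moves depends on whether it is counted per thread or per core.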
 
Not true. The definition of IPC has always been the same; instructions per clock for a CPU core. Facts are not subject to your opinion, and yet you keep twisting and diverting when confronted with the truth…

It's primarily the CPU vendors themselves who are at fault for creating confusion and turning "IPC" into a marketing gimmick. (Big tech YouTubers/websites also commonly misuse technical terms, and many, despite having been into tech for years, still lack deep knowledge of CPU architectures, machine code and software design.) IPC and performance per clock may be very different, especially when you have different performance characteristics, or even benchmark with different feature levels or ISAs altogether. Take for instance one CPU running a test with AVX-512 and one with AVX2: the first will execute fewer instructions per clock yet have higher performance than the latter. Or compare Zen 2/3 to the Skylake family: Zen has more execution ports but a weaker front-end, resulting in some workloads performing significantly better on one or the other.

By all indications the same is the case for this Hygon CPU too; it's far easier to gain some performance by adding lots of execution ports first and optimizing how to feed them later. To some extent this applies to Zen 5 as well; increasing ALUs 4->6 didn't have a major impact across the board like "leakers" expected, but it will likely lead to gains when the front-end matures with Zen 6 and later revisions.
A few posts back I pasted a benchmark in which Zen 5 doubles its instructions per cycle (which is the same as per clock, just to be 100% sure of our definitions), not its throughput, when the op cache is exhausted while using SMT. That is the fact I'm using in my argument; what's yours? So far you've used statements about how you think things work (or should work), or quotes that are specifically about certain classes of workloads only.
You accuse me of different things while doing some of them yourself, ironically.

https://substack-post-media.s3.amazonaws.com/public/images/2659f108-5039-4dfc-ae47-8e4b8a8f9ba3_1140x530.png

Anyway, I think I'm done with this discussion. It made me run a few benchmarks and do a fair bit of reading, so thanks for that.
 
Okay, it doesn't benefit all applications, and there are many that actually get worse performance, but we're talking about the general case and not really solely about music production.
But it's the best example and the most demanding on a per-core basis.
 
But it's the best example and the most demanding on a per-core basis.
That's relative. "Most demanding" according to what?
Int throughput? FP? Branching? Memory? It would be nice if you could bring some metrics/profiling data that point to one of those as well, so it's easy to compare to other scenarios.

I also don't think it's a "best example" of anything, given that it's not really a widely used task to begin with, at least compared to other use cases.