
Your opinion - how much IPC improvement could be squeezed out of a new CPU gen if normal limits are removed?

qubit

Overclocked quantum bit
Joined
Dec 6, 2007
Messages
17,865 (2.79/day)
Location
Quantum Well UK
System Name Quantumville™
Processor Intel Core i7-2700K @ 4GHz
Motherboard Asus P8Z68-V PRO/GEN3
Cooling Noctua NH-D14
Memory 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz)
Video Card(s) MSI RTX 2080 SUPER Gaming X Trio
Storage Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB
Display(s) ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible)
Case Cooler Master HAF 922
Audio Device(s) Creative Sound Blaster X-Fi Fatal1ty PCIe
Power Supply Corsair AX1600i
Mouse Microsoft Intellimouse Pro - Black Shadow
Keyboard Yes
Software Windows 10 Pro 64-bit
The article below on an apparent IPC improvement of 10-15% for Zen 2 got me thinking of the following hypothetical scenario.

Today's x86 CPUs are pretty well optimized for IPC, with large caches, out-of-order execution, etc. So, how much further do you think Intel or AMD could improve IPC if they went all out to absolutely maximise it, i.e. if cost, power use, wafer yield, etc. didn't matter?

Do you think 50-100%, or maybe even more might be possible? Perhaps we're reaching the point of diminishing returns and it won't improve much more regardless of what resources are thrown at it? I have no idea, just throwing this out there.

EDIT: IPC = Instructions Per Clock.

https://wccftech.com/amds-zen-2-ipc-uplift-will-be-in-the-10-15-range
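To make the metric in the EDIT above concrete, here's a minimal sketch of what IPC is, using made-up counter values (these are illustrative numbers, not measurements from any real chip):

```python
# Toy illustration of the IPC metric: instructions retired per clock cycle.
# The counter values below are invented for the example, not real measurements.

def ipc(instructions_retired: int, clock_cycles: int) -> float:
    """Instructions Per Clock = retired instructions / elapsed core cycles."""
    return instructions_retired / clock_cycles

# e.g. 8 billion instructions retired over 4 billion cycles -> IPC of 2.0
print(ipc(8_000_000_000, 4_000_000_000))  # 2.0

# A hypothetical 12% IPC uplift finishes the same work at the same clock
# in proportionally fewer cycles:
uplift = 1.12
print(round(8_000_000_000 / (4_000_000_000 / uplift), 2))  # 2.24
```

In practice these counters come from hardware performance monitoring (e.g. `perf stat` on Linux reports instructions and cycles, and their ratio).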
 
Last edited:
I'm not that tech literate, but I think frequency scaling is getting very difficult, and the recent security scares have dialled back the IPC gained from 'predictive' (speculative) tricks. I think we need to move away from silicon as we know it to massively ramp up IPC.

But, like I say, I am not knowledgeable...
 
The good thing is that IPC isn't as important as it was 10 years ago, and this trend will continue as new games use DX12 like they should. More cores is the trend now, and IPC will be just another footnote.
 
I'm not that tech literate, but I think frequency scaling is getting very difficult, and the recent security scares have dialled back the IPC gained from 'predictive' (speculative) tricks. I think we need to move away from silicon as we know it to massively ramp up IPC.

But, like I say, I am not knowledgeable...
Yes, and I don't believe Zen 2 will get to 15% higher IPC than PR (Pinnacle Ridge); on average I'd expect no more than a 10% improvement over SR (Summit Ridge) after accounting for the Spectre fixes. With GlobalFoundries' 7nm, I expect AMD to be able to reach 5GHz on desktop chips, even if only as single-core turbo or overclocked speeds, one or the other.

Moving away from silicon to gain IPC: do you mean frequency?
 
The good thing is that IPC isn't as important as it was 10 years ago, and this trend will continue as new games use DX12 like they should. More cores is the trend now, and IPC will be just another footnote.

The thing is, without proper tools, coding for multi-threading is quite complex, and as long as there's a large installed base of DX11-only GPUs, devs can't make a game that supports only DX12 (or Vulkan).

So IPC is still important, in the sense that even if you have a Threadripper, a Core i3 with higher IPC will nuke its Ryzen counterpart in heavily single-threaded games and apps.

But yeah, it's less important than back in the day, though still relevant.
 
Ryzen has cache performance issues. Ryzen 2 focuses on addressing that. 10-15% is reasonable because the cache really is that bad in first-gen Ryzen.
 
Ryzen has cache performance issues. Ryzen 2 focuses on addressing that.
Ryzen is really a good card for AMD; there's still plenty of room left for improvement on the IPC side, and I think they've already mastered the core-count part.

EDIT: Look @qubit at the next incoming article from TPU! :p What a coincidence.
EDIT2: Can you TL;DR the PM @eidairaman1? :)
 
Last edited:
Check your PMs @qubit for my thoughts.
 
I agree with the comment that we need to move away from silicon. The gates are already getting too small. The smaller the gate (gap), the easier it is for the voltage to jump (arc) across it. And the more gates you jam into the same space, the smaller each gate becomes.

The problem becomes: how do you push enough voltage into the IC while keeping that voltage low enough that it doesn't arc across gates? The smaller the gate, the lower the voltage must be to prevent arcing. Not an easy task at that scale without superconductors.

Affordable zero-resistance conductors need to become a reality before we can go much further.
 
Ryzen was about 30-40% faster than the FX series IIRC, so that would be about the max we could ever expect.

10-20% seems logical between generations when there's competition, 5-10% when there's not.
 
The good thing is that IPC isn't as important as it was 10 years ago, and this trend will continue as new games use DX12 like they should. More cores is the trend now, and IPC will be just another footnote.
I still think IPC is the most important metric for CPU performance, because everything else hangs off it; it's like raw power in a drag race. Imagine a CPU with just a 5% IPC advantage over another: spread over many cores, that advantage multiplies into a significant aggregate gain. No wonder Intel is scrambling to head off AMD, as they understand this better than anyone, and with AMD offering so many cores on top of this, Intel has all the more reason to worry.
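One way to put numbers on the "spread over many cores" point above (all figures invented for illustration; the relative edge per core stays 5%, but the absolute throughput gap grows with core count):

```python
# Back-of-the-envelope: a per-core IPC edge, spread across many cores, becomes
# a growing absolute throughput gap (the relative edge stays the same).
# All numbers are illustrative, not benchmarks of any real CPU.

def aggregate_throughput(ipc: float, cores: int, ghz: float) -> float:
    """Aggregate throughput in billions of instructions/s, assuming perfect scaling."""
    return ipc * cores * ghz

base  = aggregate_throughput(ipc=2.0, cores=16, ghz=4.0)  # 128.0 GIPS
plus5 = aggregate_throughput(ipc=2.1, cores=16, ghz=4.0)  # 134.4 GIPS

print(plus5 - base)      # absolute gap: ~6.4 billion extra instructions/s
print(plus5 / base - 1)  # relative gap: still ~5%
```

The "perfect scaling" assumption is of course generous; contention and serial sections eat into it, as later posts in the thread point out.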
 
It's entirely dependent on the application. The IPC of some types of computation can be improved immensely, while others are close to the limit with the tech we have.

Imagine one core with the register width and ALUs of a quad core: would it do four times the math per clock? Possibly, but it might bottleneck in some scenarios. There is a limit, and it's presently defined by register sizes, IMHO.

A new computing platform might overcome this, such as HP's in-memory computing behemoth.
 
Check your PMs @qubit for my thoughts.

Too many curse words and tentacle-porn references for public posting?

I've no idea, but I highly doubt >50% of what we have now.
 
Too many curse words and tentacle-porn references for public posting?

I've no idea, but I highly doubt >50% of what we have now.

For 1, I do not look at porn; 2, I stopped cursing; 3, you don't know me, so stop assuming you do; 4, since you have nothing constructive to say, go elsewhere instead of insulting a member.

PS: You have been on my ignore list for several years now. You will remain there.
 
Last edited:
Don't take his comments personally @eidairaman1, I'm pretty sure his comment was a light-hearted jest not aimed at you specifically.
 
Don't take his comments personally @eidairaman1, I'm pretty sure his comment was a light-hearted jest not aimed at you specifically.

He has taken jabs at me in the past, so yeah sure "rolling eyes"
 
Leave the drama be, guys; just leave it alone and keep this on topic.
 
Wow, I get a down vote for trying to be the good guy here. :shadedshu:
 
Ryzen was about 30-40% faster than the FX series IIRC, so that would be about the max we could ever expect.

10-20% seems logical between generations when there's competition, 5-10% when there's not.

According to the slides when Ryzen launched, Ryzen had a 52% IPC increase over the Excavator architecture (not Piledriver).

The reason AMD managed this impressive feat is simply that Excavator's IPC was that bad.

10%-20% between generations seems too much, even with competition: I'd say 5%-12%, tops.
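Even the modest per-generation ranges discussed above add up quickly once you compound them. A quick sketch (the percentages are the ranges suggested in this thread, not vendor roadmap figures):

```python
# How modest per-generation IPC gains compound over several generations.
# The per-generation percentages come from the discussion above, nothing official.

def compounded_gain(per_gen: float, generations: int) -> float:
    """Total relative IPC uplift after compounding per-generation gains."""
    return (1 + per_gen) ** generations - 1

# Five generations at 5% vs 12% per generation:
print(round(compounded_gain(0.05, 5) * 100, 1))  # 27.6 (%)
print(round(compounded_gain(0.12, 5) * 100, 1))  # 76.2 (%)
```

So "5%-12%, tops" per generation still roughly doubles single-thread throughput within a decade of releases, which is why these small-sounding numbers matter.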
 
For 1 I do not look at porn, 2 I stopped cursing. 3 You don't know me, so stop assuming you do. 4 Since you have nothing constructive to say go else where instead of insulting a member.

PS You have been on my ignore list for several years now. You will remain there.

Fair enough. But seriously, why didn't you post it here? I just don't see the point.

Wow, I get a down vote for trying to be the good guy here. :shadedshu:

Downvotes are always a bad idea; they just breed groupthink and tribalism.
 
Today's x86 CPUs are pretty well optimized for IPC, with large caches, out-of-order execution, etc. So, how much further do you think Intel or AMD could improve IPC if they went all out to absolutely maximise it, i.e. if cost, power use, wafer yield, etc. didn't matter?

Here's the real kicker: modern CPUs can already execute way more instructions than they will ever be fed under normal circumstances. There are many things that could be done to increase IPC; the problem is you would see little improvement, due to the everlasting problem of slow system memory. Many CPUs would see a massive uplift in performance simply from being paired with memory that could provide instructions and data at the same rate that they can process them. Think about it: memory technology changes at a much slower pace than IPC improves, and CPU manufacturers have to deal with this ever-present issue that keeps on growing.

The issue as of now isn't IPC, as there is still plenty of room for improvement there; the issue is that there is no point in pushing it much further. Out-of-order CPUs have something called an instruction window, which represents how many instructions the core can look at ahead of time to potentially decode and execute. You can design a CPU with a huge instruction window and many ALUs, and therefore insane theoretical IPC, but you will never be able to fetch enough instructions and data to make it worthwhile.
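The memory-wall argument above can be sketched with a simple stall model (this is my own toy model with invented miss rates and latencies, not anything from the post; real out-of-order cores hide part of the miss latency, so treat it as an upper bound on the damage):

```python
# Rough analytic sketch of why memory stalls cap effective IPC.
# Assumes a simple stall model: CPI = ideal CPI + memory stall cycles per
# instruction. Miss rate, penalty, and reference rate are invented examples.

def effective_ipc(peak_ipc: float, miss_rate: float, miss_penalty_cycles: float,
                  mem_refs_per_instr: float = 0.3) -> float:
    """Effective IPC once cache-miss stall cycles are added to the ideal CPI."""
    cpi = 1.0 / peak_ipc + mem_refs_per_instr * miss_rate * miss_penalty_cycles
    return 1.0 / cpi

# A hypothetical 8-wide core (peak IPC 8) with a 2% miss rate to 200-cycle
# DRAM barely beats a 4-wide core sitting on the same memory:
print(round(effective_ipc(8.0, 0.02, 200), 3))  # 0.755
print(round(effective_ipc(4.0, 0.02, 200), 3))  # 0.69
```

Doubling the core's width buys under 10% here, which is exactly the point being made: past a certain width, memory, not the execution engine, sets the ceiling.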
 
@Vya Domus I think memory performance could be improved to keep up with demand. Remember, this is a hypothetical scenario we're talking about, of what could technically be possible with the right motivation, e.g. a "TV competition" to make the fastest CPU.

Wouldn't a huge instruction window increase IPC at the expense of lag, though?
 
Wouldn't a huge instruction window increase IPC at the expense of lag, though?

Latency as in what?
think memory performance could be improved to keep up with demand

That's the thing: it can't. It's too expensive to manufacture memory, in current capacities, that would be fast enough to make big leaps in IPC worthwhile.

@Vya Domus a hypothetical scenario we're talking about here of what could technically be possible with the right motivation, e.g. a "TV competition" to make the fastest CPU

Hypothetically you can make a very fast CPU with high IPC, as I described; you just wouldn't see it inside anything. The point is that CPUs are already "uselessly" fast to a degree, so that wouldn't prove anything.

You can tell that IPC isn't such a challenging thing when you look at mobile ARM-based processors, which saw massive improvements in just a couple of years but are now slowing down because they run into the same issue.
 
Last edited:
According to the slides when Ryzen launched, Ryzen had a 52% IPC increase over the Excavator architecture (not Piledriver).

The reason AMD managed this impressive feat is simply that Excavator's IPC was that bad.

10%-20% between generations seems too much, even with competition: I'd say 5%-12%, tops.
Now that you mention it, how come Intel managed to pull about +40% IPC with Sandy Bridge?
 
CPUs will always be a balance between thread-switching speed and IPC. There is a limit to the number of threads per core you can efficiently address simultaneously before performance goes downhill a LOT. There was a thread on OCN a few years back, and IIRC thread switching started incurring a lot of performance loss past around 20 threads per core. So in theory you don't actually need more than around 4 cores in most situations, though some games can make efficient use of 6+ these days.

Obviously you have to have enough total throughput to keep up with all the tasks, which is why dual cores are basically dead in the water now.

Another thing to consider is that certain loads do not fully utilise all of the resources a CPU core presents. Modern CPUs are actually moving to a point where they are designed with the assumption that not all the resources will be utilised at once (the AVX offset is an example).

With multi-chip CPUs we also have to start considering inter-die latency and the delays incurred by NUMA, which can further bottleneck you to a fraction of a single core's performance.
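The diminishing returns from piling on threads can be sketched with an Amdahl-style model extended with a per-thread coordination cost (the serial fraction and overhead coefficient below are invented purely for illustration, not the OCN figures mentioned above):

```python
# Toy model of diminishing returns from more threads: each extra thread adds
# useful parallel work but also a fixed scheduling/communication overhead.
# The serial fraction and per-thread cost are made-up illustrative values.

def speedup(threads: int, serial_frac: float = 0.05,
            overhead_per_thread: float = 0.01) -> float:
    """Amdahl's law extended with a per-thread coordination cost."""
    t = serial_frac + (1 - serial_frac) / threads + overhead_per_thread * threads
    return 1.0 / t

# Find the thread count where speedup peaks under these assumptions:
best = max(range(1, 65), key=speedup)
print(best)                       # 10
print(round(speedup(best), 2))    # 4.08
```

Past the peak, every extra thread costs more in coordination than it contributes in parallel work, which matches the "goes downhill a LOT" behaviour described above (with the knee's exact position depending entirely on the workload's real coefficients).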
 