
AMD Ryzen 9000X3D Series to Keep the Same 64 MB 3D V-Cache Capacity, Offer Overclocking

You're ignoring the most fundamental issue with the 7950X3D: its clocks. So yes, more cache on both CCDs should solve a large part of that "latency" issue, as you call it.
7950x3d_clock_variation.png

7950x3d_latency_cycles.png

7950x3d_latency_ns.png


7950x3d_st_bw.png

7950x3d_st_bw_bc.png


7950x3d_mt_bw.png

7950x3d_mt_bw_bytes_per_cycle.png

7950x3d_ghpc.png
7950x3d_cp2077.png

7950x3d_dcs.png

7950x3d_cod_zombies-1.png

7950x3d_cod_zombies_topdown.png
 
I'm ignoring nothing. Clocks would be reduced by putting cache on both chiplets, therefore latency would increase. You're confusing intra-CCD latency with inter-CCD latency.

The only thing extra cache does is reduce reliance on system memory for individual CCDs, not somehow improve latency between CCDs.

Low latency for things that can fit within ~100 MB of cache is a gaming workload, not a productivity one. By making two VCache dies (i.e. hypothetical "optimized" for gaming, over single CCD) we're back to the Zen 2 problem of two CCXs within each CCD/chiplet, where workloads operating on both groups suffer from latency issues because communicating between them takes longer than communicating within them.

200 MB cache, 2x 100 MB cache.
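
To put the point above in concrete terms, here is a minimal sketch. It assumes each CCD's L3 is filled only by cores on that CCD, so two caches do not pool into one, and it uses the round ~100 MB figure from the post rather than exact capacities. The function name is made up for illustration.

```python
from typing import Sequence


def is_cache_resident(working_set_mb: float, l3_per_ccd_mb: Sequence[float]) -> bool:
    """Return True if the working set fits in the L3 of at least one CCD.

    Assumption: L3 is private to each CCD, so the sizes are never summed.
    """
    return working_set_mb <= max(l3_per_ccd_mb)


# Hypothetical 150 MB working set, not a measured figure.
split_caches = is_cache_resident(150, [100, 100])  # two separate 100 MB caches
single_cache = is_cache_resident(150, [200])       # one imaginary 200 MB cache
print(split_caches, single_cache)
```

Under that assumption, the 150 MB working set fits the imaginary single 200 MB cache but neither of the two 100 MB caches.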
 
If you want an all around CPU you go with the 7950X3D or the 14900K. Although the 7950X3D is slower than the 7800X3D in games due to that dual CCD compromise.

Not what I've observed locally. Just setting the BIOS to prefer cache and using Process Lasso to tie all my system processes to my second CCD, the 7950X3D is faster in every game I've tried it in... Warzone, MWIII, FM, FH. Not by much though, 3-4%.

Warzone especially is way more consistent for some reason, not sure why. Same mobo, same RAM, identical timings.

If you just let Windows/mobo do its thing, sure... it's up to 5% slower but usually similar.

I did only have the 7800X3D for about a week though, and I could have gotten a dud I guess, but after a couple of hours of tuning the 7950X3D is always faster, and comically faster in MT.

Now is that worth an extra 140 USD? Not sure, but 480 vs 340 didn't seem like too much of a price difference for how slow the 7800X3D was at non-gaming tasks. Seems Amazon has raised the price back up, bummer lol.
 
And that's an issue in how many titles? The inter CCD latency will always be a thing & you simply can't wish away physics in that sense.
 
What are you arguing about exactly?

Who is wishing away physics here? Those pining for dual X3D chips with imagined improved performance/efficiency? Or the engineers at AMD who tested this theory and found those chips were worse?
 
The 7950X3D is a fundamentally unbalanced chip as of now. With both CCDs clocked similarly and with V-Cache, a lot of that "latency" thing goes away. Threads on both CCDs with a similar workload should finish in about the same time. I guess the AMD engineers were also lazy doing 96c or 128c EPYC chips then? You don't think latency or thread contention is an issue in servers?
 
Again, you're confusing intra-CCD latency with inter-CCD latency.

The "latency thing" is an issue between CCDs.

Server workload ≠ consumer workload.

Once again you're arguing something with yourself, not me. I have not raised the points you're supposedly responding to.
 
I just meant inter-CCD latency. For the 7950X3D we have two different CCD types in the same chip/package ~ Zen 5 should be able to solve that & alleviate some of the latency issues. It's impossible to get rid of them completely of course.
 
Zen 5 will not solve that, because the packaging design is the same.

At best you will now be able to overclock the X3D CCD to be closer in frequency to the non-X3D CCD, but the latency between the CCDs will still be inherently there, and the CCD that is not voltage capped will still clock better than the one that is.

Perhaps Zen 6 with the rumoured improved packaging will somewhat improve the situation.
 
Your confusion comes from the fact that you assume X3D cache is the reason for efficiency increasing, from some inherent property, when the reality is that it's the voltage limitation that comes from the stacked cache forcing AMD to not release chips tuned way past their efficiency sweetspot to get within 10-20% of Intel single core performance.

Wishful thinking along with some active imagination is what causes people to long for the hypothetical performance/efficiency of a part that did not release to market (for a good reason).

We have reviews showing the 7950X3D at the exact same voltage and settings achieving higher efficiency when set to prefer cache than when set to prefer frequency or left at stock:

1720457535487.png

1720457723456.png



Fetching data from cache as opposed to main system memory takes less energy and less time. Voltage is a factor in the efficiency but not the only factor.

So explain how gaming performance would be improved with a dual X3D chip then (hint, it isn't).



Your claim was that it decreases performance:
Two sets of 3DVCache increases latency between cores on different chiplets, not decreases it, lol.

Nothing on that claim eh?

As you put in bold, for games you want to be cache resident. The problem with the 7950X3D is that for certain games the OS places the game on the cores without the cache. Having cache on both would solve that issue, thus resulting in increased performance on the 7950X in select scenarios.

Basic logic my dude. Also, FYI the dual CCD X3D chips were tested and dropped, because they did not offer improvements over single CCD or 1+1 of each type.

This is definitely not the whole picture as we know X3D has benefits outside of gaming (which was pointed out in the video)

Mind you we also know since that video was released that certain games do sometimes end up on frequency favored threads and those are the instances where performance would be improved. You can simulate the performance uplift the 7950X3D would see by using process lasso. It wouldn't exceed the 7800X3D's performance of course except for in games that use a lot of threads but it would bring the 7950X3D on par with the 7800X3D in games if not slightly ahead.

Moving across chiplets has a latency cost since they do not talk directly to each other, they move through the IF on the IO die, which essentially hasn't changed since Zen 3, still clocking at 2000 MHz, moving to non native chiplet cache has another latency cost,

The 7000 series has 8 cores per CPU chiplet. This isn't relevant for the vast majority of games and applications.

You are also assuming that said cores need data from a different CCD. Having X3D on both CCDs is likely to increase the number of local cache hits. There is a reason AMD originally intended to use X3D for its enterprise CPUs.

plus you are now running at lower clocks dictated by the lower voltage cap of X3D, so latency is further increased.

You are assuming that a latency increase as a result of the lower clocks is not more than offset by latency decreases from having a large fat cache stacked on the chip.
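
The clocks-versus-cache tradeoff being argued here can be sketched with a simple average-access-time model. This is a rough illustration only: every number below (hit rates, cycle counts, clocks, DRAM latency) is a made-up placeholder, not a measurement of any real chip, and the function names are hypothetical.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in core cycles to nanoseconds at a given clock."""
    return cycles / clock_ghz


def average_access_ns(l3_hit_rate: float, l3_cycles: float,
                      clock_ghz: float, dram_ns: float) -> float:
    """Average latency of an access that reaches L3.

    Every access pays the L3 lookup; misses additionally pay DRAM latency.
    """
    return cycles_to_ns(l3_cycles, clock_ghz) + (1.0 - l3_hit_rate) * dram_ns


# Placeholder inputs (assumptions, not benchmark data).
freq_ccd = average_access_ns(l3_hit_rate=0.60, l3_cycles=50, clock_ghz=5.5, dram_ns=80.0)
cache_ccd = average_access_ns(l3_hit_rate=0.85, l3_cycles=54, clock_ghz=5.0, dram_ns=80.0)
# Same slower clock, but the extra cache barely raises the hit rate.
cache_ccd_small_gain = average_access_ns(l3_hit_rate=0.62, l3_cycles=54, clock_ghz=5.0, dram_ns=80.0)

print(round(freq_ccd, 2), round(cache_ccd, 2), round(cache_ccd_small_gain, 2))
```

With these placeholders the larger cache wins when it lifts the hit rate a lot and loses slightly when it barely moves it, so in this model the answer depends on the workload's hit rate rather than on clocks alone.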

Your "basic logic" is drawing conclusions that aren't stated in any of the sources you provide as usual.
 

We have reviews showing the 7950X3D at the exact same voltage and settings achieving higher efficiency when set to prefer cache than when set to prefer frequency or left at stock:
It's not the exact same voltage. It's the exact same CPU, within which there are two core chiplets, these two CCDs within the 7950X3D run at different voltages and settings, which is my entire point.

Your "gotcha" moment is either ignorant of or ignoring that fact.
Your "basic logic" is drawing conclusions that aren't stated in any of the sources you provide as usual.
Ah buddy, never change.

:laugh:

Good chat.

The theory behind this approach is that a 3DV-Cache-enabled CCD has to run at lower clock and voltage, which results in a performance tradeoff. You're getting a ton more cache, but you'll be losing frequency, which will negatively impact workloads that aren't cache-sensitive.
 

Honestly, if they can just make the 9950X3D perform better than the 9800X3D without having to jump through hoops I would call it a win... Some of that is Windows, but still. I shouldn't need a 3rd party program to get the most out of my CPU.

I know it works in the stuff I currently play but who knows if it always will.
 
We'll see.

The upcoming generation being overclocking-enabled should somewhat mitigate any theoretical performance advantage from a stock ~100 MHz higher-boosting X3D chiplet on the x950X3D part over the x800X3D. That assumes no scheduling issues, and since it seems Zen 5 will still use a software driver scheduler rather than a hardware scheduler like Intel Thread Director, I'm not too hopeful.
 

I've still seen issues on Intel builds with E-cores, especially in Metro Exodus. Maybe that's been fixed by now, but there are definitely more hoops on the AMD side currently.

As cool as Process Lasso is, I'd prefer not to have to use it.

Probably just splitting hairs though; the top 5-6 gaming CPUs all perform pretty damn well in most stuff.
 
I think realistically any theoretical performance improvement from a dual CCD chip will simply be from being able to use software such as Process Lasso to keep the X3D chiplet exclusively for game threads, and pin literally anything else onto the frequency chiplet. Perhaps disabling HT/SMT would gain a further slight improvement, from the physical core/thread together with no need for SMT/HT security mitigations and HT/SMT off typically running slightly cooler. That's a manual tune though, and assumes that the other two CCD architecture issues vs single CCD don't get in the way of those gains.

Process Lasso also works with Intel P+E too, similar to their APO software, so it's not really any different, except it seems out of the box the hardware Thread Director tends to work a little better than AMD's Xbox Game Bar software driver system.

One thing I've noticed is that some anti-cheat enabled games don't allow you to change thread affinity/priority, so that sucks if the default scheduler gets it wrong.
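
For anyone wondering what "pinning to a chiplet" looks like underneath, here is a minimal sketch of how the affinity bitmasks could be built. It assumes logical processors are numbered contiguously per CCD (0-15 on the first CCD and 16-31 on the second with SMT on) and that the first CCD is the cache one; check your own topology before relying on that. The function name is hypothetical and this is not how Process Lasso itself is implemented.

```python
def ccd_affinity_mask(ccd_index: int, cores_per_ccd: int = 8, smt: bool = True) -> int:
    """Build a CPU affinity bitmask covering every logical processor on one CCD.

    Assumption: logical processors are enumerated CCD by CCD, so CCD n owns a
    contiguous run of bits starting at n * (logical processors per CCD).
    """
    threads_per_core = 2 if smt else 1
    width = cores_per_ccd * threads_per_core
    return ((1 << width) - 1) << (ccd_index * width)


# Assumed layout: CCD 0 = cache chiplet (games), CCD 1 = frequency chiplet (everything else).
game_mask = ccd_affinity_mask(0)
background_mask = ccd_affinity_mask(1)
print(hex(game_mask), hex(background_mask))
```

A hexadecimal mask like this is the format the Windows `start /affinity` command accepts, though as noted above some anti-cheat games will refuse affinity changes anyway.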
 


For me, Process Lasso alone only gets you 90% of the way there. I also have to set prefer cache in the BIOS, which adds a bit more oomph.

I've tried a ton of MP games... CoD, Destiny, Gears 5, Apex, Overwatch 2, Battlefield 2042. They all load just on my cache CCD.

I don't play MOBAs, CS, or Valorant, so I don't know about them.

The only program I can find that gives me issues is Time Spy for some reason. I haven't figured that one out yet, but all other synthetics are fine.
 
X3D improves gaming performance, but it's certainly not as easy an implementation as I bet AMD had hoped. Especially not in a dual-CCD CPU.
 
So basically the 9 series is the 7 series with higher clocks?
 
The whole thing is a huge circle jerk, as usual. Computerbase.de cites some rando AMD interview that says nothing. WccfTech cites computerbase.de, Compu cites wccftech... Why do we even use these people as sources? It's been decades, and from day 1 we've all known WccfTech's colors. This isn't journalism and I'm out of salt.
 

It's the middle of summer and there's nothing else to BS about. Arrow Lake and next-gen GPUs are probably 4-8 months out, and this comes at the end of the month.

To me it's all just friendly conjecture based on dubious sources. Nothing wrong with that, as long as everyone knows that until reviews hit it don't mean shite.
 