
AMD Introduces Dynamic Local Mode for Threadripper: up to 47% Performance Gain

W1zzard

AMD has made a blog post describing an upcoming feature for their Threadripper processors called "Dynamic Local Mode", which should help a lot with gaming performance on AMD's latest flagship CPUs.



Threadripper uses four dies in a multi-chip package, of which only two have a direct access path to the memory modules. The other two dies have to rely on Infinity Fabric for all their memory accesses, which comes with a significant latency hit. Many compute-heavy applications can run their workloads in the CPU cache, or require only very little memory access; these are not affected. Other applications, especially games, spread their workload over multiple cores, some of which end up with higher memory latency than expected, which results in suboptimal performance.



The concept of multiple processors having different memory access paths is called NUMA (Non-uniform memory access). While technically it is possible for software to detect the NUMA configuration and attach each thread to the ideal processor core, most applications are not NUMA aware and the adoption rate is very slow, probably due to the low number of systems using such a concept.
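To illustrate what "NUMA aware" means in practice, here is a minimal sketch of how a program could discover which cores belong to which NUMA node and pin itself to one of them. It assumes Linux (the sysfs path `/sys/devices/system/node` and `os.sched_setaffinity` are Linux-only) rather than the Windows setups discussed here, and the helper names `parse_cpulist`, `numa_nodes` and `pin_to_node` are made up for this example.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node id -> set of CPU ids, as reported by Linux sysfs."""
    nodes = {}
    for node_dir in Path(sysfs_root).glob("node[0-9]*"):
        node_id = int(node_dir.name[len("node"):])
        nodes[node_id] = parse_cpulist((node_dir / "cpulist").read_text())
    return nodes


def pin_to_node(node_id, pid=0):
    """Restrict a process (0 = the calling process) to one node's CPUs."""
    os.sched_setaffinity(pid, numa_nodes()[node_id])  # Linux-only call
```

A NUMA-aware application would call something like `pin_to_node(0)` before starting its latency-sensitive work; which node is the "ideal" one depends on where its memory is allocated, which this sketch does not try to determine.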



On Threadripper, users can use Ryzen Master to switch between "Local Memory Access" mode and "Distributed Memory Access" mode. The latter is the default for Threadripper and gives the highest performance in compute applications. Local Mode, on the other hand, is better suited to games, but switching between the modes requires a reboot, which is very inconvenient for users.

AMD's new "Dynamic Local Mode" seeks to abolish that requirement by introducing a background process that continually monitors the CPU usage of all running applications and pushes the busier ones onto the cores that have direct memory access. It does so by adjusting each application's process affinity mask, which selects the processors the application is allowed to be scheduled on. Applications that require very little CPU are in turn pushed onto the cores without direct memory access, because fast execution matters less for them.
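The idea can be sketched roughly as follows. This is not AMD's implementation: the function names, the 10% `busy_threshold`, and the simple busy/idle split are assumptions made for illustration, since the blog post summary does not describe the actual heuristic. The sketch only computes which cores each process should be allowed on and encodes that as an affinity bitmask (bit i set means core i is allowed).

```python
def plan_affinity(cpu_usage, local_cores, remote_cores, busy_threshold=10.0):
    """Decide which cores each process may run on.

    cpu_usage maps a process name to its CPU usage in percent.
    Busy processes get the cores with direct memory access,
    the rest get the cores that reach memory over the fabric.
    busy_threshold is a hypothetical cut-off, not an AMD value.
    """
    plan = {}
    for name, usage in cpu_usage.items():
        if usage >= busy_threshold:
            plan[name] = set(local_cores)
        else:
            plan[name] = set(remote_cores)
    return plan


def to_mask(cores):
    """Encode a set of core ids as an affinity bitmask (bit i = core i)."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


# Toy example: cores 0-3 have direct memory access, cores 4-7 do not.
plan = plan_affinity({"game": 85.0, "updater": 1.5},
                     local_cores={0, 1, 2, 3},
                     remote_cores={4, 5, 6, 7})
for name, cores in plan.items():
    print(name, sorted(cores), hex(to_mask(cores)))
```

A real background service would re-run such a plan periodically and hand the resulting masks to the operating system's affinity API.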



This update will be available starting October 29 in Ryzen Master, and will be automatically enabled unless the user manually chooses to disable it. AMD also plans to open the feature up to even more users by including Dynamic Local Mode as a default package in the AMD Chipset Drivers.

 
good for gaming, it seems
 
Not sure why this isn't in the CPU driver and needs a whole application to handle it.
 
Because it is changing affinity of processes. Good luck getting a driver to do that under WDF and getting WHQL certification.

Not sure why this isn't in the CPU driver and needs a whole application to handle it.
 
Good job AMD..
 
Nice, but until it makes its way into the regular Ryzen lines, it just won't have enough impact.
 
Nice! Better gaming performance is always warmly welcomed.
One more reason why a developer AND gamer can consider buying a Threadripper.
 
Nice, but until it makes its way into the regular Ryzen lines, it just won't have enough impact.

AFAIK Ryzen doesn't use NUMA.
 
This is great competition and will help to put the wind up Intel. It all depends on that framerate improvement though, of course.

It's a better architecture from a performance standpoint to have a monolithic design like Intel, but AMD's approach has the advantage of scalability, which can be just as important.
 
My dual xeon uses numa. :)
 
If it can handle this automatically, great.
 
Nice, but until it makes its way into the regular Ryzen lines, it just won't have enough impact.

Regular Ryzen CPUs don't need this. This optimization is useless for them because regular Ryzen CPUs use only a single die.
 
This is great competition and will help to put the wind up Intel. It all depends on that framerate improvement though, of course.

It's a better architecture from a performance standpoint to have a monolithic design like Intel, but AMD's approach has the advantage of scalability, which can be just as important.

Not really, Threadripper isn't something I'd buy primarily for gaming ;)
But as always, improving on the weak spots makes the whole package stronger overall.
 
Interesting stuff, hopefully it will be useful to close the gap with Intel in terms of gaming performance.
 
Because it is changing affinity of processes. Good luck getting a driver to do that under WDF and getting WHQL certification.

You have been here since 2006, never filled out your specs and hardly post, yet you would on a topic relating to AMD. Hmmm
 
Saying, "Jim Keller is a Genius." might be an understatement.
Jim Keller has been at Intel since the launch of OG Ryzen. This is mostly AMD's own work.
 
You have been here since 2006, never filled out your specs and hardly post, yet you would on a topic relating to AMD. Hmmm

This whole comment seems wholly pointless so I thought I'd add another one.
 
Not really, Threadripper isn't something I'd buy primarily for gaming ;)
But as always, improving on the weak spots makes the whole package stronger overall.
I'm replying generally, not to your particular case. And anyway, if it gives a better framerate in games, it will also give better general application performance.

Before I get "corrected" by anyone, I'm talking about a like-for-like comparison, e.g. a 16-core Intel CPU compared with a 16-core AMD CPU with the same or very similar clock speeds. A better IPC from Intel, coupled with a monolithic design, will result in a performance win for Intel in everything.
 
Jim Keller has been at Intel since the launch of OG Ryzen. This is mostly AMD's own work.
On a project that's Jim Keller's brainchild.
As he himself acknowledges, it's always more about the team; one man can't possibly do all the work to build a CPU these days. Still, a series of highly successful CPUs trails his work history.

I'm replying generally, not to your particular case. And anyway, if it gives a better framerate in games, it will also give better general application performance.

Before I get "corrected" by anyone, I'm talking about a like-for-like comparison, e.g. a 16-core Intel CPU compared with a 16-core AMD CPU with the same or very similar clock speeds. A better IPC from Intel, coupled with a monolithic design, will result in a performance win for Intel in everything.
Well, I still have to correct you :D
Intel's IPC is virtually the same as AMD's now (within a few percent). Intel's win is because of higher frequencies.
 
On a project that's Jim Keller's brainchild.
As he himself acknowledges, it's always more about the team; one man can't possibly do all the work to build a CPU these days. Still, a series of highly successful CPUs trails his work history.


Well, I still have to correct you :D
Intel's IPC is virtually the same as AMD's now (within a few percent). Intel's win is because of higher frequencies.
AMD is still, what, 10% slower on IPC? That's still a win for Intel. Also, notice that I said Intel would win, not by how much. That depends on specific cases which is outside the scope of my comment.
 
Makes me all warm inside, but really needs to be in the driver, unless I am missing something.
 
So AMD can fix what Microsoft can't fix in the scheduler.

Because Microsoft is too lazy to

I'm replying generally, not to your particular case. And anyway, if it gives a better framerate in games, it will also give better general application performance.

Before I get "corrected" by anyone, I'm talking about a like-for-like comparison, e.g. a 16-core Intel CPU compared with a 16-core AMD CPU with the same or very similar clock speeds. A better IPC from Intel, coupled with a monolithic design, will result in a performance win for Intel in everything.

Considering how stupid the scheduler in Windows is, this would fix the gaming performance through Ryzen/Threadripper Master, like AMD advertised for the 2990WX CPU.

Either way, it's a win for either team.
 
Considering how stupid the scheduler in Windows is, this would fix the gaming performance through Ryzen/Threadripper Master, like AMD advertised for the 2990WX CPU.

Either way, it's a win for either team.
Yes, it would certainly improve performance and I'm quite keen to see how close AMD can get to Intel with this fix.
 