Friday, December 4th 2020

PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed

While I was messing around with an older SSD test system (not benchmarking anything) I wondered why the machine's performance was SO sluggish with the NVIDIA card I just installed. Windows startup, desktop, Internet, everything in Windows would just be incredibly slow. This is an old dual-core machine, but it ran perfectly fine with the AMD Radeon card I used before.
At first I blamed NVIDIA, but when I opened Task Manager I noticed one of my cores sitting at 100%—that can't be right.

Digging a bit further into this, it looks like RadeonSettings.exe is pegging one processor core at 100% CPU load. Ugh, but there is no AMD graphics card installed right now.

Once that process was terminated manually (right click, select "End task"), performance was restored to expected levels and CPU load was normal again. This confirms that the AMD driver is the reason for the high CPU load. Ideally, before changing graphics card, you should uninstall the current graphics card driver, change hardware, then install the new driver, in that order. But for a quick test that's not what most people do, and others are simply not aware of the fact that a thing called "graphics card driver" exists, and what it does. Windows is smart enough to not load any drivers for devices that aren't present physically.
Looks like AMD is doing things differently and just pre-loads Radeon Settings in the background every time your system is booted and a user logs in, no matter if AMD graphics hardware is installed or not. It would be trivial to add a check "If no AMD hardware found, then exit immediately", but ok. Also, do we really need six entries in Task Scheduler?

I got curious and wondered how it is possible in the first place that a utility program like the Radeon Settings control panel uses 100% CPU load constantly, something that might happen when a mining virus gets installed to use your electricity to mine cryptocurrency without you knowing. By the way, all this was verified to be happening on the Radeon 20.11.2 WHQL driver, 20.11.3 Beta and the press driver for an upcoming Radeon review.

Unless you're a computer geek you'll probably want to skip over the following paragraphs, but I still found the details interesting enough to share with you.

I attached my debugger, looked for the thread that's causing all the CPU load and found this:
That's hard to read; translated into C code it might make more sense:
If you're a programmer you'd have /facepalm'd by now; let me explain. In a multi-threaded program, Events are often used to synchronize concurrently running threads. Events are a core feature of the Windows operating system. Once created, they can be set to "signaled", which will notify every other piece of code that is watching the status of this event, instantly and even across process boundaries. In this case the Radeon Settings program will wait for an event called "DVRReadyEvent" to get created before it continues with initialization. This event gets created by a separate, independent driver component that's supposed to get loaded on startup, too, but apparently never does. The Task Scheduler entries in the screenshot above do show "StartDVR". The naming suggests it's related to the ReLive recording feature that lets you capture and stream gameplay. I guess that part of the driver does indeed check if Radeon hardware is present, and will not start otherwise. Since Windows has no WaitForEventToGetCreated() function, the usual approach is to try to open the event until it can be opened, at which point you know that it does exist.
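
To illustrate how signaling an event wakes up a waiting thread, here is a minimal sketch in Python rather than the C used by the driver. Python's `threading.Event` is only an in-process analogue chosen for illustration: unlike a named Windows event it is not visible to other processes, but the set-and-wake idea is the same. The names `ready`, `waiter` and `results` are made up for this example.

```python
import threading

ready = threading.Event()   # a freshly created event starts out unsignaled
results = []

def waiter():
    # Blocks without spinning until the event is set, or gives up after 2 seconds.
    # wait() returns True if the event was signaled, False on timeout.
    signaled = ready.wait(timeout=2.0)
    results.append(signaled)

t = threading.Thread(target=waiter)
t.start()
ready.set()   # signal the event: every thread waiting on it is woken up
t.join()

print(results)
```

The waiting thread uses no CPU time while it is blocked, which is the whole point of events. The catch described above is that the event has to exist before anyone can wait on it.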

You're probably asking now, "what if the event never gets created?" Exactly, your program will be hung, forever, caught in an infinite loop. The correct way to implement this code is to either set a time limit for how long the loop should run, or count the number of runs and give up after 100, 1000, 1 million, you pick a number—but it's important to set a reasonable limit.
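
A bounded version of such a loop could look roughly like the sketch below, written in Python rather than the driver's C. `try_open` is a hypothetical stand-in for whatever call attempts to open the event (assumed here to return `None` while the event does not exist yet), and both limits are arbitrary numbers picked for illustration. Note that this only fixes the "forever" part; the loop still spins at full speed until it gives up.

```python
import time

def wait_until_created(try_open, timeout_s=5.0, max_attempts=1_000_000):
    """Call try_open() until it returns a handle; give up after a time or attempt limit."""
    deadline = time.monotonic() + timeout_s
    for _ in range(max_attempts):
        handle = try_open()          # hypothetical: returns None if the event isn't there yet
        if handle is not None:
            return handle            # the event exists now, carry on with initialization
        if time.monotonic() >= deadline:
            break                    # time limit reached
    return None                      # caller decides how to continue without the event

# An event that never shows up no longer hangs the program:
print(wait_until_created(lambda: None, timeout_s=0.05))   # prints None
```

Either limit alone would be enough to prevent the hang; a time limit is usually easier to reason about because the number of iterations per second depends on the machine.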

A more subtle effect of this kind of busy waiting is that it will run as fast as the processor can, loading one core to 100%. While that might be desirable if you have to be able to react VERY quickly to something, there's no reason to do that here. The typical approach is to add a short bit of delay inside the loop, which tells the operating system and processor "hey, I'm waiting on something and don't need CPU time, you may run another application now or reduce power". Modern processors will adjust their frequency when lightly loaded, and even power down cores completely, to conserve energy and reduce heat output. Even a delay of one millisecond will make a huge difference here.
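
To get a feel for the difference a short delay makes, the following Python sketch counts how often a polling loop runs with and without a one-millisecond sleep. It is not the driver's code; `poll` and `try_open` are hypothetical names, and the 50 ms window is an arbitrary choice to keep the demo short.

```python
import time

def poll(try_open, timeout_s, delay_s):
    """Poll try_open() until it succeeds or timeout_s passes. Returns (handle, attempts)."""
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while time.monotonic() < deadline:
        attempts += 1
        handle = try_open()
        if handle is not None:
            return handle, attempts
        if delay_s > 0:
            time.sleep(delay_s)      # hand the CPU back to the operating system
    return None, attempts

_, busy_attempts = poll(lambda: None, timeout_s=0.05, delay_s=0)        # busy wait
_, polite_attempts = poll(lambda: None, timeout_s=0.05, delay_s=0.001)  # 1 ms delay

print("busy wait attempts:", busy_attempts)
print("with 1 ms delay:   ", polite_attempts)
```

The exact numbers depend on the machine and the timer resolution of the operating system, but the busy wait typically runs orders of magnitude more iterations in the same time window, all of them wasted work.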

This is especially important during system startup, where a lot of things are happening at the same time, that need processor time to complete—it's why you feel you're waiting forever for your desktop to become usable when you start the computer. With Radeon Settings taking over one core completely, there's obviously less performance left for other startup programs to complete.

I did some quick and dirty performance testing in actual gameplay on an 8-core/16-thread CPU and found a small FPS loss, especially in CPU-limited scenarios, of around 1%, on the order of 150 FPS vs 151 FPS. This confirms that this can be an issue on modern systems, too, even though only about 6% of CPU power is lost (one logical core out of 16). The differences will be minimal though, and it's unlikely you'll subjectively notice them.

Waiting on synchronization signals is a very basic programming skill; most midterm students would be able to implement it correctly. That's why I'm so surprised to see such low quality code in a graphics driver component that gets installed on hundreds of millions of computers. Modern software development techniques avoid these mistakes through code reviews: one or more colleagues read your source code and point out potential issues. There's also "unit testing", which requires developers to write testing code that's separate from the main code. These unit tests can then be executed automatically, and tools can measure "code coverage", i.e. what percentage of the program code is actually exercised by the unit tests. Let's just hope AMD fixes this bug, it should be trivial.

If you are affected by this issue, just uninstall the AMD driver from Windows Settings - Apps and Features. If that doesn't work, use DDU. It's not a big deal anyway, what's most important is that you are aware, in case your system feels sluggish after a graphics hardware change.

277 Comments on PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed

#176
z1n0x
nguyen wrote: Look what coding oopsies did to the 737 MAX :roll: . After a whole year AMD has managed to iron out the oopsies from their previous drivers, only to make new ones.
Quite a leap you took there, from GPUs to 737 Max. Also are you saying that, Nvidia writes infallible code?

Alot programming gurus in this comment section, maybe you should apply for a job at AMD. Explain to them, how you're going to save them from themselves.
#177
TheGoddessInari
Max(IT) wrote: it's not rocket science maybe, except AMD developers aren't able to fix a very basic issue (you don't have the hardware, don't load the driver).
Not every user is aware of this problem. But AMD developers should be.
The driver isn't loading at all. Calling the settings application a hardware driver shows how knowledgeable you are about this so-called "issue". This is hyperbole at best, but realistically, it's another pathetic hitpiece against AMD that has no basis in fact or reason.

People keep making drama about this, but there's effectively no issue. Prattling on and beating a dead horse over nothing is just childish behavior. TPU should strive to be better than to devolve to such pettiness, in any event. This is making me really miss HardOCP.
#178
CoUsT
I have Radeon GPU and it still puts one thread at 100% usage. Check following image. i.imgur.com/XEbMEW2.png

You can clearly see I can use Radeon settings just fine and it even picks up GPU but it still takes one thread for itself. I just DDU'ed system again and it didn't change anything. I have to kill Radeon Settings process every time I use it.

Might as well downgrade to like 19.something version, at least it was nice and stable and not bloated.
#179
rtwjunkie
PC Gaming Enthusiast
TheGoddessInari wrote: The driver isn't loading at all. Calling the settings application a hardware driver shows how knowledgeable you are about this so-called "issue". This is hyperbole at best, but realistically, it's another pathetic hitpiece against AMD that has no basis in fact or reason.

People keep making drama about this, but there's effectively no issue. Prattling on and beating a dead horse over nothing is just childish behavior. TPU should strive to be better than to devolve to such pettiness, in any event. This is making me really miss HardOCP.
I'm glad you think W1zzard did a "hitpiece" against AMD. Just go ahead and ignore the very positive reviews of Ryzens and their latest GPU's. It also means you are completely ignorant of his early days, which were very much ATI related. Do I need to tell you what ATI was? What he did do was use his very good technical knowledge and investigate, then write a clear, level-headed post which would hopefully help one or two people.

I can tell you from experience helping people, regular people don't know squat and do this kind of thing all the time, and don't know how to fix it. This post will explain it and tell them what to do, since it will now like be a search engine result.
#180
nguyen
z1n0x wrote: Quite a leap you took there, from GPUs to 737 Max. Also are you saying that, Nvidia writes infallible code?
Alot programming gurus in this comment section, maybe you should apply for a job at AMD. Explain to them, how you're going to save them from themselves.
Yeah I should offer AMD a piece of advice: "stop asking your customers to beta test your drivers and hire real beta testers" :roll: .

Well I had some small bugs with Nvidia driver too but they only last for a very short time, like a week or two before a hotfix come out, didn't have to send email to Nvidia or anything.
#181
warrior420
srsbsns wrote: This one simple trick to fix this issue: Don't install drivers and bundled software on a system for a piece of hardware not installed.
Basically this.

Breaking news: Software installed on computer, does stuff on computer, news at 8.
#182
TheGoddessInari
CoUsT wrote: I have Radeon GPU and it still puts one thread at 100% usage. Check following image. i.imgur.com/XEbMEW2.png

You can clearly see I can use Radeon settings just fine and it even picks up GPU but it still takes one thread for itself. I just DDU'ed system again and it didn't change anything. I have to kill Radeon Settings process every time I use it.

Might as well downgrade to like 19.something version, at least it was nice and stable and not bloated.
Doesn't occur here, you might try reinstalling as you suggested.
rtwjunkie wrote: I'm glad you think W1zzard did a "hitpiece" against AMD. Just go ahead and ignore the very positive reviews of Ryzens and their latest GPU's. It also means you are completely ignorant of his early days, which were very much ATI related. Do I need to tell you what ATI was? What he did do was use his very good technical knowledge and investigate, then write a clear, level-headed post which would hopefully help one or two people.

I can tell you from experience helping people, regular people don't know squat and do this kind of thing all the time, and don't know how to fix it. This post will explain it and tell them what to do, since it will now like be a search engine result.
I've been buying products since the very first Radeon 32SDR became available, so no, you don't have to do the whole condescending attitude and assume. People can make mistakes, they're allowed. Publishing this ridiculous farce was a mistake, though, and admitting mistakes is the first step on the road to admitting problems that may exist.
#183
Mussels
Freshwater Moderator
Just confirmed i have this bug on my server - i did a GPU swap from an RX 570 to a GT 610 to save on idle wattage, and it's been sitting at 6.7% CPU usage for almost 10 days now....




oops.


edit: i tried clicking the tray icon only for it to vanish and stop wasting CPU power, somethings definitely weird on this one.
#184
nguyen
Mussels wrote: Just confirmed i have this bug on my server - i did a GPU swap from an RX 570 to a GT 610 to save on idle wattage, and it's been sitting at 6.7% CPU usage for almost 10 days now....




oops.


edit: i tried clicking the tray icon only for it to vanish and stop wasting CPU power, somethings definitely weird on this one.
Hey, it could be using your CPU for mining purposes for all you know :D
#185
OneMoar
There is Always Moar
z1n0x wrote: Quite a leap you took there, from GPUs to 737 Max. Also are you saying that, Nvidia writes infallible code?

Alot programming gurus in this comment section, maybe you should apply for a job at AMD. Explain to them, how you're going to save them from themselves.
cut the I am not a expert so you can't be one either bullshit please if you don't understand the gravity of why this is bad and your only focus is on MaH UnDerDoG please just show your self out

This is not going to have a job tomarrow bad if they find who ever signed off on this (assuming anybody at AMD cares about code quality which we know they don't ) =\

nobody mentioned Nvidia you did so you can stop with the bias crap

if I was AMD I would hire a reputable code auditing service and have them check there entire code for errors like this odds are there is more
#186
W1zzard
Mussels wrote: Just confirmed i have this bug on my server - i did a GPU swap from an RX 570 to a GT 610 to save on idle wattage, and it's been sitting at 6.7% CPU usage for almost 10 days now....
That's exactly why I made this post. So people are like "oh wait, could this affect me?" and spend 30 seconds looking at Task Manager. Guess those idle power savings from GT610 were used up by the CPU ;)
#187
vega22
so the take away from this are;

1. coders are lazy and,
2. remove software for hardware no longer in the system.

in other news, the sky is blue, grass is green and rain is wet :lol:

before the flames start, i think amd not doing their job is just as bad as leaving software on your system for hardware you removed. both to blame like.
#188
mb194dc
AMD bloatware installed with the drivers is very buggy and annoying.

Yesterday I put all OC settings directly in to the bios of my card so don't need use it anymore. Tried using afterburner but it was messing stock bios fan profile up.

Bit extreme maybe but feel much better just using the driver.
#189
Hattu
As a "hobby programmer"(*), i've made similar coding mistakes regarding timeouts with unplugged hardware that i made. I think this article was very informative and as i see it, unbiased.

But things escalate quickly.

(*) I started with C64 basic and asm. Then moved on to PCs and Pascal. After that Delphi and AVR asm. It was very hard to learn Windows programming and while i loved coding, every now and then i needed something new that i had to learn. And part of me hated that. Haven't coded a line like in 8 or 10 years, and now, if i want to start again, i must learn a new language, like C...
#190
b1k3rdude
What am I missing here? @Wizzard you forgot to remove the driver and the software before pulling the gfx card or the coding in the associated driver application is crap. YOU as the user are always supposed to uninstall driver...!!! This WHOLE article could have been condensed down a single line -

"When removing a gfx card remember to always uninstall the driver & associated application, if that doesn't work, use DDU. "
#191
Mussels
Freshwater Moderator
Are people not getting that this is just reporting a bug?


Like.... it's a PSA not a call to arms.
#192
efikkan
R-T-B wrote: There is literal evidence of shitty coding practices in the OP of this article. It's not FUD at this point, it's a question of how deep the rabbit hole goes.
You SHOULD be asking yourself how much "fine wine" could be gained by fixing all the sure-to-be-found similar crap in the driver. It could be extraordinary.
Like I argued in #40, the problem Wizz found here is just a symptom of something bigger, the tip of the ice berg if you will. In terms of debugging, we are actually "lucky" when bugs cause consistent stalls or crashes, those are easy to attach a debugger and find, and should be found by AMD if they did proper testing. Most synchronization issues are often much harder to reproduce consistently, and often disappear when you attach a debugger.

I disagree about AMD just fixing similar crap and getting extraordinary results. Don't get me wrong, every bug should be fixed, but the inconsistent reliability issues I've seen over many years with AMD drivers tells me there is probably some larger "design flaw". If this was easily fixable, AMD would have fixed it a long time ago.
R-T-B wrote: Me too. My experience (and attempt to help others) with the 5700 XT is well documented here on the forums. In particular, DX11 cpu overhead is absurd.
Perhaps the overhead is "absurd" if you make an isolated test case, but it's not absurd in practice.
Nevertheless, AMD could easily do what Nvidia did, by bringing most of the driver side improvements of DirectX 12 to 11, but that would ruin the image of AMD being better at DirectX 12 though.
R-T-B wrote: With all due respect, no commercial programmer should make a mistake like this ever. It's... I guess you just have to be a programmer to understand. It's like trying to hard boil an egg without water. It shows you have no business in the kitchen.
Really?
Have you worked at code bases of 100.000s or millions of lines of code, possibly with an awful complex structure?
Keep in mind that we are talking about a minor "glitch" here, which could be either a careless mistake or even the result of a bad merge. All programmers do small mistakes, and I'll be the first one to admit doing some embarrassing ones, but what really shows programming skills (or lack thereof) is how problems are solved, not a tiny mistake. And I mean no disrespect here, but having such attitudes as an engineer is not healthy.

One of the bigger problems I've had in development teams over the years is that lesser coders don't dare to challenge my work, even when I've strongly encouraged them to try to break it. So getting good QA can sometimes be challenging.
lexluthermiester wrote: Evidence of a mistake, yes. However, the problem that exists strikes me as one that is fairly complicated and not something that could have been anticipated as a potential issue. This was an honest mistake much like the ones Intel, NVidia, Apple, Google, Microsoft, Adobe, etc, etc have made. Computer code is extremely complicated. People really need to stop making mountains out of mole-hills.
Mostly true, yes.
But regarding anticipating issues; all such software projects should have routines designed to validate that a release is working reasonably well. While I don't expect anyone to never make a bug, it is astonishing that they didn't test if the driver behaved erratically in a system with a different GPU present, this should certainly be in their test suite.
Edit: Let me take another example; some years ago AMD managed to ship two drivers in a row, both failing to compile most GLSL shaders, even basic ones. I still don't understand how it's "possible" to ship a driver without validating basic stuff like this.
b1k3rdude wrote: What am I missing here? @Wizzard you forgot to remove the driver and the software before pulling the gfx card or the coding in the associated driver application is crap. YOU as the user are always supposed to uninstall driver...!!! This WHOLE article could have been condensed down a single line -

"When removing a gfx card remember to always uninstall the driver & associated application, if that doesn't work, use DDU. "
This nonsense has been debunked several times, there are many reasons to have different GPUs present, such as;
APU + GPU
Developers or other engineers having multiple GPUs for various compute and simulations

Even APIs like DirectX 12 and Vulkan is designed to work with multiple GPUs from different makes. There is simply no excuse when a driver suite don't handle this.
#193
lexluthermiester
vega22 wrote: 1. coders are lazy and,
Nope. Do we really need the insults?
vega22 wrote: 2. remove software for hardware no longer in the system.
Yup. That's just a good rule of thumb.
Mussels wrote: Like.... it's a PSA not a call to arms.
Exactly!
#194
b1k3rdude
efikkan
  • there are many reasons to have different GPUs present, such as - APU + GPU. Developers or other engineers having multiple GPUs for various compute and simulations
  • There is simply no excuse when a driver suite don't handle this.
  • Correct and I have done this myself on a few occasions.
  • There is never an excuse for poorly behaving software, but leaving the driver & associated application installed after the relevent GPU is no longer present is just asking for trouble.
#195
efikkan
b1k3rdude wrote: There is never an excuse for poorly behaving software, but leaving the driver & associated application installed after the relevent GPU is no longer present is just asking for trouble.
These are contradictory statements.
If your software can't handle other hardware being present, then your software is broken.
And as I said, both DirectX 12 and Vulkan is designed to have a mix of GPUs, so they should be aware of this and test it before shipping a new driver.
#196
Nater
No time to read the whole thread, but I get a sense of the back and forth now...

Has anyone here independently reproduced the issue?
#197
TheoneandonlyMrK
efikkan wrote: These are contradictory statements.
If your software can't handle other hardware being present, then your software is broken.
And as I said, both DirectX 12 and Vulkan is designed to have a mix of GPUs, so they should be aware of this and test it before shipping a new driver.
Clearly not worked in a test department.
That's a Op reading fail right there.

The problem occurs when hardware ISN'T there but the software for it IS.

Who uses Mgpu with multiple cards missing and how?.

To test for this AMD's test department would have had to do similar, remove their GPU and use Dgpu or fit a competition GPU while leaving their software on, I have done application testing ,you stick to thing's you expect to happen not such outliers typically.

Now a code review could find these issues and really is required at this point but some of your expectations for testing are ridiculous IMHO.
#198
rtwjunkie
PC Gaming Enthusiast
Nater wrote: No time to read the whole thread, but I get a sense of the back and forth now...

Has anyone here independently reproduced the issue?
Yes, if you read the thread: @Mussels did on his server.
TheGoddessInari wrote: Doesn't occur here, you might try reinstalling as you suggested.



I've been buying products since the very first Radeon 32SDR became available, so no, you don't have to do the whole condescending attitude and assume. People can make mistakes, they're allowed. Publishing this ridiculous farce was a mistake, though, and admitting mistakes is the first step on the road to admitting problems that may exist.
No way of knowing if you knew what ATI was, that’s why I asked. Forums get filled with new members who are young and think anything from 5 years ago is ancient history, and know nothing of the past while thinking they have all the answers. So no, not condescending.

Second, the only reason this is a “farce” to you is because you’re obviously acting defensive. It was a clear, in depth investigation into a problem, using coding skills, and published as a bug report and advisory. I’m sorry you live in such a perfect world that no one should ever learn anything from other’s mistakes (W1zzard’s and AMD’s both). There, now THAT was condescending.
#199
R-T-B
efikkan wrote: Perhaps the overhead is "absurd" if you make an isolated test case, but it's not absurd in practice.
In the sense that it impacts actual fps in a non-cpu bound game, no, it's not.

In the sense that it is in actual cpu usage vs the competition (and that will hurt cpubound games), yes, it most certainly is. I've looked into this a lot. dxvk outperforms their dx11 driver in cpu overhead.
#200
Vayra86
z1n0x wrote: I'm sure, you will be able to share with us, some personal stories about the talentless AMD employees and how they condone shitty quality work.
Driver oopsies happens with everyone. Many WHQL Nvidia drivers have Hotfix releases.
I don't know about microcode problems with AMD, what i do know is that, i had to update my Intel chipset firmware yet again, because of 20+ CVE's.
The amount of times i had to update IME firmware because of CVE's is mind-boggling.
Its very well possible AMD has talented engineers on those drivers, but if they do, they sure don't get to do their job right. Mismanagement can be a cause of bad software releases as much as lack of talent. Or both. Who knows, I just judge the results. AMD driver issues happen and they're often pretty influential on the experience, and fixes don't always come as quickly as you'd want. Nvidia has its oopsies too, but fixes a lot faster, and the magnitude of those oopsies is often less impactful.