Friday, December 4th 2020

PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed

While I was messing around with an older SSD test system (not benchmarking anything) I wondered why the machine's performance was SO sluggish with the NVIDIA card I just installed. Windows startup, desktop, Internet, everything in Windows would just be incredibly slow. This is an old dual-core machine, but it ran perfectly fine with the AMD Radeon card I used before.
At first I blamed NVIDIA, but when I opened Task Manager I noticed one of my cores sitting at 100%—that can't be right.

Digging a bit further into this, it looks like RadeonSettings.exe is keeping one processor core at 100% load. Ugh, but there is no AMD graphics card installed right now.

Once that process was terminated manually (right click, select "End task"), performance was restored to expected levels and CPU load was normal again. This confirms that the AMD driver is the reason for the high CPU load. Ideally, before changing graphics card, you should uninstall the current graphics card driver, change hardware, then install the new driver, in that order. But for a quick test that's not what most people do, and others are simply not aware of the fact that a thing called "graphics card driver" exists, and what it does. Windows is smart enough to not load any drivers for devices that aren't present physically.
Looks like AMD is doing things differently and just pre-loads Radeon Settings in the background every time your system is booted and a user logs in, no matter if AMD graphics hardware is installed or not. It would be trivial to add a check "If no AMD hardware found, then exit immediately", but ok. Also, do we really need six entries in Task Scheduler?

I got curious and wondered how it is possible in the first place that a utility like the Radeon Settings control panel uses 100% CPU load constantly. That's something you'd expect from a mining virus that gets installed to use your electricity to mine cryptocurrency without you knowing. By the way, all this was verified to be happening on the Radeon 20.11.2 WHQL driver, 20.11.3 Beta and the press driver for an upcoming Radeon review.

Unless you're a computer geek you'll probably want to skip over the following paragraphs, but I still found the details interesting enough to share with you.

I attached my debugger, looked for the thread that's causing all the CPU load and found this:
[Screenshot: disassembly of the thread's loop]
It's hard to read; translated into C code it might make more sense:
[Screenshot: the same loop translated to C]
If you're a programmer you'd have /facepalm'd by now; let me explain. In a multi-threaded program, Events are often used to synchronize concurrently running threads. Events are a core feature of the Windows operating system. Once created, they can be set to "signaled", which notifies every other piece of code that is watching the status of this event, instantly and even across process boundaries. In this case the Radeon Settings program will wait for an event called "DVRReadyEvent" to get created before it continues with initialization. This event gets created by a separate, independent driver component that's supposed to get loaded on startup, too, but apparently never does. The Task Scheduler entries in the screenshot above do show "StartDVR". The naming suggests it's related to the ReLive recording feature that lets you capture and stream gameplay. I guess that part of the driver does indeed check if Radeon hardware is present, and will not start otherwise. Since Windows has no WaitForEventToGetCreated() function, the usual approach is to try to open the event until it can be opened, at which point you know that it does exist.
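
The "try to open it until it can be opened" pattern can be sketched in portable C. This is only an illustration under stated assumptions, not AMD's code: `try_open_fn`, `spin_until_open`, `fake_event` and `fake_try_open` are hypothetical names, and the callback stands in for a Windows `OpenEvent()` call, which returns a non-NULL handle once the named event exists.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "try to open the named event".
 * On Windows, OpenEvent() returning a non-NULL handle would play this role. */
typedef bool (*try_open_fn)(void *ctx);

/* Busy-wait as described in the article: keep trying until the open succeeds.
 * There is no limit and no pause, so this never returns if the event
 * is never created. Returns the number of attempts that were needed. */
unsigned long spin_until_open(try_open_fn try_open, void *ctx)
{
    unsigned long attempts = 0;
    for (;;) {
        attempts++;
        if (try_open(ctx))
            return attempts; /* the event exists now */
    }
}

/* Simulated event for experimenting: it "gets created" after a
 * given number of failed polls. */
struct fake_event {
    unsigned long failures_left;
};

bool fake_try_open(void *ctx)
{
    struct fake_event *ev = ctx;
    if (ev->failures_left == 0)
        return true;
    ev->failures_left--;
    return false;
}
```

With `struct fake_event ev = { 3 };`, calling `spin_until_open(fake_try_open, &ev)` returns after four attempts. If the event never appears, the call spins at full speed and never comes back, which matches what Task Manager showed.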

You're probably asking now, "what if the event never gets created?" Exactly, your program will be hung, forever, caught in an infinite loop. The correct way to implement this code is to either set a time limit for how long the loop should run, or count the number of runs and give up after 100, 1000, 1 million, you pick a number—but it's important to set a reasonable limit.
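
The retry limit described above might look like the following minimal sketch. The names `open_with_retry_limit` and `never_created` are made up for illustration, and the callback again stands in for the real "open the event" call; what the caller does after giving up (skip the feature, log, exit) is left open.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "try to open the named event". */
typedef bool (*try_open_fn)(void *ctx);

/* Poll at most max_attempts times, then give up.
 * Returns true if the event could be opened, false if the limit was hit. */
bool open_with_retry_limit(try_open_fn try_open, void *ctx,
                           unsigned long max_attempts)
{
    for (unsigned long i = 0; i < max_attempts; i++) {
        if (try_open(ctx))
            return true;
    }
    return false; /* caller decides what to do without the event */
}

/* Simulated component that never starts: every poll fails.
 * ctx points to a counter of how often it was asked. */
bool never_created(void *ctx)
{
    unsigned long *polls = ctx;
    (*polls)++;
    return false;
}
```

Called with `never_created` and a limit of 1000, the function returns `false` after exactly 1000 polls instead of hanging forever.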

A more subtle effect of this kind of busy waiting is that it will run as fast as the processor can, loading one core to 100%. While that might be desirable if you have to be able to react VERY quickly to something, there's no reason to do that here. The typical approach is to add a short bit of delay inside the loop, which tells the operating system and processor "hey, I'm waiting on something and don't need CPU time, you may run another application now or reduce power". Modern processors will adjust their frequency when lightly loaded, and even power down cores completely, to conserve energy and reduce heat output. Even a delay of one millisecond will make a huge difference here.
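
A rough sketch of the delay idea, combined with an overall time budget, could look like this. Everything here is an assumption for illustration: `wait_with_delay`, `sleep_ms_fn` and the `sim_*` helpers are hypothetical names, and the sleep is passed in as a callback so the example stays portable. A real program would wrap `Sleep()` on Windows or `nanosleep()` on POSIX systems.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "try to open the named event". */
typedef bool (*try_open_fn)(void *ctx);
/* Hypothetical sleep hook, e.g. a wrapper around Sleep() or nanosleep(). */
typedef void (*sleep_ms_fn)(unsigned ms, void *ctx);

/* Poll with a pause between attempts and give up after timeout_ms.
 * Returns true if the event appeared, false on timeout. */
bool wait_with_delay(try_open_fn try_open, sleep_ms_fn sleep_ms, void *ctx,
                     unsigned poll_interval_ms, unsigned timeout_ms)
{
    unsigned remaining_ms = timeout_ms;
    if (poll_interval_ms == 0)
        poll_interval_ms = 1; /* never fall back to spinning */

    for (;;) {
        if (try_open(ctx))
            return true;
        if (remaining_ms == 0)
            return false; /* time budget used up */

        unsigned step = poll_interval_ms < remaining_ms ? poll_interval_ms
                                                        : remaining_ms;
        sleep_ms(step, ctx); /* hand the CPU back instead of spinning */
        remaining_ms -= step;
    }
}

/* Simulated clock: the event becomes available at ready_at_ms. */
struct sim_clock {
    unsigned now_ms;
    unsigned ready_at_ms;
    unsigned polls;
};

bool sim_try_open(void *ctx)
{
    struct sim_clock *c = ctx;
    c->polls++;
    return c->now_ms >= c->ready_at_ms;
}

void sim_sleep(unsigned ms, void *ctx)
{
    struct sim_clock *c = ctx;
    c->now_ms += ms; /* pretend the time passed */
}
```

In the simulation, an event that shows up after 5 ms is found with six polls at a 1 ms interval, where a spinning loop would have polled as often as the CPU allows in the same time. An event that never shows up costs 11 polls with a 10 ms interval and a 100 ms budget, and then the function returns `false`.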

This is especially important during system startup, when a lot of things are happening at the same time and all need processor time to complete. It's why you feel you're waiting forever for your desktop to become usable when you start the computer. With Radeon Settings taking over one core completely, there's obviously less performance left for other startup programs to complete.

I did some quick and dirty performance testing in actual gameplay on an 8-core/16-thread CPU and found a small FPS loss, especially in CPU-limited scenarios, of under 1%, on the order of 150 FPS vs 151 FPS. This confirms that this can be an issue on modern systems, too, even though only about 6% of CPU power is lost (one logical core out of 16). The differences will be minimal though, and it's unlikely you'll subjectively notice them.

Waiting on synchronization signals is a very basic programming skill; most midterm students would be able to implement it correctly. That's why I'm so surprised to see such low-quality code in a graphics driver component that gets installed on hundreds of millions of computers. Modern software development techniques avoid these mistakes through code reviews, where one or multiple colleagues read your source code and point out potential issues. There's also "unit testing", which requires developers to write testing code that's separate from the main code. These unit tests can then be executed automatically, and "code coverage" measures what percentage of the program code is exercised by them. Let's just hope AMD fixes this bug, it should be trivial.

If you are affected by this issue, just uninstall the AMD driver from Windows Settings - Apps and Features. If that doesn't work, use DDU. It's not a big deal anyway, what's most important is that you are aware, in case your system feels sluggish after a graphics hardware change.

277 Comments on PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed

#101
lexluthermiester
R-T-B: Oh it's completely unintentional, but that doesn't make it any less stupid.
The more I look at the details the more I think it's an honest mistake, not a stupid one. Probably easily corrected with a quick patch.
R-T-B: For a rather extreme analogy, Chernobyl was unintentional. It was also completely brought about by lax practices.
Yup, that's out there... :kookoo::roll:
Khonjel: You don't get me.
No, I got you. I think you're not being very objective and holding a grudge.
#102
R-T-B
lexluthermiester: Yup, that's out there...
lol, Didn't say it wasn't, but it's the same degree of facepalm inducing error that you know was followed by others.

The shoe fits on the error foot, if not the severity one.
#103
Sybaris_Caesar
lexluthermiester: No, I got you. I think you're not being very objective and holding a grudge.
I'm not claiming to be the bastion of objectivity. But yes I know that everyone fucks up software sometimes. And otoh AMD produces good software sometimes. Anyway I'll end this back-and-forth. It's going nowhere.
#104
W1zzard
Xzibit: Well he did the mistake on a SSD test system. If hes doing that on a test systems i.e. not cleaning it while switching hardware and drivers. Its not far fetched what else has happened we havent had a PSA for.
Of course I didn't test (benchmark) anything like that, you think I don't know? I tested a new build of GPU-Z using Remote Debugger, on NVIDIA, to check some code changes. Benchmark performance is irrelevant in this case
john_: He run the game's benchmark. Ultra settings in game, stock settings for the card and Ryzen, 50 fps at 1440p. Your testing reported 39.6 fps.
The benchmark is useless, it does not represent actual gaming performance. Play the game, stand in Athens, what's the FPS? Turn around, what's the FPS now?
TheGoddessInari: An application that goes with hardware you've uninstalled wasn't removed, and doesn't know the hardware was removed
Of course it can know that the hardware isn't present anymore. Otherwise we wouldn't be seeing this problem at all. The DVR software does check for AMD hardware and doesn't start, the Settings software does not, and waits forever for the DVR software to start.
TheGoddessInari: multi-GPU bugs are left in GPU-Z
More details please, ideally in the GPU-Z bug report forum.
ScaLibBDP: Be careful when doing performance evaluations because such issues could skew results
They definitely will, which is why I mentioned the 0.5% FPS difference even on a 8c/16t system. I wouldn't be surprised if this affected some less experienced reviewers. As mentioned above I did not benchmark anything on this dual-core machine.
r9: You better explain yourself what you doing with dual core!
See beginning of this post. It's an old system that's useful to have around, and it works surprisingly well for these tasks.
srsbsns: It doesn't care about the why. Its software. You can explain why you tried using an unsupported graphics card all day but it doesnt change anything. AMD might need to patent some sort of mechanism that comes out of the PC and slaps the GPU out of your hand. Until that time comes software can only stop you from installing when you have an incompatible GPU in.
This is not how it works. Windows enumerates all hardware on bootup and does not load drivers for devices that are not present. Go to Device Manager and check "Show Hidden Devices". All these devices are "installed" in your system, and have the driver ready to go.
Mussels: It's a bug not a conspiracy
lexluthermiester: I really don't think AMD did this deliberately. It's got to be an unintentional problem...
Of course not, this is just a mistake
hurakura: I guess next to test is nVidia driver on a system whit Intel CPU and AMD gpu?
That's a great idea. If I find something, do you want me to post an article?
#105
_Flare
When ReLive was introduced, it caused some problems, around 2018 or so.
After some quick googling I found this:
community.amd.com/t5/drivers-software/amd-relive-host-application-high-cpu-usage/td-p/92347
Back in the day, some pointed at this problem only coming up when an Intel iGPU is present. Some needed to delete the DVR exe in C:/program files/AMD/C Next/Cnext
Some even couldn't install HWinfo after that problem occurred.
In my head, my own reminder until today has been: Don't install ReLive ever.
#106
kruk
@W1zzard: I somehow can't reply or quote directly, so I'm just pasting it:
You're probably asking now, "what if the event never gets created?" Exactly, your program will be hung, forever, caught in an infinite loop. The correct way to implement this code is to either set a time limit for how long the loop should run, or count the number of runs and give up after 100, 1000, 1 million, you pick a number—but it's important to set a reasonable limit.
Yes, the segment should definitely stop after reaching a threshold, but what if this check was always performed elsewhere and was removed at some point? Then the programmer that wrote this section didn't do anything wrong. In large codebases with big teams, this can happen. It could be a simple communication error ... Since QA probably doesn't test this on computers without Radeon GPUs, it's easy to see why it went undetected.

I hope you reported the bug ...
A more subtle effect of this kind of busy waiting is that it will run as fast as the processor can, loading one core to 100%. While that might be desirable if you have to be able to react VERY quickly to something, there's no reason to do that here. The typical approach is to add a short bit of delay inside the loop, which tells the operating system and processor "hey, I'm waiting on something and don't need CPU time, you may run another application now or reduce power". Modern processors will adjust their frequency when lightly loaded, and even power down cores completely, to conserve energy and reduce heat output. Even a delay of one millisecond will make a huge difference here.
That is fine in theory, but what if debounced listener has problems catching the signals? Then this process might take even longer or be stuck in an infinite loop forever even if you have a Radeon GPU. Just a guess ... :)
#107
Frick
Fishfaced Nincompoop
1sanpedro1: This article is such crap. Usually I like reading techpowerup, but a professional video card reviewer who doesn't do a DDU when switching GPUs??? Then saying there is a definite performance drop, 150 vs 151, a whole .5%. That sounds within margin of error. To top it off, there's no test to see if this is true going from Nvidia to AMD.

Then there are people on here who seem to believe that there's no need to uninstall drivers for major hardware changes. I'm sorry, but a GPU is a major hardware change. Do you guys also not cleanup your systems when you change from CPU vendors or change chipsets?
I've swapped systems entirely without reinstalling Windows (10), sometimes it has worked fine, sometimes less fine. I long for the day I never have to reinstall Windows again, but stuff like this highlights that we're not there yet and may never be.
#108
W1zzard
kruk: That is fine in theory, but what if debounced listener has problems catching the signals?
Not 100% sure what you are asking, but the OS makes certain guarantees about Events, they are unique and thread-safe
#109
TheGoddessInari
More details please, ideally in the GPU-Z bug report forum.
You commented on the bug several times, and said GPU reporting 0/0mhz values on Vega cards in multi-GPU configs wasn't an issue with your software (and only your software specifically, everything else has no issues), while leaving it unfixed forever.

You're really doing a bang-up job with this "unbiased" thing.

Next are you going to write a hitpiece about ARM and Android, while saying that iOS devices are absolutely not locked down? :P
#110
laszlo
even an inexperienced user won't install amd drivers for nv card ...only if he really don't know the difference

when changing gpu and manufacturer the old driver usually is uninstalled(by user-edited for any misunderstanding...) but it may happen to remain...

this issue may affect mostly the reviewers if they just swap the cards in the test system
#111
john_
W1zzard: The benchmark is useless, it does not represent actual gaming performance. Play the game, stand in Athens, what's the FPS? Turn around, what's the FPS now?
When comparing GPUs the whole story is to compare them under the same conditions and with the same parameters. You said that
W1zzard: FPS depends a lot on the game location, or did he play the benchmark? I use actual gameplay. Cities vs outside huge difference. AC:O is also difficult to repro run-to-run because dynamic weather and other random events
When you point at those difficulties in getting accurate readings, I believe you have to use the benchmark. Even if you don't like it. Because you want to compare video cards, or cpus under the absolute same conditions. You do extra performance tests when you are reviewing the game to see how specific hardware, does in all those different situations and how the game's graphic engine performs in different areas of the game. In a city with multiple NPCs visible, in the woods with plenty of vegetation, in an open field with just that dynamic weather. Running the game under a specific scenario, for example in a city, could be making it more CPU bound giving an extra advantage in high IPC CPUs, in the woods maybe it needs higher memory bandwidth from the graphics card favoring those cards and in an open field maybe it can hide in a small degree the differences in performance between different hardware because the scene is less demanding from both CPU and GPU.
#112
0x4452
laszlo: this issue may affect mostly the reviewers if they just swap the cards in the test system
I think that's a very good point. If a reviewer just swaps GPUs without uninstalling drivers, the NVIDIA card will be disadvantaged because the AMD driver will hose up a CPU core.
#113
TheGoddessInari
0x4452: I think that's a very good point. If a reviewer just swaps GPUs without uninstalling drivers, the NVIDIA card will be disadvantaged because the AMD driver will hose up a CPU core.
If a reviewer does this, they're literally not doing their job, and would get fired.
#114
turbogear
I always use DDU (Display Driver Uninstaller) before I change my GPU, especially when switching from AMD to NVIDIA or vice versa. :rolleyes:

Of course that would be a horror scenario for W1zzard, who is testing a lot of cards; uninstalling drivers after every GPU change would be terrible. :banghead:

I did the same when I went from Radeon VII to 6800XT, although on the same driver version for both, but looking at for example Wattman I see that the 6800XT has a completely different menu.
I am not sure if the Radeon Software would apply the proper settings if I had just switched cards without uninstalling the driver first, even though I installed the same version again. :confused:
In any case I did not want to take the risk of having unexplained problems. :laugh:
#115
efikkan
hurakura: So AMD bad, Intel and nVidia good. This is BS article to make AMD look bad. Why do you have drivers installed for a hardware not in the PC? I guess next to test is nVidia driver on a system whit Intel CPU and AMD gpu?
:banghead:
So should we avoid showcasing bugs for AMD then, just in case you fanboys can't handle it? :rolleyes:
1sanpedro1: This article is such crap. Usually I like reading techpowerup, but a professional video card reviewer who doesn't do a DDU when switching GPUs???
He provided a disassembly of a bug, then provided a translation to C for clarity along with an explanation.
This article is far more advanced than most other tech sites would publish; even the renowned Anandtech's "in-depth" articles usually only scratch the surface, and only sound technical to non-engineers.
1sanpedro1: Then there are people on here who seem to believe that there's no need to uninstall drivers for major hardware changes. I'm sorry, but a GPU is a major hardware change. Do you guys also not cleanup your systems when you change from CPU vendors or change chipsets?
You must be kidding, right?
Do you really expect every user to know how to use DDU when switching GPUs? Then at least shouldn't this be shown as a big disclaimer before buying such products?
lexluthermiester: I really don't think AMD did this deliberately. It's got to be an unintentional problem...
In 99.999% of cases such bugs are accidents, negligence, incompetence, etc.
I seriously hope no one in here thinks such bugs are intentional.
laszlo: when changing gpu and manufacturer the old driver usually is uninstalled but it may happen to remain...
What? :eek:
It's not uncommon for developers to have multiple GPUs from different makers.

If a system is "broken" because a driver is not uninstalled, then the driver is broken, end of discussion.
#116
laszlo
efikkan: What? :eek:
It's not uncommon for developers to have multiple GPUs from different makers.

If a system is "broken" because a driver is not uninstalled, then the driver is broken, end of discussion.
you have on your pc 2 dedicated gpu's one amd and one nv?

we don't know if this behavior also appear on system with amd apu's mixed with dedicated nv gpu so we need feedback from people in this situation... not valid as hardware is always there so no conflict...
#118
Mussels
Freshwater Moderator
Khonjel: You don't get me. When I see AMD's lacklustre Ray Tracing presentation and support in games I'm not disappointed. If AMD's DLSS counterpart sucks I won't be disappointed. It's AMD after all. AMD never invested in Crossfire so much that SLI is synonymous with multi-GPU, I never cared. It's AMD after all. AMD never had good OpenGL and DirectX 11 drivers but I wasn't disappointed, it's AMD after all. This guy in our forum can't play his games right now. I'm not disappointed cause it's AMD after all. This guy sold his 5600 XT after battling with issues for a year and I'm not disappointed. It's AMD after all. AMD locked 6800 XT to 2.8 Ghz and 6900 XT to 3.0 Ghz when they could literally have overclocking champs. Buildzoid the self-proclaimed AMD fan might be disappointed, der8auer could be disappointed but I'm not.

I just automatically set my expectations and hopes low for any AMD Radeon product.
You think gaming on 8GB of ram is AMD's fault?

Perhaps you're biased and reaching for examples...
laszlo: you have on your pc 2 dedicated gpu's one amd and one nv?

we don't know if this behavior also appear on system with amd apu's mixed with dedicated nv gpu so we need feedback from people in this situation... not valid as hardware is always there so no conflict...
I totally have a few times, it's called onboard graphics. i've also done it before testing GPU's, VBIOS flashing, etc etc. It's not common for a home user, but its also commonly problem free
#119
Flaky
wolf: This sort of constructive journalism gets issues fixed.
This!

Sad but true - bad press is often the only way to force corporations to fix their products. When contacted directly by user, they'll either not respond, straight up deny existence of a problem, or forward to customer support, with endless loop of "did you try to turn it off and on" and "did you try uninstalling and installing"...
And it doesn't matter if that's a bug affecting performance, or a security vulnerability.
There's no such thing as being nice. Bug? Make it loud. Vulnerability? Send emails to appropriate places, and file a CVE with 90 days disclosure deadline.
laszlo: when changing gpu and manufacturer the old driver usually is uninstalled but it may happen to remain...
"is"? Driver package doesn't uninstall itself automagically. The driver itself (.sys file) isn't loaded by OS when there's no GPU, but the settings app and services are still run in background.
#120
laszlo
Mussels: You think gaming on 8GB of ram is AMD's fault?

Perhaps you're biased and reaching for examples...



I totally have a few times, it's called onboard graphics. i've also done it before testing GPU's, VBIOS flashing, etc etc. It's not common for a home user, but its also commonly problem free
onboard graphic don't count in this case as hardware wise is always present so driver or not it can't create any issues
Flaky: This!

Sad but true - bad press is often the only way to force corporations to fix their products. When contacted directly by user, they'll either not respond, straight up deny existence of a problem, or forward to customer support, with endless loop of "did you try to turn it off and on" and "did you try uninstalling and installing"...
And it doesn't matter if that's a bug affecting performance, or a security vulnerability.
There's no such thing as being nice. Bug? Make it loud. Vulnerability? Send emails to appropriate places, and file a CVE with 90 days disclosure deadline.


"is"? Driver package doesn't uninstall itself automagically. The driver itself (.sys file) isn't loaded by OS when there's no GPU, but the settings app and services are still run in background.
i'm aware it won't uninstall by itself but user can do it and when changing manufacturer is common sens no?
#121
W1zzard
john_: When you point at those difficulties in getting accurate readings, I believe you have to use the benchmark. Even if you don't like it. Because you want to compare video cards, or cpus under the absolute same conditions. You do extra performance tests when you are reviewing the game to see how specific hardware, does in all those different situations and how the game's graphic engine performs in different areas of the game. In a city with multiple NPCs visible, in the woods with plenty of vegetation, in an open field with just that dynamic weather. Running the game under a specific scenario, for example in a city, could be making it more CPU bound giving an extra advantage in high IPC CPUs, in the woods maybe it needs higher memory bandwidth from the graphics card favoring those cards and in an open field maybe it can hide in a small degree the differences in performance between different hardware because the scene is less demanding from both CPU and GPU.
I guess this will end up "agree to disagree", but for everyone else:

I do compare them under the same conditions, using my own test scene. I played through the whole game to pick that scene, and I do claim my results will give you a more accurate representation than the benchmark of what to expect when you play the game. Obviously you can always pick a spot in any game that will give you different results than any other result. If you prefer to play the benchmark, so be it, look at other reviews. Don't you think my life would be MUCH easier if I just tested the benchmark, vs playing the same scene for hundreds of times? Another problem with nearly every integrated benchmark is that you are taking results off a cold card, which will boost much higher. For 30 seconds, and then performance drops. Ask your favorite reviewers about that.

Ultimately you'll have to trust me a little bit to do the right thing, if you don't, then you should absolutely not read my reviews.
Flaky: but the settings app and services are still run in background.
By default, Windows will not start anything in background for non-present devices. You actually have to do work (and not care about your non-customers) to launch something, separately from the OS logic
#122
laszlo
W1zzard: I guess this will end up "agree to disagree", but for everyone else:

I do compare them under the same conditions, using my own test scene. I played through the whole game to pick that scene, and I do claim my results will give you a more accurate representation than the benchmark of what to expect when you play the game. Obviously you can always pick a spot in any game that will give you different results than any other result. If you prefer to play the benchmark, so be it, look at other reviews. Don't you think my life would be MUCH easier if I just tested the benchmark, vs playing the same scene for hundreds of times? Another problem with nearly every integrated benchmark is that you are taking results off a cold card, which will boost much higher. For 30 seconds, and then performance drops. Ask your favorite reviewers about that.

Ultimately you'll have to trust me a little bit to do the right thing, if you don't, then you should absolutely not read my reviews.


By default, Windows will not start anything in background for non-present devices. You actually have to do work (and not care about your non-customers) to launch something, separately from the OS logic
@W1zzard is this "bug" present in older driver also or it can be pinned exactly from when appeared if case?
#123
W1zzard
laszlo: @W1zzard is this "bug" present in older driver also or it can be pinned exactly from when appeared if case?
I haven't checked, discovered it on 20.11.2 and only verified the newer drivers in case AMD fixed it in the meantime
#124
xantippe666
Divide Overflow: Someone figured out that running a driver for incorrect hardware can be an issue.
Ha, you won the Internet for today, sir. :D
#125
Vayra86
z1n0x: Maybe they can't find good graphics driver engineers in US/CA, they do not exactly grow on trees.
Mhm yeah they also totally havent been making and hopefully nurturing a driver team since forever, right?

This is inexcusable for such a big company. The only reason is what we have always suspected: lack of talent is cheaper. Its how they have kept both AMD and RTG afloat with minimal expense.

Its a trend, not an occurrence. And AMD condones the shit quality code. Driver and microcode oopsies happen all the time. Fix forward seems to be the approach and overall strategy. It is for that reason also that such anomalies in code exists. The impossible was forced to possible with dirty tricks. If you roll back and fix, you dont need those.

www.linkedin.com/pulse/service-recovery-rolling-back-vs-forward-fixing-mohamed-el-geish