
An interesting AMD GPU issue.

Status
Not open for further replies.

Atlas39

New Member
Joined
Apr 10, 2024
Messages
10 (0.02/day)
Hello everyone. First of all, please forgive my poor English, I will try to explain the problem I have experienced in detail.

I'm experiencing a black screen followed by a restart issue while playing games and conducting GPU stress tests.
The issue doesn't occur every time, and it's unpredictable; it happens randomly.

For example, when playing Cyberpunk 2077 with the CPU at stock settings, specifically at 3600 MHz, I encounter a black screen and restart issue. However, when I enable CPU Core Performance Boost, I can play for hours without any problems.

Sometimes when I overclock the CPU, the problem is temporarily resolved. And sometimes, when I overclock the VRAM, the problem is also temporarily resolved, but it starts again after a while.
In some cases, when I set the GPU fan speed to 100%, the problem is temporarily resolved, but then it starts again later.

I'm sure this is not a heating or PSU-related issue because during stress tests, the CPU reaches a maximum of 75 degrees Celsius, the graphics card reaches a maximum of 65-70 degrees Celsius (hotspot 85-90 degrees Celsius), VRAM temperature does not exceed 70 degrees Celsius, and socket temperature stays between 60-65 degrees Celsius.

I also tried with three different PSUs:

  • Gigabyte P650B
  • Xigmatek X-Power 650W
  • Deepcool PK660D
When I changed the PSU, the problem was temporarily resolved for a few days, but then it started again.

Sometimes when I install pro drivers, the problem is temporarily resolved, but it starts again after a while. Also, when I run an OCCT power test for an hour, there are no issues, but the next day when I run the power test again, the computer shuts down within 1 minute.

I believe this is a software or stability-related issue. I've tried all AMD drivers, and although the problem is sometimes temporarily resolved, it starts again later.

My prebuilt PC Specs:

Processor: AMD Ryzen 7 3700X
GPU: Sapphire Nitro+ RX 6600 XT
RAM: 32 GB 3200 MHz (idk brand)
Motherboard: Asus B450M Dragon
SSD: 1 TB Kioxia Exceria G2
HDD: 2 TB Seagate 7200 rpm
PSU: 550W Deepcool PK550D Bronze


The solutions I've tried:

* DDU & every driver version
* installed only driver option
* completely disassembled the computer and reassembled it.
* replaced thermal paste & pads
* tried different RAM
* removed XMP (D.O.C.P.)
* reset CMOS
* undervolted & underclocked GPU

I learned that there is no VRM heatsink on the motherboard, but I can run CPU stress tests for hours without any issues. The problem only occurs when I run GPU 3D tests.
 
Event viewer might give you some hints. After a crash, check event viewer and see if there are any critical events logged
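If the log is long, it can help to export it and tally what actually got logged around the crashes. Below is a minimal sketch, assuming the log was exported from Event Viewer to CSV; the column names and the sample rows are made up for illustration and may not match your export exactly. Keep in mind that Kernel-Power 41 only records that the system restarted without shutting down cleanly, it does not say why.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the shape of an Event Viewer CSV export.
# The column names and rows are assumptions for illustration only.
SAMPLE_EXPORT = """Level,Date and Time,Source,Event ID,Task Category
Critical,2024-04-08 21:14:03,Microsoft-Windows-Kernel-Power,41,(63)
Error,2024-04-08 21:15:10,EventLog,6008,None
Critical,2024-04-09 19:02:47,Microsoft-Windows-Kernel-Power,41,(63)
Information,2024-04-09 19:03:30,Microsoft-Windows-Kernel-General,12,(1)
"""


def summarize_critical_events(csv_text):
    """Count Critical and Error events per (source, event ID) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        if row["Level"] in ("Critical", "Error"):
            counts[(row["Source"], int(row["Event ID"]))] += 1
    return counts


summary = summarize_critical_events(SAMPLE_EXPORT)
for (source, event_id), n in summary.most_common():
    print(f"{source} ID {event_id}: {n}")
```

If the only thing that ever shows up is Kernel-Power 41, that points at a hard reset or power loss rather than a driver crash that Windows had time to record.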
 
I had terrible issues (different from yours) after switching to 4K gaming.
They were completely fixed by just enabling paging file in Windows settings.

I broke my head searching for reason in my RX 6800 Nitro+
 
What OS are you using & how old is the install?
 
Only way to rule out software at this point is with a clean windows installation, only install the drivers and everything else necessary and test one of these games again.
 
Event viewer might give you some hints. After a crash, check event viewer and see if there are any critical events logged

when i check event viewer there is only Kernel-Power 41 (63)

Did you check if your RAM is crashing your system yet?

Why do you think this is a GPU failure and not some other component (like RAM) ?

yes i tried with different ram and also cleaned sockets and sticks still same
and when i test with my very old gpu ( gtx 1050 ti ) stress tests and games works flawlessly
I had terrible issues (different from yours) after switching to 4K gaming.
They were completely fixed by just enabling paging file in Windows settings.

I broke my head searching for reason in my RX 6800 Nitro+

i will try that too today i hope it will fix mine too

What OS are you using & how old is the install?

i tested both windows 10 and 11 pc is 2 years old

Only way to rule out software at this point is with a clean windows installation, only install the drivers and everything else necessary and test one of these games again.

i did many times, i even turned off automatic driver installation, still the same
 
Are you undervolting your GPU? Every single crash i've had on my 7900 XTX has been because I pushed my uv too far
 
i did many times, i even turned off automatic driver installation, still the same
Then it's probably the GPU. When you say "the problem is temporarily fixed by doing X" that doesn't actually mean anything, it just means the problem is still there but it happens intermittently.
 
3 red flags I see:

1: 32GB of randoram. It's probably not stable.

2: Three low quality power supplies. I would bet the ripple is pretty bad, possibly bad enough to cause instabilities.

3: No mention of case setup and air flow. If the PSU is mounted incorrectly it will eventually overheat and make the ripple even worse.
 
Are you using Display Port?

I had to get a higher grade DP cable back on the RTX 3070 for the same issue, it would randomly black screen at 170hz at 1440P.
I ended up buying 8K certified DP cables and it's been absolutely solid since and now with my RX 7900 XT too.
 
Any .dmp files in C:\Windows\LiveKernelReports ? If yes, archive and upload them here
 
get a new SSD, anything 128GB or larger will do.
make a microsoft win 10 or 11 usb stick.
remove ssd, install new ssd, install windows.
same problems?
 
2: Three low quality power supplies. I would bet the ripple is pretty bad, possibly bad enough to cause instabilities.

3: No mention of case setup and air flow. If the PSU is mounted incorrectly it will eventually overheat and make the ripple even worse.
This forum is oddly obsessed with blaming PSUs every time, the guy said he tried 3 different PSUs, this is a 6600XT we're talking about, his entire system probably doesn't even hit 300W spikes. It's very obvious at this point that the PSU cannot be at fault here.
 
This forum is oddly obsessed with blaming PSUs every time, the guy said he tried 3 different PSUs, this is a 6600XT we're talking about, his entire system probably doesn't even hit 300W spikes. It's very obvious at this point that the PSU cannot be at fault here.
It's not the power draw, it's the ripple.
 
This forum is oddly obsessed with blaming PSUs every time, the guy said he tried 3 different PSUs, this is a 6600XT we're talking about, his entire system probably doesn't even hit 300W spikes. It's very obvious at this point that the PSU cannot be at fault here.
Because transient spikes under load are real and garbage PSUs can't handle it. Don't look at the watts on the label, read the rated peak and sustained amps.
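To put rough numbers on the label-reading advice: the +12V rail is what feeds the CPU and GPU, and its capacity is simply volts times amps. A minimal sketch, where the 45 A rating is a hypothetical label value (not taken from any of the PSUs in this thread) and the 300 W peak is the estimate quoted earlier in the discussion.

```python
def rail_watts(volts, amps):
    """Power a rail can deliver at its rated current."""
    return volts * amps


def headroom_ratio(rail_capacity_w, estimated_peak_w):
    """How many times the estimated peak fits into the rail capacity."""
    return rail_capacity_w / estimated_peak_w


LABEL_12V_AMPS = 45.0     # hypothetical value read off a PSU label
ESTIMATED_PEAK_W = 300.0  # system peak estimate quoted in this thread

capacity = rail_watts(12.0, LABEL_12V_AMPS)
ratio = headroom_ratio(capacity, ESTIMATED_PEAK_W)

print(f"+12V capacity: {capacity:.0f} W")
print(f"Headroom over estimated peak: {ratio:.1f}x")
```

This only covers sustained capacity. It says nothing about how a unit behaves during very short transients or how clean its output is, which is the part the label does not tell you.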
 
Because transient spikes under load are real and garbage PSUs can't handle it. Don't look at the watts on the label, read the rated peak and sustained amps.
I pointed out that his system won't peak at even half of what these PSUs are rated for, do you really think it's probable that 3 different PSUs, even if they are garbage, all couldn't even take half the transient spikes they're meant to be able to handle?
 
I pointed out that his system won't peak at even half of what these PSUs are rated for, do you really think it's probable that 3 different PSUs, even if they are garbage, all couldn't even take half the transient spikes they're meant to be able to handle?
Again it's not a capacity issue or transient issue when using junker PSUs on performance parts, it's a ripple issue. It's especially a problem when these parts are turbo'd to the edge. ATX spec is quite liberal, so something on the fringe will fail to meet the clock cycle.

Basic troubleshooting 101 that they teach you in ET school is start with the output (in this case described by the OP as an eventual crash), then the input (PSU, known junkers), then half split from there. Once we get past the certification of clean power, then we can move on to the other components in the chain.

I'm curious how the actual PSUs are being mounted. I've seen people starving PSUs of fresh cool air, and it's a huge problem if the PSU is only rated for 40C or less. The capacitors begin to overheat over time, causing the ripple to become that much worse as it is not filtered. The symptom (output failing after some time) points to this or a memory system instability, but we need to verify power first.
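For anyone who can actually get a scope on the rails, checking ripple against the spec is a simple comparison. A minimal sketch, assuming the commonly cited ATX12V ripple and noise limits of 120 mV peak-to-peak on +12V and 50 mV on +5V and +3.3V (verify these against the design guide for your PSU generation); the measured values are hypothetical, not readings from the OP's system.

```python
# Assumed ATX12V ripple/noise limits in mV peak-to-peak; check the spec.
ATX_RIPPLE_LIMIT_MV = {"+12V": 120, "+5V": 50, "+3.3V": 50}


def rails_out_of_spec(measured_mv):
    """Return the rails whose measured ripple exceeds the assumed limit."""
    return [
        rail
        for rail, ripple in measured_mv.items()
        if ripple > ATX_RIPPLE_LIMIT_MV[rail]
    ]


# Hypothetical oscilloscope readings in mV peak-to-peak.
measured = {"+12V": 95, "+5V": 62, "+3.3V": 40}

bad_rails = rails_out_of_spec(measured)
print(bad_rails if bad_rails else "all rails within assumed limits")
```

Without a measurement like this, ripple stays a hypothesis. A rail can also be inside the limit and still be uncomfortably noisy for a part running close to its stability edge.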
 
it's a ripple issue.
You think it's a ripple issue based on nothing as far as I can tell, on three different PSUs that he said he used, no less.

Whatever, I gave my opinion, another thread about a pretty straight forward GPU problem turned into "buy another PSU bro".
 
i doubt it is the deepcool PSU, deepcool is a “true” chinese brand…
(made by Beijing Deepcool industries Co. ltd )

so they are not a garbage PSU manufactured in China for the US… but the gigabyte p650g is made in Vietnam, but is a Taiwan company… but, p650g is a gold rated PSU…

perhaps only buy gold rated PSUs (that cuts down a lot of “bad” PSUs)…
 
Are you undervolting your GPU? Every single crash i've had on my 7900 XTX has been because I pushed my uv too far
sometimes i do, because it crashes more frequently at stock settings

Then it's probably the GPU. When you say "the problem is temporarily fixed by doing X" that doesn't actually mean anything, it just means the problem is still there but it happens intermittently.
yes i think so maybe its a faulty or defective gpu

3 red flags I see:

1: 32GB of randoram. It's probably not stable.

2: Three low quality power supplies. I would bet the ripple is pretty bad, possibly bad enough to cause instabilities.

3: No mention of case setup and air flow. If the PSU is mounted incorrectly it will eventually overheat and make the ripple even worse.
today i checked the brand of the ram, its g.skill ripjaws. my knowledge about brands is not enough so idk if its good or bad

today i borrowed a corsair rm850 80+ gold from a friend and it crashed again, but when i put my own psu back it didn't crash. but i know it will start crashing again in 1-2 days

my case is msi mag vampiric, 2 fans front, 2 fans top and one exhaust. room temperatures are between 17-20 Celsius, humidity 35-45%. when its under stress and not crashing i always check psu exhaust and i can feel the very cool airflow

Are you using Display Port?

I had to get a higher grade DP cable back on the RTX 3070 for the same issue, it would randomly black screen at 170hz at 1440P.
I ended up buying 8K certified DP cables and it's been absolutely solid since and now with my RX 7900 XT too.
i have both hdmi 2.1 and displayport 1.4 both of them tested and still crashing
Any .dmp files in C:\Windows\LiveKernelReports ? If yes, archive and upload them here

there is no bsod or .dmp only kernel power 41 63

get a new SSD, anything 128GB or larger will do.
make a microsoft win 10 or 11 usb stick.
remove ssd, install new ssd, install windows.
same problems?

yes i tried, but the problem is still there

--------------------------------------------------------------------

Things I suspect:

* I noticed that these problems occur more frequently on rainy and windy days. This makes me suspicious of the electricity distribution company in my area.
But I use a surge protected socket and I do not have any problems when I use items at home that require more power than the computer.
Since I don't know much about electricity, I can't prove that it could be caused by this.

* sometimes overclocking vram or cpu resolves the problem until the next day or session, so maybe windows itself is messing with stability

* whenever i reinstall drivers with ddu problem resolves until i turn off / on pc


Conclusion:

I'm really tired of dealing with these problems for such a long time
i wanna build a new pc now. This time i will build it myself because the crashing one was prebuilt

So guys I don't know much about brands so I need your help.

i want Intel-Nvidia based system this time
I can't afford high-end graphics cards like RTX 4090 due to terrible taxes (even an xfx rx 6600 xt is around 1200 US dollars here).

so i can afford a 3080 ti or 4070 or 4060 with a little bit of extra work, not sure which one would be worth the money idk

and also i need to choose the best cpu, ram kit, motherboard and psu for one of those gpus

any advice will be appreciated

thank you so much for the help & effort guys
 
As a last ditch effort, go buy some 8K certified HDMI / DP from KableDirect before you splash out more cash than you want to.
 
@Atlas39
are the top fans set to blow in, or out?
would have them go out for a try just to have it done.

maybe look at getting even a small UPS, preferred one with AVR, so it can smooth out any fluctuation you have on incoming power,
usually will also help (psu) lifetime a bit.

are you on the latest bios?
would be worth a try.
do not use any perf tweaks/oc in the bios settings until we know what causes the issue.
on ryzen, settings for ram can be set manually (clocks/voltage), and better than using some preset.

we need some location info, if you want proper recommendations it can help point to places where to get parts,
or at least be able to compare pricing.

i wouldn't get all new stuff all at once, switch out parts and by chance you might find the (hw) cause,
then you could build a new one if you like, and redo the "old" one for selling or as a second pc for your home/family.

start with the ram, your board can do 3600, this would allow for the IF to run at optimal speed and get you a little boost.
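The arithmetic behind the 3600 suggestion, as a minimal sketch: DDR memory transfers twice per clock, so the memory clock is half the rated transfer rate, and on Ryzen 3000 the Infinity Fabric clock (FCLK) is normally run 1:1 with it. The 1:1 coupling is the assumption here; whether a particular CPU holds 1800 MHz FCLK stably varies from chip to chip.

```python
def memory_clock_mhz(ddr_rate_mts):
    """DDR transfers data twice per clock, so MCLK is half the rated MT/s."""
    return ddr_rate_mts / 2


def fclk_for_one_to_one(ddr_rate_mts):
    """Infinity Fabric clock needed to stay 1:1 with the memory clock."""
    return memory_clock_mhz(ddr_rate_mts)


for rate in (3200, 3600):
    print(f"DDR4-{rate}: MCLK {memory_clock_mhz(rate):.0f} MHz, "
          f"1:1 FCLK {fclk_for_one_to_one(rate):.0f} MHz")
```

So the OP's current 3200 kit runs the fabric at 1600 MHz, and a 3600 kit would raise it to 1800 MHz if the CPU and board can hold it.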

can you try the gigabyte/deepcool psu in a different pc, and see if it works
(leave it outside the case, just connect all cables for a day or two of testing)

next i would look for a different card from a friend to test, just to rule it out.
 
2 of my friends also experienced this issue, they were both running 6600 XT just like you, both running quality PSUs (Seasonic Focus GX, CM MWE V2). The setting which eventually fixed the issue for them was disabling ULPS through afterburner.
1. Click the settings icon in afterburner

Screenshot_1.png

2. Under the AMD compatibility properties find and enable "Disable ULPS" Setting

Screenshot_2.png

3. Restart PC
Hopefully this works for you also
EDIT: I believe you have to do this after every driver update, I have an Nvidia card so you have to verify yourself.
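To verify whether the setting stuck after a driver update, the current state can be read back from the registry. This is a read-only sketch under two assumptions: that the AMD driver stores the setting as an EnableUlps value in the numbered subkeys of the display adapter class key, and that this is the same value Afterburner toggles. It changes nothing, and on a non-Windows machine it just returns an empty result.

```python
import sys

# Display adapter device class key. EnableUlps is assumed to live in its
# numbered subkeys (0000, 0001, ...) on systems with AMD drivers installed.
DISPLAY_CLASS_KEY = (
    r"SYSTEM\CurrentControlSet\Control\Class"
    r"\{4d36e968-e325-11ce-bfc1-08002be10318}"
)


def describe_ulps(value):
    """Turn a raw EnableUlps value (or None) into a readable state."""
    if value is None:
        return "EnableUlps not present"
    return "ULPS enabled" if value else "ULPS disabled"


def read_ulps_values():
    """Return {subkey name: EnableUlps value}; empty when not on Windows."""
    if sys.platform != "win32":
        return {}
    import winreg  # Windows-only standard library module

    results = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISPLAY_CLASS_KEY) as class_key:
        index = 0
        while True:
            try:
                name = winreg.EnumKey(class_key, index)
            except OSError:
                break  # no more subkeys
            index += 1
            try:
                with winreg.OpenKey(class_key, name) as adapter_key:
                    value, _value_type = winreg.QueryValueEx(adapter_key, "EnableUlps")
                    results[name] = value
            except OSError:
                continue  # value missing or subkey not readable
    return results


for subkey, value in read_ulps_values().items():
    print(f"{subkey}: {describe_ulps(value)}")
```

If a subkey reports ULPS enabled again after a driver update, that would match the note about having to redo the setting.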
 
You think it's a ripple issue based on nothing as far as I can tell, on three different PSUs that he said he used, no less.

Whatever, I gave my opinion, another thread about a pretty straight forward GPU problem turned into "buy another PSU bro".
My 7900XTX was tripping my EVGA 1300w. OCP will trip if the spike is hard enough and the unit can't handle it. It's about QUALITY of the unit which is none of what OP was testing with.
 