
Do I Need to RMA My Card?

Hello. This is an issue I've been troubleshooting for a couple of months now. I have in-game lag, but my FPS never drops according to FPS monitors. In multiple games, my frames are not smooth and the game freezes for multiple seconds. It doesn't crash most of the time, although it did last night. This started 3 months ago after moving. I assumed it was an internet issue at first since I changed ISPs. After some technical support, I was told I should RMA my GPU. I want to make sure the GPU is the actual issue and not another piece of hardware or a weird driver thing.

Here is some system/error info:
AMD Ryzen 5 3600X
MSI RTX 2080 SUPER VENTUS XS
2x8GB DDR4 3600
ASUS TUF GAMING X570-PLUS (WI-FI)

Temps on everything are fine, checked with HWInfo, HWMonitor, and Afterburner.

I'll pastebin 3 sets of DXDiag errors. There are three sets due to the issue coming back after reinstalling Windows.
Highlights are: LiveKernel 141, LiveKernel 144, RADAR_PRE_LEAK_64

Things I've done:
Reinstalled Windows multiple times
Multiple DDU uninstalls with NVcleaninstall of current drivers with only essentials
MemTest86 Ram multiple times, passed fine (haven't tried with single ram sticks at a time, but passes when both are installed)
Ran Memdiagnostic, passed
Multiple runs of UserBenchmark with no crash https://www.userbenchmark.com/UserRun/42161578
Multiple runs of 3DMark with no crash
Multiple runs of Furmark with no crash
Multiple runs of Heaven with no crash
Tried both PCIe slots

In WoW, I sometimes freeze during fights for 1-3 seconds.
In Apex, frames are not smooth the whole game and sometimes, like last night, I crash out of the game and have to restart to make it work again.

I haven't tried lowering my core clock or memory clock yet. I heard enabling debug in NVIDIA Control Panel does this. Was going to check that box and try it.

Any ideas? Hoping I don't have to RMA and be without my PC for work for possibly 2 months.
 
Short of trying it in another PC, yeah RMA the card
 
Yeah, I really wish I had access to another PC or another GPU to test. Covid + moving to a new state kinda botched that though.
 
1. Update your motherboard BIOS. The new BIOS is supposed to fix the USB issue with the X570 chipset, which is perhaps what you are experiencing.
2. Disconnect all cables, including the power and SATA cables, and reconnect them. Make sure you are using 2 separate PCIe power cables to your GPU instead of 1.
3. Maybe try a -50 MHz core clock offset first. If the FPS stutter is still there, maybe you need to change the TIM on the GPU. Research on YouTube how to change the TIM on your video card.
 
I have updated BIOS and tried the last 3 versions just in case.
I've disconnected and reconnected all cables, including trying different ports on my modular power supply.
All tests have had good GPU temp, stable GPU usage/core clock/memory clock, power, and voltage (in Afterburner and HWInfo at least).

Using debug mode, which I've been told removes the factory OC, does seem to make things smoother. I have only just started testing it though.
By TIM do you mean timeout or thermal paste? For the former, I've already tried increasing the TDR to 8 like many forums suggest. Didn't help. Thermal paste I haven't touched since my temps are typically 68-70 on the GPU under max load.
 
What is your PSU?
Also, you can try going into the BIOS and disabling AMD Cool'n'Quiet.

You can check in HWInfo for Memory Junction temperature and Hotspot temperature.
Usually Hotspot temp should be 10-20C higher than GPU temperature. If it is much higher, then the thermal paste is not covering the whole GPU die, causing part of the silicon to overheat.
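
As a rough sketch of the rule of thumb above (hotspot roughly 10-20C above GPU temperature), the comparison can be written in a few lines of Python. The function name, the 20C threshold parameter, and the sample readings are illustrative assumptions; HWInfo does not provide this check itself, so the two temperatures would be read off the sensor panel by hand.

```python
def hotspot_delta_status(gpu_temp_c, hotspot_temp_c, normal_max_delta_c=20.0):
    """Compare the hotspot reading against the GPU reading.

    normal_max_delta_c is the upper end of the 10-20C rule of thumb;
    it is an assumed threshold, not a vendor specification.
    """
    delta = hotspot_temp_c - gpu_temp_c
    if delta <= normal_max_delta_c:
        return delta, "within the usual range"
    return delta, "higher than usual; paste coverage may be worth checking"


# Sample readings (hypothetical): 75C GPU, 90C hotspot
delta, status = hotspot_delta_status(75.0, 90.0)
print(f"Hotspot is {delta:.0f}C above GPU temp: {status}")
```

A delta inside the range does not rule out other causes of stutter; it only suggests the paste is probably not the first thing to blame.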
 
PSU is a Corsair RM 750x.
I'll check on coolnquiet.

During these past 2 months HWInfo has shown a max of 15c higher in hotspot. So when my gpu hits around 75c max in benchmarking, hot spot is around 90c. I bumped my fans up a while ago so that it doesn't get any higher than that.

I have reverted back to a January release of GPU drivers + a slightly older beta bios. It seems to have helped a bit. I'm getting a slightly higher boost clock sometimes too. Requires more testing.
 
So I was able to borrow a friend's 3070 FE. Did a DDU and fresh driver install. I've tried the most up-to-date drivers and old drivers dating all the way back to December. I also uninstalled the Windows 10 update that Nvidia and Microsoft said was causing issues. My frametimes are still all over the place, causing issues just like with my 2080 Super.
 

Attachment: frametime.png
Well I would think the SSD might be the cause of all this. Do you have another drive that you can install win10 on and test on that drive instead?
 
Right now I have Win10 on an M.2, games on a separate SSD. I guess I can reinstall windows on each one, and only use 1 at a time.

I saw in another thread that some people had luck with unplugging their front USB IO. I tried that - unplugged USB 2.0 header, 3.0 header, and HD Audio and now 4 of my fans won't turn on. Real weird. I didn't think my PSU was an issue since my GPU voltages seem correct and steady.
 
It could be the RAM overclock is unstable. I was having some strange issues that went away after dropping down to 3533 MHz.
 
That'd be weird. This issue only started a couple months ago, with the same components that had been working fine for a year.

Well I would think the SSD might be the cause of all this. Do you have another drive that you can install win10 on and test on that drive instead?
Looks like that was the final piece of the puzzle. RMA'd my card since it was having multiple display errors, changed a power adapter, and the final stutter was one of the SSDs. Good call. Thank you.
 
So no sign of any fps drop in any software but still lagging? Doesn't seem like a GPU problem to me.
 
Yeah same issue as the very beginning, no fps drop but frametime issue that shows as in game stutter. Do MMOs have spikey frametimes?
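
To illustrate why an FPS counter can look steady while the game visibly hitches, here is a minimal Python sketch that summarizes a list of per-frame times. The function name and the sample data are made up for illustration; real frametimes would come from whatever capture or overlay tool is in use, and the "1% low" here is just one common way to define it (the average of the slowest 1% of frames).

```python
def frametime_summary(frametimes_ms):
    """Summarize per-frame render times given in milliseconds."""
    if not frametimes_ms:
        raise ValueError("need at least one frametime")
    n = len(frametimes_ms)
    total_ms = sum(frametimes_ms)
    # Average FPS over the whole capture: frames divided by elapsed seconds.
    avg_fps = 1000.0 * n / total_ms
    # Slowest 1% of frames (at least one frame), averaged, expressed as FPS.
    slowest = sorted(frametimes_ms, reverse=True)[: max(1, n // 100)]
    one_percent_low_fps = 1000.0 / (sum(slowest) / len(slowest))
    return {
        "avg_fps": avg_fps,
        "one_percent_low_fps": one_percent_low_fps,
        "max_frametime_ms": max(frametimes_ms),
    }


# Hypothetical one-second capture: 112 smooth 8 ms frames plus one 104 ms hitch.
sample = [8.0] * 112 + [104.0]
stats = frametime_summary(sample)
print(stats["avg_fps"])           # 113.0, which still looks healthy
print(stats["max_frametime_ms"])  # 104.0, which is a visible stutter
```

In this made-up capture the average only falls from 125 to 113 FPS, so a counter barely moves, while the single 104 ms frame is what the player actually feels.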

As I have gone deeper into this over the last 2 months, it really looks like it is a driver thing since the 30 series came out. There are SO many people with these issues.

The latest reinstall of windows I did, after removing the problematic ssd, did have a hiccup at first. When I first rebooted after replacing the psu cable and removing the ssd, it didn’t boot and had a mobo light for CPU. I restarted and it had a mobo light for RAM. I restarted with a single stick and it started fine. I switched sticks and it restarted fine again. Put both sticks in and it restarted fine. So that was confusing.

Testing it some more in Apex today.
Consistent GPU temp, Usage, VRAM, core/memory clock, power, CPU use, RAM used, Framerate, and frametime.

But consistent stutter when moving the camera around/quickly and even some 1 second frame freezes. I even died turning to shoot someone, froze for a second and then just dead. All the while, all the stuff monitored above was consistent.

That's why I feel like it has to be a driver thing? Seems like a lot of people are having issues with nvidia drivers.

That shouldn't be a mobo/ram/PSU thing, right?
 