
Bug affecting all Nvidia GPUs - Nvidia won't respond - we need your help!

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
So, I've found a bug, I've reported it to nvidia, and it landed with a helpdesk noob who didn't understand it, and is now stuck in his queue as he's gotten butthurt and refuses to look at it. Yes, seriously.

We're going to need the community, to force nvidia to pay attention to this intentionally 'lost' case. Sadly, the community seems not to actually care..... How about techpowerup? Little help?

Here's a link to an illustration of the bug in effect and steps to (visibly) reproduce it yourself:
*EDIT: There is one very important detail missing from this link: The card performing the tests attached, is a 3090. This is relevant when combined with the suggested resolution to observe the issue, because the intention is to produce an extreme framerate >250FPS, so if you do this test with a different card a lower resolution will be required.

You can create your own case with nvidia, or if you like you can tell them to look at mine. Same name as here, they can find it.

Thanks in advance for your help.
 

rtwjunkie

PC Gaming Enthusiast
Supporter
Joined
Jul 25, 2008
Messages
13,848 (2.91/day)
Location
Louisiana -Laissez les bons temps rouler!
System Name Bayou Phantom
Processor Core i7-8700k 4.4Ghz @ 1.18v
Motherboard ASRock Z390 Phantom Gaming 6
Cooling All air: 2x140mm Fractal exhaust; 3x 140mm Cougar Intake; Enermax T40F Black CPU cooler
Memory 2x 16GB Mushkin Redline DDR-4 3200
Video Card(s) EVGA RTX 2080 Ti Xc
Storage 1x 500 MX500 SSD; 2x 6TB WD Black; 1x 4TB WD Black; 1x400GB VelRptr; 1x 4TB WD Blue storage (eSATA)
Display(s) HP 27q 27" IPS @ 2560 x 1440
Case Fractal Design Define R4 Black w/Titanium front -windowed
Audio Device(s) Soundblaster Z
Power Supply Seasonic X-850
Mouse Coolermaster Sentinel III (large palm grip!)
Keyboard Logitech G610 Orion mechanical (Cherry Brown switches)
Software Windows 10 Pro 64-bit (Start10 & Fences 3.0 installed)
Sorry man, I’m not having any trouble and I definitely don’t see legions of people here complaining about mysterious Nvidia problems you don’t truly identify.

If you are too lazy to identify the problem in writing then I’m too lazy to decipher your image.
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
You get a better response if you post this in the Reddit/Nvidia forums.
Went to the nvidia forums after two weeks of no response from tech support, no response there, went to reddit a week later, posts deleted, a week later I'm here.

Sorry man, I’m not having any trouble and I definitely don’t see legions of people here complaining about mysterious Nvidia problems you don’t truly identify.

If you are too lazy to identify the problem in writing then I’m too lazy to decipher your image.
Follow the link by clicking the image. The one I labelled "Here's a link to an illustration of the bug in effect and steps to (visibly) reproduce it yourself: "
 
Joined
Nov 8, 2020
Messages
214 (0.82/day)
System Name Dusty
Processor 8700k - delid - 5Ghz
Motherboard ASUS Maximus X Hero
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
Went to the nvidia forums after two weeks of no response from tech support, no response there, went to reddit a week later, posts deleted, a week later I'm here.


Follow the link by clicking the image. The one I labelled "Here's a link to an illustration of the bug in effect and steps to (visibly) reproduce it yourself: "

I did, and I found no problems at all. On two cards, the 3070 and the 1050ti in my laptop.
Though as it mentions, it only occurs in some monitoring software so the question is then, is the issue rather related to their implementation being less than optimal than a horrifying bug?
Either way, no problems here.

The picture itself is pretty much worthless either way without better resolution on the scale, there are always variances in framerate and frametimes. But I found no variances that occur regularly, as they would in that case.
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,082 (4.90/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
The picture itself is pretty much worthless either way without better resolution on the scale, there are always variances in framerate and frametimes. But I found no variances that occur regularly, as they would in that case.
And polling the GPU will cause data to be sent over the PCI-E bus, which can cause a very minor frametime spike. Sometimes this can't be avoided and it's usually so small it won't be noticeable(I know I've never noticed it).
 

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
5,623 (1.11/day)
Location
Nelson B.C. Canada
System Name Blacknet
Processor E5-1680v2 Xeon
Motherboard Asus P9X79 Pro
Cooling Noctua NH-D14/7case fans
Memory 32gb Gskill 1866 Cas9
Video Card(s) EVGA FTW GTX 980 Ti ACX 2.0+
Storage Toshiba 3TB, x300 Toshiba 5TB, 2x EVO 850 250GB, 2x EVO 860 500GB, LG 14x Blu-Ray Rewriter
Display(s) 24" LG 24GL600F 144HZ, 23" Asus VZ239H IPS
Case Antec 1200
Audio Device(s) Asus Xonar MKII+ AKG Q701 Studio Monitors
Power Supply XFX XTR 750 Gold
Mouse Logitech G900 Chaos Spectrum
Keyboard Ducky One 2 RGB
Software Kubuntu 21.04
Benchmark Scores It's linux baby!
I'll call this close to flamebait, but maybe he really believes it. I wouldn't put much credence in this.
 

Solaris17

Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
22,341 (3.84/day)
Location
Florida
System Name Venslar
Processor I9 7980XE
Motherboard MSI x299 Tomahawk Arctic
Cooling EK Custom
Memory 32GB Corsair DDR4 3000mhz
Video Card(s) Nvidia Titan RTX
Storage 1x 250GB 960 EVO | 1x 500gb Intel 720p | 32TB SAN
Display(s) 3x AOC Q2577PWQ (2k IPS)
Case Inwin 303 White (Thermaltake Ring 120mm Purple accent)
Audio Device(s) Schiit Fulla 3 on Beyerdynamic DT 990 Pros
Power Supply Seasonic 1050W Snow
Mouse M55 RGB PRO (White)
Keyboard Ducky Shine 6 Snow White
VR HMD HTC VIVE
Software Windows 10 x64 Pro
And polling the GPU will cause data to be sent over the PCI-E bus

further polling most things increases load in some way. try spamming the shit out of like a thermistor on an I2C bus.

measure 0 or the temperature of the sun.
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
Sorry man, I’m not having any trouble and I definitely don’t see legions of people here complaining about mysterious Nvidia problems you don’t truly identify.

If you are too lazy to identify the problem in writing then I’m too lazy to decipher your image.
Regarding this: It's not something you'll notice unless you're seriously digging deep to tune your performance, or if you are running really strange loads in really unusual ways (see: the process to reproduce the bug. Who plays games at 900p on a 3090? These kinds of strange conditions are what's required to make this visible to the naked eye). Default settings such as pre-rendering queue depths will ensure that this bug is hidden from view, but it is still impacting your performance. It's just that instead of stuttering in a visual way you see on-screen or in a graph, it stutters elsewhere in your system, like keyboard inputs or network traffic or something fun.

This is why, even though it's affecting literally every card that's been tested, I'm the only one (as far as I know) who's noticed it. It's not obvious, to put it lightly. At least, not under normal conditions. I personally noticed it because I was messing with some frame synchronisation that required millisecond-accurate, extremely low frametimes with a single frame pre-render. I've given instructions that will reproduce it reliably in a way that's easy to see on a frametime plot.


I did, and I found no problems at all. On two cards, the 3070 and the 1050ti in my laptop.
Though as it mentions, it only occurs in some monitoring software so the question is then, is the issue rather related to their implementation being less than optimal than a horrifying bug?
Either way, no problems here.

The picture itself is pretty much worthless either way without better resolution on the scale, there are always variances in framerate and frametimes. But I found no variances that occur regularly, as they would in that case.
You're the first of 20 PCs not to see any issue, but the rest of your post makes me wonder if your test platform is valid. You say "there are always variances in frametimes" but take a look at my graph on the right. As explained in the text there, I used a frametime limiter to accentuate this. You may want to as well, but it isn't needed to observe this fault (you'll need a sharper eye though). Of course, this demonstration assumes you can maintain stable frametimes in the first place; obviously we can't test a frametime-related issue otherwise.

The graph I've shown is more than enough to illustrate the issue even with that resolution - because the issue is so blatantly apparent. I can grab you higher res images if you like though.

The question regarding the monitoring apps is valid. I can see in traces that a specific Nvidia API call is the one taking an exceedingly long time, and because this is not unique to a specific app, I'm going upstream to the first common point. If there's a faulty API implementation then nvidia will want to issue an advisory to developers as such.
And polling the GPU will cause data to be sent over the PCI-E bus, which can cause a very minor frametime spike. Sometimes this can't be avoided and it's usually so small it won't be noticeable(I know I've never noticed it).
Which this isn't, as traces will show. The Nvidia techs will get all that, just as soon as they actually look at this.

I'll call this close to flamebait, but maybe he really believes it. I wouldn't put much credence in this.
Test it as described and you will believe it too. You really think I've spent the past month having people call me a liar because they wouldn't even look, for my benefit? The only person getting flamed over this, is me.

Edit: Your signature applies here.
further polling most things increases load in some way. try spamming the shit out of like a thermistor on an I2C bus.
I can slow polling to every 10 seconds and it will still spike. I can copy every frame down the PCI bus and back up again every time and not generate enough load to even reach 1/10th of this spike. This isn't excessive bus traffic or normal behaviour when polling.
 
Joined
Nov 8, 2020
Messages
214 (0.82/day)
System Name Dusty
Processor 8700k - delid - 5Ghz
Motherboard ASUS Maximus X Hero
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
frametime.PNG


Hwinfo running and monitoring everything, horrible frametimes for sure.

Spike was a printscreen which I later realized was only for heaven so I had to snip it.
Smaller variances are of no concern considering all the stuff I got running in the background but would you look at that. No regular issues at all.

Point is, the problem might exist but I get the feeling you're overstating its severity.
 
Joined
Nov 11, 2016
Messages
1,677 (0.98/day)
System Name The de-ploughminator
Processor I7 9900K @ 5.1Ghz
Motherboard Gigabyte Z370 Gaming 5
Cooling Custom Watercooling
Memory 4x8GB G.Skill Trident Neo 3600mhz 15-15-15-30
Video Card(s) RTX 3090 + Bitspower WB
Storage Plextor 512GB nvme SSD
Display(s) LG OLED CX48"
Case Lian Li 011D Dynamic
Audio Device(s) Creative AE-5
Power Supply Corsair RM1000
Mouse Razor Viper Ultimate
Keyboard Corsair K75
Software Win10
Easy solution to easy problem, just set max FPS, solves 99% of all frametime issues.
When the GPU pipeline is getting 100% hammered, any polling will cause slight stutter, even when moving your mouse. Nvidia already knew about this, that's why they created the Reflex API, which basically limits the GPU pipeline to 98% load, leaving the last 2% for mouse input latency reduction or hardware polling.
Another solution is using "Prefer Maximum performance" in the Power Management Mode in NVCP, which keeps GPU clocks high so that the GPU pipeline is free.
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
View attachment 204532

Hwinfo running and monitoring everything, horrible frametimes for sure.

Spike was a printscreen which I later realized was only for heaven so I had to snip it.
Smaller variances are of no concern considering all the stuff I got running in the background but would you look at that. No regular issues at all.

Point is, the problem might exist but I get the feeling you're overstating its severity.
If you want to reproduce the fault in a manner you can easily view in a frametime graph, please follow my instructions. If you follow some other process, as you have, I can't guarantee that it will work.

Easy solution to easy problem, just set max FPS, solves 99% of all frametime issues.
When the GPU pipeline is getting 100% hammered, any polling will cause slight stutter, even when moving your mouse. Nvidia already knew about this, that's why they created the Reflex API, which basically limits the GPU pipeline to 98% load, leaving the last 2% for mouse input latency reduction or hardware polling.
Another solution is using "Prefer Maximum performance" in the Power Management Mode in NVCP, which keeps GPU clocks high so that the GPU pipeline is free.
The frametimes are just a symptom of the issue. The aim here is not to achieve stable frametimes, it is to fix the driver. I don't have any desire to sweep this under the rug.

The GPU pipeline is not getting 100% hammered. In my process you will find it at 17%. Polling the GPU does not cause stutter in other scenarios. This isn't a utilisation issue.
 
Joined
Nov 8, 2020
Messages
214 (0.82/day)
System Name Dusty
Processor 8700k - delid - 5Ghz
Motherboard ASUS Maximus X Hero
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
If you want to reproduce the fault in a manner you can easily view in a frametime graph, please follow my instructions. If you follow some other process, as you have, I can't guarantee that it will work.
I did follow your process, but at this point I'm starting to think it won't matter what anyone does because you will refuse to believe any of it.
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
Guys, please. I've been trying to fix YOUR GPU for the past month. Do me a favour: If you don't want to perform my test, don't. If you don't want to contact nvidia, don't. But please, pretty please, I am SO tired of explaining the same things over and over. I've been through all of this on several forums now and it's the same every time... If you're not going to do my test as instructed, and if you're not a developer who would understand it anyway, and if you're not willing to call nvidia regardless.... please, just step away. That's all I ask of you. Thanks.

Edit: Don't get me wrong, I'm down to spend all day explaining it to people who want to understand, I just have zero inclination toward arguments.
I did follow your process, but at this point I'm starting to think it won't matter what anyone does because you will refuse to believe any of it.
No, you didn't. You prove it in your screenshot.
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
22,491 (3.58/day)
Processor Core i7-8700K
Memory 32 GB
Video Card(s) RTX 3080
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
Gpuz not affected?
 
Joined
Nov 8, 2020
Messages
214 (0.82/day)
System Name Dusty
Processor 8700k - delid - 5Ghz
Motherboard ASUS Maximus X Hero
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
No, you didn't. You prove it in your screenshot.
Yes, I did.
First you complain I didn't limit it, even though your post says that would not be needed but makes it easier. Even then I had no issues.
So then I did it again, this time with a limiter as you suggested, and that's what I got, and that's still apparently not correct.
A nice smooth flat frametime graph, even when running hwinfo, which as you suggested would show frametime spikes.

As you say in the post yourself
Here, Heaven is configured to run lowest settings and 1600x900 resolution, to ensure a very high frame rate/low frame times, which will ensure that the fault is easily visible in the graph. The fault continues at any resolution or framerate. In this case, I have applied a frametime limiter, in order to get a flat line which will accentuate the frametime spike from the glitch. You do not need to use a frametime limiter, but it will make the bug more apparent in the graph.

Which is exactly what I did with in this case Hwinfo64 running on all sensors which according to you, should cause these regular spikes in frametimes.

Otherwise it's you who has not explained it properly, because I see no issues at all running Hwinfo64 while I run Heaven.

Else, please confirm that hwinfo64 + Heaven + Limiter should not produce spikes? Because your post says otherwise claiming that no matter what framerate or what resolution the spikes will occur.
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
No, you didn't. You prove it in your screenshot.
F*** my apologies, there is a minor omission in that doc (I've pasted the wrong draft like an idiot) that has a major impact. I mentioned the resolution but not that these tests were done on a 3090 (I mention it earlier in the thread but not on that page). If your card is weaker than that, then since the process doesn't specify the card like I thought it did, it's possible you followed it as written and still missed the bug. Mea culpa. I blew it.

I did however mention you're going to need extremely high frame rates and 120 ain't that. Try to double or triple it by reducing the resolution.
Otherwise it's you who has not explained it properly

It's exactly that, I am sorry. I've had SO many people fail to follow the process (usually followed by abusing me which is great fun) I tarred you with their brush. Jerk move. My bad.

Else, please confirm that hwinfo64 + Heaven + Limiter should not produce spikes? Because your post says otherwise claiming that no matter what framerate or what resolution the spikes will occur.
The delays will occur but you may not see them. Since I've wasted a ton of your time I owe you at least a proper explanation. I'll try and keep it plain-english-y.

What happens here is that the monitoring call, which should be done in microseconds, takes several milliseconds. This is CPU time, not GPU time. FWIW, an earlier experiment showed me that this is extremely memory-speed critical. Taking my memory down from the usual 3800 flat 16s to 2133 with stock timings made this issue extremely drastic and noticeable. Delays in the memory pipeline appear as delays in the CPU pipeline at a higher level of monitoring (because it's the CPU that's waiting on the data from RAM). So, given that at this point, the frame is being rendered by the CPU in order to take its place at the end of a queue behind two other frames which have to be processed and displayed before the one that was delayed, that delay is eaten up by the buffer and you don't see it - BUT IT STILL ATE YOUR CPU IN THERE. IT JUST HID THE EVIDENCE. < Caps because this is super important otherwise it would be a non-issue, right?

So, we have the need to force a CPU-limited scenario, in order to see the CPU's behaviour. And we want to avoid a pre-rendering queue hiding the mess, right? So how? Frames, all of the frames. By reaching a massive framerate we have ensured the GPU is lightly loaded (or it wouldn't be able to get those frames - we're not trying to induce a GPU-limited scenario here, so not too high!) and will attempt to render the frames at full tilt, thus loading up the CPU by emptying the prerender queue quickly, the loaded CPU exacerbating the issue, and the short queue exposing it.
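
The explanation above can be sketched as a toy simulation. This is only an illustrative model, not a trace from a real system or anything from the driver: the function name `simulate`, the two-stage CPU/GPU pipeline, and every number in it (2 ms or 3 ms of CPU work, 8 ms or 1 ms of GPU work, a queue depth of 2, a 6 ms stall) are assumptions picked to make the effect easy to see.

```python
def simulate(cpu_ms, gpu_ms, queue_depth, stall_ms, stall_frame, frames=30):
    """Toy two-stage pipeline: the CPU prepares frames, the GPU renders them.

    The CPU may run at most `queue_depth` frames ahead of the GPU.
    One frame (`stall_frame`) costs an extra `stall_ms` of CPU time,
    standing in for a slow monitoring call (hypothetical).
    Returns the displayed frametimes in ms (gaps between GPU completions).
    """
    cpu_done, gpu_done = [], []
    for i in range(frames):
        prev_cpu = cpu_done[i - 1] if i > 0 else 0.0
        # The CPU can only start frame i once a queue slot has been freed.
        j = i - 1 - queue_depth
        slot_free = gpu_done[j] if j >= 0 else 0.0
        start = max(prev_cpu, slot_free)
        cost = cpu_ms + (stall_ms if i == stall_frame else 0.0)
        cpu_done.append(start + cost)
        # The GPU starts frame i when it is idle and the frame is ready.
        prev_gpu = gpu_done[i - 1] if i > 0 else 0.0
        gpu_done.append(max(prev_gpu, cpu_done[i]) + gpu_ms)
    return [gpu_done[i] - gpu_done[i - 1] for i in range(1, frames)]


if __name__ == "__main__":
    # GPU-limited: the queue has slack, so the stall never reaches the screen.
    gpu_limited = simulate(cpu_ms=2.0, gpu_ms=8.0, queue_depth=2,
                           stall_ms=6.0, stall_frame=10)
    # CPU-limited (very high framerate): the same stall shows up as a spike.
    cpu_limited = simulate(cpu_ms=3.0, gpu_ms=1.0, queue_depth=2,
                           stall_ms=6.0, stall_frame=10)
    print("GPU-limited worst frametime (ms):", max(gpu_limited))
    print("CPU-limited worst frametime (ms):", max(cpu_limited))
```

In the GPU-limited run the stalled frame is still finished before the GPU needs it, so the frametime plot stays flat even though the CPU lost the time. In the CPU-limited run the GPU is waiting on every frame, so the extra CPU time lands directly in the frametime.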

SOOOO you need like 250+ FPS to see it. 300+ is recommended. The more, the better - so long as the system can realistically handle that load. This is why I went to the trouble of specifying 1600x900 in heaven, because at that res, with the 3090, you'll just be able to see it. So, aside from the fact I'd typed that stuff like a dozen times and didn't realise that this time I forgot to mention what card it was like an idiot..... Now you understand why nobody sees it. It's buried and hidden by mechanisms that are supposed to do exactly that, to give us smooth frametimes and high framerates. And this is why I'm being a stickler about following the process (which I screwed up and I apologise again) because if one does not (which sadly appears to be almost everyone almost all of the time) then you very easily end up in a scenario where you are not within the parameters where the bug is visible to you, for example by running 120FPS where your frametimes are too high to see it and too high to get the CPU mad, and you could probably easily run too low res and choke your system entirely and not see anything.

BTW, nvidia's got the right info including the card type and much, much more than I've shared here, so that's not why they can't see it.
 
Joined
Nov 8, 2020
Messages
214 (0.82/day)
System Name Dusty
Processor 8700k - delid - 5Ghz
Motherboard ASUS Maximus X Hero
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
That makes a lot more sense and you should have started with that information!

my 8700k is however unable to push that framerate in Heaven so I can't look in to it at those framerates. What CPU did you use when you tested this? Or other hardware in general for the system?
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
Adding to the above because this has come up before: Not EVERY load makes this happen. I don't know why, I'd like to ask nvidia. I can tell you a real-world load that does: Battlefield 1. That's the game that made me notice this. But trying some other 300FPS load won't lead to "your bug doesn't exist" because I chose heaven because I know it's repeatable there. Some other load may not be. Even using BF1 as an example, it causes the problem, but the frametimes are too unstable (other than the menus but there's a reason I didn't suggest that) so it's really not useful. Myself and a handful of friends have tested it across a bunch of systems and it always works. That's why I'm specifying heaven, because it's repeatable. I did try other loads (mostly benchmarks because this is for reproduction at the lab and they might not have <insert game here>) and heaven was the best one.
That makes a lot more sense and you should have started with that information!

my 8700k is however unable to push that framerate in Heaven so I can't look in to it at those framerates. What CPU did you use when you tested this? Or other hardware in general for the system?
I should have! I thought I did! I'm honestly so sorry man. I typed it so so many times having my posts deleted and trying different places, I just thought it was in there like usual, and clearly, it isn't. I blew it.

Yeh again it's a hard bug to reproduce for many reasons and yeh one is because the hardware requirements are rough. I did manage to get a 1070+5820k to do it but that thing is tuned to the nines (it's my old gaming rig). A mate did it on a 2070 (sorry I don't know what CPU it was I wanna say 9900k)... So other GPUs can do it. CPU is probably the tricky one, because it's a matter of getting it to be loaded but not too loaded (as described above). Your 8700 probably can hit some lower-than-250 framerate that will successfully expose the spikes but there's a fair amount of work in finding that (not-so-)sweet spot. This one is a 5900x with chart-topping benchmarks pushing 3800 16-16-16-34 ram and the elusive 3090 and even still it took me months to pin it down.

It really is hard to spot. Honestly that's part of the reason it's a concerning bug, because it's the kind of thing that gets missed, and stays in the drivers forever making a tiny but entirely unnecessary dent in performance. You need such high end hardware to be able to see it, that it's super easy to hide it under the performance of the thing; or you don't have that hardware and you don't ever see it.... This bug is trying hard to last forever.

Awaiting approval before being displayed publicly.
Uhh what?

1624084553571.png


Since it's been suggested a higher resolution might be useful, here it is: 100ms sample rate, vertical scale set to 16.7ms aka 60FPS. This means every horizontal grey line is 1.6ms. Heaven is set to free roam mode so the frametimes should be more stable than usual heaven, however I'm doing it with my browser, discord, etc open so there are a few spikes. The first section is just standing in heaven. Then I alt-tab out and start hwinfo64. Look at those spikes. That's dipping from 450+ FPS to 120. Then I exit hwinfo. Nice and flat again (except for those two spikes, that's discord doing something and beeping at me. I don't think this requires I re-do it, after all it's pretty obvious it's not the same as that middle section.)
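
For anyone who wants to turn the FPS figures in that description into time, here is the arithmetic as a small sketch. The helper names `frametime_ms` and `implied_stall_ms` are made up for this example; the 450 and 120 FPS inputs are the ones quoted above, and the result is only a rough estimate since the real baseline is "450+".

```python
def frametime_ms(fps):
    """Frametime in milliseconds for a given framerate."""
    return 1000.0 / fps


def implied_stall_ms(baseline_fps, dipped_fps):
    """Extra time a single frame must have taken to dip from baseline to dipped FPS."""
    return frametime_ms(dipped_fps) - frametime_ms(baseline_fps)


if __name__ == "__main__":
    print(f"450 FPS frame: {frametime_ms(450):.2f} ms")
    print(f"120 FPS frame: {frametime_ms(120):.2f} ms")
    print(f"implied stall: {implied_stall_ms(450, 120):.2f} ms")
```

A dip from about 450 FPS to 120 FPS works out to roughly 6 ms of extra time on the affected frame, which lines up with the "several milliseconds" figure mentioned earlier in the thread.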

Gpuz not affected?
Awaiting approval before being displayed publicly.
Normally I'd assume it's because I'm new and it was automated and I would have to wait for an admin to see it, but since you've been by already and this has been public already, I'm wondering if the thread was hidden manually?
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
22,491 (3.58/day)
Processor Core i7-8700K
Memory 32 GB
Video Card(s) RTX 3080
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
What happens here is that the monitoring call which should be done in microseconds, takes several milliseconds. This is CPU time, not GPU time.
It's probably waiting for some kind of lock, not uncommon, especially if the I2C bus is involved. Only NVIDIA can fix it, maybe they already have the fix ready and are just waiting for verification, or the right driver release window. Or they are too busy with higher priority issues
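
To show what "waiting for a lock" looks like from the caller's side, here is a generic sketch using Python's standard `threading` module. It has nothing to do with NVAPI or the driver internals, which aren't visible here; `measure_lock_wait`, `slow_holder` and the 20 ms hold time are all hypothetical stand-ins.

```python
import threading
import time


def measure_lock_wait(hold_s=0.02):
    """Return how long (in seconds) a caller blocks on a lock held elsewhere.

    A background thread takes the lock and sits on it for `hold_s` seconds,
    standing in for a slow operation (for example a bus transaction).
    The calling thread then tries to take the same lock and times the wait.
    """
    lock = threading.Lock()
    holding = threading.Event()

    def slow_holder():
        with lock:
            holding.set()       # signal that the lock is now held
            time.sleep(hold_s)  # pretend to do something slow

    worker = threading.Thread(target=slow_holder)
    worker.start()
    holding.wait()              # make sure the worker owns the lock first

    start = time.perf_counter()
    with lock:                  # a cheap call that needs the same lock
        waited = time.perf_counter() - start
    worker.join()
    return waited


if __name__ == "__main__":
    print(f"blocked for {measure_lock_wait() * 1000:.1f} ms")
```

The call that needed the lock does almost no work of its own, yet it takes about as long as the other thread held the lock. That is the general shape of a microsecond call turning into a millisecond call.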

I'm wondering if the thread was hidden manually?
You made some changes to your post which triggered the spam detection for new users, so the thread went to an "approval queue"
 
Joined
Feb 20, 2020
Messages
2,301 (4.39/day)
Location
Texas
System Name Ghetto Rigs x299 & z490 & x99 & Q9550 Old timer
Processor 9940x w/Optimus SigV2 & 10900k w/Optimus Foundation & 5930k w/EK Magnitude & Q9550 w/EK Evo
Motherboard X299 Rampage VI Apex & z490 Maximus XII Apex & x99 Sabertooth & Acer WG43M
Cooling D5 combo/280 GTX/ VRM water block copper/280 GTX/ D5 Top/Optimus sigV2/TitanXp/Mora 360x2
Memory Trident-Z 3600C16 4x8gb & Trident-Z 3600c16 2x8gb & Trident-Z 3200c14 4x8gb & 4x2gb crucial
Video Card(s) Titan Xp & 1080ti ftw3 & evga 980ti gaming & Onboard graphic's need another gpu
Storage 970 evo plus 500gb & 970 evo 500gb many 2.5" ssd's and WD BLK hdd's
Display(s) 1-AOC G2460PG 24"G-Sync 144Hz/ 2nd 1-ASUS VG248QE 24"/ 3rd LG 43" series
Case D450 second floor for 2nd rad x2/ Cherry Entertainment center/ 2 Test benches
Audio Device(s) Built in Realtek x2 with Insignia 2.0 sound bars & 1-LG sound bar
Power Supply EVGA 1200P2 & 1000P2 with APC AX1500 & 850P2 with CyberPower-GX1325U & 1-750P2-extra
Mouse Redragon Perdition x3
Keyboard G910 & G710+x2
Software Win-7 pro x3 and Linux Cinnamon 20.1x2 & win-10 pro x3
Hi,
Is this thread's title accurate, are all GPUs affected, or is just the 30 series affected?
op didn't list all gpu's tested only mentioned 3090.
 
Joined
Feb 3, 2017
Messages
3,040 (1.86/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) Geforce RTX 3070 FE
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
From the description and details - is this a GPU problem or an API/driver problem? Seems to be the latter, maybe. This is admitted to be a CPU-limited scenario, CPU is causing the bump and RAM speed greatly affects things. Also, regular checks that do cause CPU load for monitoring purposes are by themselves unavoidable, 100ms frequency is rather intensive as well.

Configuration of monitoring software surely plays into this as well if this is CPU load. Did you disable any other meters from monitoring and it still happens? Are you graphing on screen?
You mention that this does not happen with all monitoring software, you name Libre Hardware Monitor as one that does not. Is that really the case?
By the way, is the same thing replicable in some other OS, Linux for example?
 

xcasxcursex

New Member
Joined
Jun 19, 2021
Messages
23 (0.59/day)
Hi,
Is this thread's title accurate, are all GPUs affected, or is just the 30 series affected?
op didn't list all gpu's tested only mentioned 3090.
It happens on all cards, but because it's difficult to observe, it's a lot easier to see on the faster cards. As above, CPU actually ends up being important, too, since the GPU could just have the resolution lowered to reach high framerates, but the CPU may not be able to keep up.

From the description and details - is this a GPU problem or an API/driver problem? Seems to be the latter, maybe. This is admitted to be a CPU-limited scenario, CPU is causing the bump and RAM speed greatly affects things. Also, regular checks that do cause CPU load for monitoring purposes are by themselves unavoidable, 100ms frequency is rather intensive as well.

Configuration of monitoring software surely plays into this as well if this is CPU load. Did you disable any other meters from monitoring and it still happens? Are you graphing on screen?
You mention that this does not happen with all monitoring software, you name Libre Hardware Monitor as one that does not. Is that really the case?
By the way, is the same thing replicable in some other OS, Linux for example?
Well analysed, yes, this is an NVAPI issue as best I can tell. It's tough because the apps which are affected are closed source so there's a limit to what I can see. There's inevitably a point of this where my only answers are "I don't know, and I'd like to ask nvidia". You're right, the 100ms is extreme, I only do that to record the images for the purpose of proving this is a thing. Normally it's at default 1000. Note that this is MSI afterburner's poll rate, but the app which is causing the spikes is hwinfo, and the poll rate on that is 1 second (as visible by the giant spike every 1 second in the graphs). I actually tested it at 2, 5 and 10 seconds to see if the spikes disappeared. They didn't change a bit, other than being every 2, 5, or 10 seconds instead of every 1. I still have plenty of CPU, GPU and memory bandwidth available.... And even with other apps polling at far higher rates, there are no issues. It really doesn't suggest any kind of excessive load is to blame here. It does seem like there's some kind of scheduling/handling issue, as W1zzard said, it's probably waiting for a lock.... Honestly if I dig into the traces far enough I might even be able to get that specific, but that kind of work is way into the "that's nvidia's job" territory ;)
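
If you log frametimes to a file, the "spike period follows the poll interval" check described above can be automated along these lines. This is a sketch under assumptions: the data here is synthetic (2 ms frames with an 8 ms frame at each poll), not a capture from the test rig, and `synthetic_trace`, `spike_period_s` and the "3x the median" spike threshold are invented for the example.

```python
from statistics import median


def synthetic_trace(poll_ms, base_ms=2, spike_ms=8, duration_ms=30_000):
    """Build a fake frametime log: steady frames plus one slow frame per poll."""
    t, next_poll, frames = 0, poll_ms, []
    while t < duration_ms:
        if t >= next_poll:
            ft = spike_ms          # the frame that coincides with a poll
            next_poll += poll_ms
        else:
            ft = base_ms
        frames.append(ft)
        t += ft
    return frames


def spike_period_s(frametimes_ms, factor=3.0):
    """Average time in seconds between spikes, or None if fewer than two.

    A spike is any frame longer than `factor` times the median frametime.
    """
    threshold = factor * median(frametimes_ms)
    t, stamps = 0, []
    for ft in frametimes_ms:
        if ft > threshold:
            stamps.append(t)       # timestamp at the start of the slow frame
        t += ft
    gaps = [b - a for a, b in zip(stamps, stamps[1:])]
    if not gaps:
        return None
    return sum(gaps) / len(gaps) / 1000.0


if __name__ == "__main__":
    for poll_s in (1, 2, 5, 10):
        period = spike_period_s(synthetic_trace(poll_s * 1000))
        print(f"poll every {poll_s} s -> spikes every {period} s")
```

On a real log, a spike period that moves in step with the monitoring app's poll interval points at the poll itself rather than at general load, which is the argument being made above.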

It's probably waiting for some kind of lock, not uncommon, especially if the I2C bus is involved. Only NVIDIA can fix it, maybe they already have the fix ready and are just waiting for verification, or the right driver release window. Or they are too busy with higher priority issues


You made some changes to your post which triggered the spam detection for new users, so the thread went to an "approval queue"
Thanks man I thought I got shadowbanned right from the drop, appreciate your explaining what not to do next time haha :)

Sadly, the response from nvidia after some weeks of explaining all that has been said above and much, much more, was the following:


So I tested an in-house PC that has a Win10 X64 +RTX 3080. Ran multiple games at 1080P @1440P without any issues.

I was getting good FPS as well.

I found no reason to test any 3rd party benchmark tests since all games were running perfectly. We use benchmark tools if the PC or GPU has performance issues during the normal usage or while gaming. So it indicates that your's is a singular case and there is a possibility that it's a hardware issue.

If you've read the above, you already understand why his test methodology was entirely inadequate and his conclusions entirely illogical. But the consequence of his inability to cope with this, is that we all are trapped in helpdesk limbo. He would reply but never actually do anything related to the issue just treating it like a normal stuttering complaint. Now they just don't even respond for weeks.
 
Joined
Feb 6, 2021
Messages
614 (3.57/day)
Location
Germany
Processor AMD Ryzen 9 5950X (and a LOT of other CPUs)
Motherboard MSI B550 Tomahawk
Cooling Corsair H150i RGB Pro XT
Memory G.Skill Trident Z Neo 3600 Mhz CL16 (2x16GB)
Video Card(s) a lot... 6900XTs, 3090s,80s,70s and 60s, almost all Turing and Pascal Cards + a few older ones.
Storage Samsung 970 Evo Plus 500GB, 2x Samsung 870 QVO 4TB
Display(s) Samsung Odyssey G7 32"
Case Fractal Design Meshify S2 White TG / Noctua NF-A14 iPPC-3000RPM Fans
Audio Device(s) Bose Companion Series 2 III, Sennheiser GSP600 running through a SoundblasterX G6
Power Supply bequiet! Dark Power Pro 12 1200w Titanium
Mouse Glorious PC Gaming Race Model O, G903 Hero, G502 Hero SE, Deathadder V2, EC2-A
Keyboard ASUS ROG Strix Scope TKL DLX, Corsair K95, Logitech G815
VR HMD Oculus Rift S
I own the whole Ampere lineup except for the new Ti cards and I have zero problems.
 