
NVIDIA Confirms Issues Cropping Up With Turing-based Cards, "It's Not a Broad Issue"

2080Ti are more susceptible to failure than cheaper cards because they're more complex. As I said above, though, that still doesn't excuse poor QC.
I don't think that it's the complexity of the platform that's at issue here. GPUs have been complex for years.
QC may not be entirely to blame either. Quality Control needs time to properly test in all possible scenarios, tossing variables into the mix and observing the results. This takes a lot of time.

With this happening with the 2080s and 2080Ti cards, I'm wondering if they tested as thoroughly as they ~could~ have.
Companies are getting into a rush-to-market mentality akin to throwing products at the wall to see what sticks. They're more than happy to fix the screw-ups, but this comes at the cost of customer inconvenience (and possible downtime).

People are getting pissed off that for such a huge outlay of money, there are any issues at all. There shouldn't be. Lots of folks sell off their existing premium hardware to partially fund buying the shiny, new stuff that we (think we) want. Then, we have nothing to fall back to when unnecessary crap like this crops up.

I'm glad that I didn't buy into 20 series cards yet.
 
I still call that poor QC, even if it isn't the quality department's fault because the big wigs are rushing shit out the door. $120 part or $1200 part doesn't matter, bad is bad... that is if the issue really is as widespread as some make it to be.
 
"crashes, black screens, blue screen of death issues, artifacts" Yep, got all of these plus lockups with driver crashes. This was on an Asus 2080 OC. The card cooked itself at idle; the fans never turned on except when the card was under load. I could have cooked eggs on the backplate. I returned it and I won't be getting another one.

Someone who has enough dollars to afford an RTX 2080, yet not enough sense to know about the "fan off in idle" feature that has existed on graphics cards for years. Amazing.
 
Not even gonna question why the card was running so hot while idle, just call the guy stupid? Cool. I think it's stupid the card was allowed to get so hot while idle in the first place, "feature" or not. Should be some kind of failsafe there.
 
"It's not a broad Issue"


But this is...
[Attached image: broad issue.jpg]
 
I remember a similar response when they 'forgot' to tell their marketing team that the 970 had only 3.5 GB of useful VRAM.

'It's not a big issue, we forgot some insignificant detail, move on pls'

We all know what came next. That said, this is not the same kind of problem of course, but it underlines why you can use a bag of salt when reading this article.

I remember what came next ... a lot of fear mongering and alarmism, but reputable web sites were never able to reproduce the issue without doing some really weird things. Sure, you could create an issue, but it's one of those "well, if I do this list of things in a certain sequence, I can cause an issue" cases, especially at resolutions and settings which are inappropriate for the card(s) in question. And if the 3.5 GB was the problem, why did the 980, which has the full 4 GB, exhibit the same behavior when exposed to the same sequences? NVidia screwed the proverbial pooch from a PR PoV, but that's where the issue ended.

https://www.guru3d.com/news-story/m...rdor-geforce-gtx-970-vram-stress-test,12.html

Thing is, the quantifying fact is that nobody really has massive issues; dozens and dozens of media have tested the card with in-depth reviews like the ones here on my site. As for replicating the stutters and stuff you see in some of the videos, well, to date I have not been able to reproduce them unless you do crazy stuff, and I've been on this all weekend.

Let me clearly state this, the GTX 970 is not an Ultra HD card, it has never been marketed as such and we never recommended even a GTX 980 for Ultra HD gaming either. So if you start looking at that resolution and zoom in, then of course you are bound to run into performance issues, but so does the GTX 980. These cards are still too weak for such a resolution combined with proper image quality settings. Remember, Ultra HD = 4x 1080P. Let me quote myself from my GTX 970 conclusions “it is a little beast for Full HD and WQHD gaming combined with the best image quality settings”, and within that context I really think it is valid to stick to a maximum of 2560x1440 as 1080P and 1440P are the real domain for these cards. Face it, if you planned to game at Ultra HD, you would not buy a GeForce GTX 970.
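
For anyone who wants to check the "Ultra HD = 4x 1080P" arithmetic in the quote above, here is a quick sketch. It assumes the standard 16:9 modes (1920x1080, 2560x1440, 3840x2160); the function names are just for illustration.

```python
# Pixel counts for the resolutions mentioned in the quote.
# Assumption: standard 16:9 modes for each name.
RESOLUTIONS = {
    "1080p (Full HD)": (1920, 1080),
    "1440p (WQHD)": (2560, 1440),
    "2160p (Ultra HD)": (3840, 2160),
}


def pixel_count(width, height):
    """Total pixels the GPU has to render per frame."""
    return width * height


def ratio_to_full_hd(name):
    """How many times more pixels a mode has than 1080p."""
    width, height = RESOLUTIONS[name]
    return pixel_count(width, height) / pixel_count(*RESOLUTIONS["1080p (Full HD)"])


for name, (width, height) in RESOLUTIONS.items():
    print(f"{name}: {pixel_count(width, height):,} pixels, "
          f"{ratio_to_full_hd(name):.2f}x Full HD")
```

So 1440p is a bit under twice the pixel load of 1080p, while Ultra HD is exactly four times, which is the gap the quoted review is pointing at.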

As far as 2xxx series problems, don't know if it's a fixable issue or not ... too early to tell. But here's the deal ... if you want to be a "beta tester" and pay a premium to be the 1st one on your block with the new shiny thing, then take the punches which will come when you make that choice. It happens with early steppings of CPUs, MoBos, GFX cards, SSDs ... you name it. The folks we built for rarely if ever got hammered by any of these because we have always recommended not investing in hardware that isn't a few steppings into production.

a) pre-B3 stepping P67 boards ... Intel chipset fail affected all brands ... industry-wide recall
b) pre-C1 stepping Asus RoG boards (external devices don't wake up from sleep). Asus said tough noogies.
c) EVGA 970 early SC boards ... 1/3 of the heat sink missed the GPU ... EVGA said, yeah, we designed it that way.
d) EVGA early SC / FTW 1060 - 1080 boards ... were going up in smoke because of missing thermal pads. EVGA did the right thing and gave owners thermal kits, but these required 90 minutes of their time and effort to install.
e) MSI tape adhesive on 900 series ... users sometimes damaged fans taking it off because the adhesive was too strong. MSI replaced the cards but owners still had the RMA hassle.

There's a reason it's called the bleeding edge ... early adopters have to expect to take a few punches and will bleed a bit. Most manufacturers address them religiously ... some (e.g., Asus RoG line) often abandon their users (system time freeze bug, audio bugs, sleep bug for example) where they promise an upcoming fix that never arrives. I understand these folks' frustration and wouldn't want to be there. They paid good money ... better said, they overpaid because they were anxious, and they deserved to get a functioning product. No doubt the vendors will make good with RMA replacements if a fix is not forthcoming. But they could have avoided this by making wiser choices. Show patience, lose the need to be the 1st on the block to impress ya friends, and wait a bit ...

a) You'll pay less
b) You will get a product in which bugs in early steppings will be history
c) Your likelihood of having to make repeated TS calls and deal with RMA is significantly less
d) You will likely see performance improvements as more mature production lines have better yields

I am a bit concerned as many users are sitting and waiting, putting off new builds till the 9xxx series CPUs, Z390 boards and RTX cards weed out their bugaboos. Normally, they'd be cutting loose right after the holidays. But on Jan 1, we US folks will see tariffs triple to 30% on electronics, so there's gonna be a huge crush on vendors to keep up with supply after the holidays.
 
Issues can and will happen no matter the price point (of course not ideal, but hey, we don't live in a perfect world). The most important thing is how the company having those issues handles them. As long as NVIDIA provides a decent service for those having issues, there is no point in making a fuss about it.
I agree. There is always going to be a failure rate for any electronic item. Nvidia always stands by their products, so they're not going to give a hard time to anyone who has to RMA a product because they're not happy with it or it's not working properly. If Nvidia was like Apple, denying they have problems at all, it would be a different story. Third-party vendors are also usually good when there are any problems. I went with an Asus 2080 Ti card this time around but probably should have gone with EVGA since I know that they never gave me a problem and always have good customer service.
 

In fact the rabbit hole was a little bit deeper than that.

Far Cry had visible stutter on the 970 and several driver updates were needed to fix that. The stutter was not appearing on any other Maxwell card. Nvidia had to mitigate the effects of the memory setup, obviously, but that needed some tweaking. In SLI, the 970 is also more prone to stuttering than other 'full fat' solutions like the 980 or the 980 Ti. Something's gotta give, and we are now in a period of time where 4GB is the norm rather than the high end. These GPUs get obsolete faster. This is why everyone today will be seen recommending a 980 but not a 970 - the latter simply won't cut it anymore and the large price gap between the two has all but vanished.

Regardless, the point was about trust and business ethics and how that relates to this Nvidia statement, not the end performance of the specific part. And in that aspect Nvidia took a fall with the 970, and rightly so. It was misleading advertising: we thought we got a full fat 256-bit 4GB, and we did not. Countering that with 'but performance was OK' is the weirdest kind of argumentation ever. If we lose a few GB/s on a new to-be-released 1060, we do complain and worry about its impact on performance (check the recent announcement topic on the GDDR5X version of it). And there is an impact, simple enough. Numbers don't lie. Whether or not a driver can mitigate or 'hide' that impact is another discussion entirely; you're still not magically getting those GB/s back.
 

And whose stupidity is the overheating: NVIDIA's, or Asus's? I'mma give you a clue: who manufactured the card? And is this thread about an issue facing that company (in which case the post belongs here), or not (in which case it's just meaningless FUD obfuscating the actual issue)?
 
Whoever loaded it up with that "fans off when idle" feature that evidently doesn't care if the card is cooking... cause idle! I'm guessing that's an oversight over at ASUS. I don't think that's a standard feature.
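
To make the "failsafe" idea from this exchange concrete, here is a rough sketch of how a fan-stop mode could be keyed to temperature instead of load. Everything in it is assumed for illustration: the threshold values, the `fan_duty` function and the ramp shape are hypothetical and are not taken from any ASUS or NVIDIA firmware.

```python
# Hypothetical thresholds, for illustration only (not from any real card).
FAN_START_C = 55   # stopped fans spin up at or above this temperature
FAN_STOP_C = 45    # running fans may stop again below this (hysteresis)
FAILSAFE_C = 85    # at or above this, force full speed no matter what


def fan_duty(temp_c, fans_running):
    """Return fan duty in percent (0-100) from temperature alone.

    Load state is deliberately not an input: a card that is hot at idle
    still gets cooling.
    """
    if temp_c >= FAILSAFE_C:
        return 100
    if temp_c >= FAN_START_C or (fans_running and temp_c >= FAN_STOP_C):
        # Linear ramp from 30% at FAN_STOP_C up to 100% at FAILSAFE_C.
        span = FAILSAFE_C - FAN_STOP_C
        return round(30 + 70 * (temp_c - FAN_STOP_C) / span)
    return 0


for temp, running in [(40, False), (50, False), (50, True), (60, False), (90, False)]:
    print(f"{temp} C, fans running={running}: duty {fan_duty(temp, running)}%")
```

The point of the sketch is only that "fans off at idle" and "card cooking at idle" should not be able to coexist if the decision is driven by temperature; whether the returned card behaved differently because of a defect or a design choice is not something the thread establishes.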
 
People with deep pockets should listen to and follow people with lower budgets more,
because they look more carefully at what they buy: they read more, estimate, search, wait for a product to show its negative sides, etc...
But some people rush in like flies on sh*t, hypnotized by advertising and by the idea that someone will say WOW if they buy everything new.
Sometimes the WOW can turn into a laugh, for example now, when even a perfect product can't justify such prices.
Right now I'm more jealous of someone who paid a lower price for a GTX 1080 Ti than of people who spent $1500 on a premium RTX 2080 Ti such as the Galaxy, K|NGP|N etc...
 
It's a weird situation, but I don't think people will be screwed over the long term. NVIDIA has a good record of fixing their screw-ups, if indeed this is even a screw-up at all.
A wait and see is in order (and being glad that I decided that these new cards were too expensive for me) and they're probably well on the way to a resolution.
 
https://www.gamersnexus.net/industr...2080ti-investigation-1080ti-stock-almost-gone

Talking to all of our board partner contacts off-record, none of them have reported higher RMA requests than normally. We trust our contacts on these and spoke with nearly everyone in the market. The most common RMA reasons haven’t changed from previous generations, and actual RMA rate is exceptionally low right now. Some board partners are at under 0.01%, which is just because these devices are so new that no one has even had a chance to encounter serious problems yet.

The problem seems to me somewhat concentrated to FE RTX 2080 Ti models, but there have been users with Gigabyte and ASUS cards who have come forward with similar issues.

Speaking with two of our SI contacts, we heard similar responses: Neither company has seen abnormal RMAs for these devices.

...

Thus far, our reddit thread has garnered about 5 dead 2080 Ti samples, at time of this video going up.

As I expected: FUD, pure and simple.
 

I had no issues with any 970s on any games including Far Cry ... have two SLI 970 boxes here and many more builds, single and SLI, with no reported problems. I do agree that nVidia screwed the pooch with the PR and the way they handled it; the problem was the performance was too darn close to the 980 and they needed some way to nerf it. Doesn't appear they thought that one thru. It was misleading, but the fact was, any problem you could make on the 970, you could also make with the 980.

Too many folks just don't understand what their utilities are capable of ... they download a utility recommended by someone on the internet and use it without understanding what it does. And yes, the fact remains we still have folks screaming that more VRAM is needed solely because their utility is misinforming them. As Mr. Inigo Montoya said so well in Princess Bride, "I don't think that word means what you think it means".


And no, there is no utility in existence that measures actual VRAM usage; they report what has been allocated, not what is needed ....

https://www.extremetech.com/gaming/...y-x-faces-off-with-nvidias-gtx-980-ti-titan-x

I was once offered as proof the TPU results for the 1060 3GB and 6GB models ... "see it's 6% faster". But the reason the 6 GB is faster is because it has 11% more shaders. So it is 6% faster at 1080p... and if the reason was VRAM, then we should have seen the gap widen at 1440p but it doesn't ... same 6%.
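
The reasoning in the post above can be laid out as a small check. The shader counts are NVIDIA's published CUDA core figures for the two GTX 1060 models (1152 for the 3GB, 1280 for the 6GB); the 6% gaps are the numbers quoted in the post, and the `vram_limited` helper with its 1-point tolerance is just a hypothetical way of expressing "the gap should widen at higher resolution if VRAM were the cause".

```python
# CUDA core counts for the two GTX 1060 variants.
SHADERS_3GB = 1152
SHADERS_6GB = 1280


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


def vram_limited(gap_1080p, gap_1440p, tolerance=1.0):
    """Hypothetical test: does the performance gap (in percent) widen at the
    higher resolution by more than `tolerance` points?"""
    return gap_1440p - gap_1080p > tolerance


shader_gap = percent_more(SHADERS_6GB, SHADERS_3GB)
print(f"6GB model has {shader_gap:.1f}% more shaders")

# Performance gaps as quoted in the post: 6% at both resolutions.
print("Gap widens at 1440p:", vram_limited(6.0, 6.0))
```

With the quoted figures the gap does not widen, which is the poster's argument that the extra shaders, not the extra VRAM, explain the difference in that particular comparison.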

Again, I am not excusing nVidia's PR response to the problem ... er, issue ... but the fact remains, any problem you can create on the 970, you could create on the 980. But we have seen this kind of response many times ... EVGA on the 970 SC, where their excuse for 1/3 of the heat sink missing the GPU was that it was intentional. ASRock for bulging caps and broken boards, Intel for the giant P67 pre-B3 recall. Today it's become a modus operandi and no one does it better than the "alternative facts" crowd. But one side shoveling the stuff does not excuse the other side from the same behavior. The excuses were BS but so was the imaginary problem ... had there been a real problem, like the P67 B3, there would have been a recall.
 
Der8auer comments on the ongoing event.


tl;dr pretty much what the link I posted said. Real RMA numbers from real distributors show zero evidence of excessive RMA rates for RTX SKUs. As der8auer says, journalists - including TPU - should be careful of parroting rumours for the sake of clicks; it's basic journalistic responsibility to check facts before you decide to publish.
 
I wouldn't trust rumors, just yet, but there's an obvious conflict of interest here.

Right let's ask Nvidia, oh wait :wtf:
 
Saw this on nVidia Forums ... no clue as to where it came from as poster didn't list

"I have info which seems to confirm for now, defective batches starts on - 0323xxx, those that have long and healthy working so far - 0333xxx. "

Of course it was followed by I have an 0344 and :) ... Does seem that the issue is primarily with FE cards. I expect it will take a week or two to nail down the problem and address it ... assuming of course that it is not batch related.
 
Oh great.

TechSpot - Researchers show Nvidia GPUs can be vulnerable to side channel attacks

UCR said:
We extend this attack to track user activities as they interact with a website or type characters on a keyboard. We can accurately track re-rendering events on GPU and measure the timing of keystrokes as they type characters in a textbox (e.g., a password box), making it possible to carry out keystroke timing analysis to infer the characters being typed by the user. A second attack uses a CUDA spy to infer the internal structure of a neural network application from the Rodinia benchmark, demonstrating that these attacks are also dangerous on the cloud. We believe that this class of attacks represents a substantial new threat targeting sensitive GPU-accelerated computational (e.g. deep neural networks) and graphics (e.g. web browsers) workloads.



Looks like GamersNexus made progress

GN said:
Testing the first defective 2080 Ti right now. We have been able to replicate some issues reported thus far, like flickering and "random" crashes to desktop. We have a lot more work to do. Have not replicated the artifacts yet.



GN said:
As of today, we have successfully reproduced 2 modes of failure on the RTX 20-series cards that were sent in.
 
The video touches on a few issues that are more or less mainly driver and monitor compatibility issues, especially with G-Sync or high refresh rate monitors, that cause BSODs. BSODs have also been easily reproducible on multi-monitor setups...

There are hardware issues present, but this is how far Steve has got in their tests so far. This video is just part one. I'm guessing part two is taking a closer look at the dead/defective cards with the help of buildzoid, and buildzoid is a God when it comes to teardowns right down to the component level.
 