
Reports of Bricked NVIDIA GeForce RTX 5090 and RTX 5090D Surge

These aren't hardware failures. They're PCIe gen 5 compatibility issues with certain motherboards/BIOS. Der8auer covered this in his 5090 review video.

Apparently not every case is a simple PCIe incompatibility.

And Der8auer pointed out in a following video that many AIB partners had practically no time to test their designs due to short timelines, which means that buyers are basically the testers. Even expensive models might ship with serial flaws, and reviewers usually aren't the ones who point them out, for obvious reasons.
 
No doubt there are hundreds, maybe thousands of different motherboard SKUs with a PCIe Gen 5 slot and each board could have many BIOS updates, but how many motherboards does NVidia test before they release a GPU driver update?

Do the problems occur mostly on motherboards with an out-of-date BIOS that is several years old, or is it a hardware problem?

I bought my mobo in November 2022 and I'm wondering if I should pass on a PCIe Gen 5 GPU upgrade, even if I update the BIOS.
 
This is such a stupid non-argument. Of course NVIDIA tested it with PCIe 5.0 motherboards, otherwise every man and their dog would be complaining. The fact the latter is not happening should, but apparently hasn't, clue you into the fact that maybe, just maybe, these isolated reports are either user error or incompatibility issues with specific motherboards and their firmware.
Here we go again…blame the motherboard manufacturers. Intel tried this and failed. I don’t see Nvidia faring any better.

3000 Series - Teaches people backside capacitor knowledge
4000 Series - Teaches people cables, connectors, electrical resistance and contact surfaces knowledge
5000 Series - Teaches people IC burn and signal integrity ?????


Love the comment!

For the life of me, I still can’t understand why Nvidia has such a cult following worse than even Apple fans.
 
Assuming it is 'just' a PCIe 5.0 issue, and not component failures on the GPUs...
My question, then, is whether it is an issue with PCIe 5.0 on specific motherboards.

Other than PCIe 5.0 NVMe drives, to the best of my knowledge, there have been no consumer PCIe 5.0 devices released with which the motherboards could be tested.
I find it conceivable that some motherboard designs tried to cut costs too hard and have compromised PCIe 5.0 connectivity.
Since these are the first 5.0 GPUs using the x16 slot, perhaps a preexisting issue is just now coming to light?
Thank you for having one of the only sane posts in this thread.
 
i stand corrected, there is a God after all
 
Because only AMD makes mistakes in their drivers, Nvidia is perfect, obviously. ;)

Plausible, but why does it present itself only upon driver install, then?

By the way, it's funny that it's always x90 cards that have these problems, never below.
 
If this is a hardware issue and custom designs were custom, they wouldn't all be vulnerable to the same flaw. Would they?
 

Because they are the first to come out, so there is time to fix the others.
 
A poster postulated earlier that the GPU defaults to a slower PCIe link speed for compatibility until the driver is loaded and initialized.
If this were the case, loading the driver could switch it from a working, slower PCIe speed to a non-functional PCIe 5.0 link.
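
If anyone on Linux wants to check that theory on their own box, here is a rough sketch and nothing official. It reads the `current_link_speed` and `max_link_speed` attributes the kernel exposes in sysfs for each PCI device and maps the GT/s figure to a PCIe generation. The helper names (`parse_link_speed`, `link_status`) and the device address in the usage note are made up for illustration.

```python
import re
from pathlib import Path

# Per-lane signalling rate (GT/s) for each PCIe generation.
GEN_BY_RATE = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4, 32.0: 5}


def parse_link_speed(text):
    """Map a sysfs string such as '32.0 GT/s PCIe' to a PCIe generation.

    Returns None when the string is not in the expected form.
    """
    match = re.match(r"\s*([\d.]+)\s*GT/s", text)
    if not match:
        return None
    return GEN_BY_RATE.get(float(match.group(1)))


def link_status(device_dir):
    """Compare the negotiated link speed with the maximum the device reports.

    device_dir is a sysfs device directory (hypothetical example in the
    usage note); the two attribute files are standard Linux sysfs entries.
    """
    dev = Path(device_dir)
    current = parse_link_speed((dev / "current_link_speed").read_text())
    maximum = parse_link_speed((dev / "max_link_speed").read_text())
    below_max = current is not None and maximum is not None and current < maximum
    return {"current_gen": current, "max_gen": maximum, "below_max": below_max}
```

Calling `link_status("/sys/bus/pci/devices/0000:01:00.0")` (substitute your GPU's address from `lspci`) before and after the driver loads would show whether the link really steps up. Keep in mind that a link running below its maximum at idle can simply be power management, so `below_max` on its own does not prove a fault.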

Unless this sets a persistent marker on the GPU itself though, display output would/should still be working fine until the OS loads the drivers.
It could then be that those having issues have Fast Startup enabled and are shutting down instead of restarting?

Or, perhaps, upon initializing PCIe 5.0 on certain motherboards, some component is being overloaded, and shorting out.

Without having a system fail in the hands of a reputable source, it is really hard to tell what is going on.
Hopefully it is properly investigated by NVIDIA, the AIBs, and motherboard manufacturers, with a timely fix being provided.
 
I guess even trillions of dollars of valuation can't guarantee functioning products. Thank God there weren't that many of these available for purchase anyway.
 

I bet you the AI stuff gets triple-checked.
It's the gamers' fault for being so cheap; the 5090 should have cost at least 10,000 USD to get proper QC.
 
Good job shilling for arguably the scummiest $3 trillion company in the world & that's saying something considering the other one is a cult :rolleyes:
You don't become a trillion-dollar company by being that stupid and willingly selling hardware that fails faster than a Raptor Lake CPU, though. The RTX 4000 not being foolproof was one thing. You can call that negligence for not testing how that plug would behave when mistakes happen. But this? That's the kind of stuff that makes a company go bankrupt if it happens regularly. The product needs to at least work. Companies that know how to build mindshare know that you can't afford to pile up that kind of screw-up. It destroys the mindshare. Ask Samsung.

It's the kind of stuff that made them lose Apple as a customer back in the day (GeForce laptops would suffer sudden death because the chip soldering was giving out: Bumpgate). They lost billions, had to ship GPUs for free to replace those that died, and reportedly got verbally assaulted by Apple staff.
 
It's not a minor problem. Even with such limited stock, we're already seeing reports of dying GPUs popping up by the dozen on the internet (Reddit, forums). Imagine what would happen in a normal scenario.

Grab your torches and rakes and head for Jensen's house!
 
I could have sworn there was a post on TPU about enterprise-level 5000 cards overheating. Given that only the most desperate bought these, it does not bode well if we are already getting reports of failures.
 

But I think the times are slowly becoming very different now. Large companies are slowly gaining more and more power to present the reality as they see fit, and there is very little journalism that stands in their way. Several large screw-ups were publicized just by a handful of brave YouTube influencers / reviewers that are kind of celebrities companies still don't dare to take down. For now. Old type review sites? They are usually playing it safe, wait for somebody else to report the issue, and then publish the "according to" article, if even that. And the review sites that dared to be on the side of customers? You can ask AnandTech, HardOCP and many others how that worked out...

Even the issues that get published are often brushed aside as rare cases or user errors, and the comments are full of whataboutism from fanboys who feel they have to defend their favourite megacorporation...
 
That has little to do with (size of) companies and everything to do with journalism. The less you care about your reputation, the more you can afford to publish news before it's properly verified.
If you want to bring companies into this, they are held to an even higher standard, as a simple "our products seem to fail" post without a proper root-cause analysis can earn you criminal charges, since it can be interpreted as an attempt to manipulate stock.
 
One of my posts was already hidden by the moderator even though I was just trying to say all companies have problems. Many sites must be worried about showing that there is no real difference between the different manufacturers. It's all hype, and it's even worse with W1zzard blaming other companies for not having Nvidia proprietary technologies (i.e. DLSS).

There is a lot of protectionism around Nvidia so it’s hard to get good verifiable information whether it’s true or false.
 

Can't imagine that being the danger when Intel denied their CPUs were failing, blamed users, even denied RMAs from reviewers, and had to be practically bullied into a confession after several months of denial. Even then they brushed it aside as very rare instances.
 
At least Patty Cakes was fired for that and probably other things. The big leather jacket will probably get the medal of freedom.
 
It is not even journalism in some cases; it is propaganda. We live in this Spy vs Spy world where, if this thread had AMD in the title, the bombastic comments would already have started. What is also true is that regardless of how badly Nvidia does or how desultorily they behave, people will always defend them. They did have Hardware Unboxed come back early from fly fishing in the Indian Ocean, though. They literally created a new spec to deal with the new connector that was allegedly burning in some cases.
 
Not sure why it would show itself only upon driver install. Odd things happen.

I had a MB back at the tail end of the Vista era (I started using Vista about a year before 7 released) when I upgraded to a new CPU/MB. I had multiple GPUs to test with (I think a GTX 280 and an 8800 GTS 512MB, but I'm not 100% sure those were the cards I had at the time; it's been a long while, so my memory might be off) when I had issues with a new MB.
Got everything set up and Vista installed - booted into Windows without a hitch.
Installed all the driver updates from the disc that came with the MB - no issues.
Installed the GPU driver and the system completely locked up. Windows would freeze most of the time or it would BSOD.
Maybe a GPU driver issue? Tried multiple WHQL drivers - same problem.
Maybe a GPU issue? Swapped GPUs, tried multiple drivers - same problem.
Maybe an OS issue? Wiped the drive, installed XP - no issue. Install GPU driver and the system would lock up. Tried the other GPU and multiple drivers - same problem.

I don't know what the problem was, but when I installed any driver for my cards the whole thing would lock up or occasionally BSOD. I contacted ASRock about it, their tech team asked me a slew of questions and offered suggestions and nothing worked. They said RMA. Got the replacement MB and the problem was no more.

This could be a driver issue or a MB issue with the 5090 having problems. Guess we just wait and see. Too bad for those that have a broken card, they're the ones that really have to suffer. Hopefully they have a backup GPU they can use for now.
 

These are my thoughts exactly, particularly for lower-cost motherboards. Tolerances in the Gen 5 spec are very strict. I can see cheaper B650E, X670E and Z690 motherboards possibly faltering here. Though it's probably nothing some timing adjustments won't fix, on either the BIOS or the VBIOS side.
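
For a sense of scale on the tolerance point, a back-of-the-envelope sketch (plain arithmetic, not a signal-integrity model): each PCIe generation doubles the per-lane transfer rate while the slot and traces stay physically similar, which is one reason the margins get tighter. The table uses the published per-lane rates and line encodings; the function name `bandwidth_gb_s` is just illustrative.

```python
# (per-lane rate in GT/s, line-encoding efficiency) for each PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def bandwidth_gb_s(gen, lanes=16):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[gen]
    # GT/s per lane * usable fraction * lane count, divided by 8 bits per byte.
    return rate_gt_s * efficiency * lanes / 8


for gen in (3, 4, 5):
    print(f"Gen {gen} x16: {bandwidth_gb_s(gen):.1f} GB/s")
```

Under these assumptions a Gen 5 x16 link comes out at roughly 63 GB/s per direction against roughly 31.5 GB/s for Gen 4, so the slot has to carry twice the signalling rate cleanly.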


As someone mentioned in the thread it's because of native power management and the fact that the PCIe link speed won't max out without the driver installed.

No Gen 5 GPU products even existed up until now, so they were bound to run into issues once deployed in the field. Gen 5 PCIe testing equipment in labs is one thing; real-world usage is likely another beast altogether.
 