Friday, September 25th 2020

RTX 3080 Crash to Desktop Problems Likely Connected to AIB-Designed Capacitor Choice

Igor's Lab has posted an interesting investigative article in which he advances a possible reason for the recent crash-to-desktop problems reported by RTX 3080 owners. For one, Igor mentions that the launch timings were much tighter than usual, with NVIDIA's AIB partners having much less time than would be adequate to prepare and thoroughly test their designs. One of the reasons this apparently happened is that NVIDIA released the compatible driver stack to AIB partners much later than usual. This meant that their testing and QA for produced RTX 3080 graphics cards was mostly limited to power-on and voltage stability testing, rather than actual gaming/graphics workload testing. That might have allowed some less-than-stellar chip samples to be employed on some of the companies' OC products (which, with higher operating frequencies and consequent broadband frequency mixtures, hit the apparent 2 GHz frequency wall that produces the crash to desktop).

Another reason for this, according to Igor, is the actual "reference board" PG132 design, the "Base Design" that partners build their custom cards around. Apparently, NVIDIA's BOM left open choices for the capacitors that handle power cleanup and regulation. The Base Design features six mandatory capacitors for filtering high frequencies on the voltage rails (NVVDD and MSVDD). There are a number of choices for capacitors to be installed here, with varying levels of capability. POSCAPs (Conductive Polymer Tantalum Solid Capacitors) are generally worse than SP-CAPs (Conductive Polymer Aluminium Electrolytic Capacitors), which are in turn superseded in quality by MLCCs (Multilayer Ceramic Chip Capacitors, which have to be deployed in groups). Below is the circuitry arrangement employed underneath the BGA array where NVIDIA's GA102 chip is seated, which corresponds to the central area on the back of the PCB.
In the images below, you can see how NVIDIA and its AIBs designed this regulator circuitry (NVIDIA Founders Edition, MSI Gaming X, ZOTAC Trinity, and ASUS TUF Gaming OC, in that order, from our reviews' high-resolution teardowns). NVIDIA's Founders Edition design uses a hybrid capacitor deployment, with four SP-CAPs and two MLCC groups of 10 individual capacitors each in the center. MSI uses a single MLCC group in the central arrangement, with five SP-CAPs handling the rest of the cleanup duties. ZOTAC went the cheapest way (which may be one of the reasons their cards are also among the cheapest), with a six-POSCAP design (POSCAPs being worse than MLCCs, remember). ASUS, however, designed their TUF with six MLCC arrangements; no savings were made in this area of the power circuitry.
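For readers who want the four layouts side by side, the capacitor counts described above can be tabulated in a few lines of Python. This is only an illustrative sketch: the names `LAYOUTS`, `POSITIONS`, and `mlcc_share` are made up for this example, the counts are taken directly from the paragraph above, and the MLCC share is a simple tally, not a measure of electrical quality or a prediction of stability.

```python
# Capacitor choice per filter position, as described in the article.
# "MLCC group" counts one group (10 individual MLCCs) as one position.
POSITIONS = 6  # the Base Design has six mandatory filter positions

LAYOUTS = {
    "NVIDIA Founders Edition": {"SP-CAP": 4, "MLCC group": 2},
    "MSI Gaming X": {"SP-CAP": 5, "MLCC group": 1},
    "ZOTAC Trinity": {"POSCAP": 6},
    "ASUS TUF Gaming OC": {"MLCC group": 6},
}


def mlcc_share(layout):
    """Fraction of the six positions populated with MLCC groups."""
    return layout.get("MLCC group", 0) / POSITIONS


for card, layout in LAYOUTS.items():
    # Sanity check: every layout must fill exactly six positions.
    assert sum(layout.values()) == POSITIONS
    parts = ", ".join(f"{count}x {kind}" for kind, count in layout.items())
    print(f"{card}: {parts} (MLCC share: {mlcc_share(layout):.0%})")
```

The tally only restates the teardown observations in a compact form; it says nothing about which cards actually crash.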

It's likely that the crash-to-desktop problems are related to both of these issues. This would also explain why some cards cease crashing when underclocked by 50-100 MHz: at lower frequencies (which will generally keep boost frequencies below the 2 GHz mark) there is less broadband frequency mixture happening, which means POSCAP solutions can do their job, even if just barely.
Source: Igor's Lab

297 Comments on RTX 3080 Crash to Desktop Problems Likely Connected to AIB-Designed Capacitor Choice

#1
The Quim Reaper
Is that a multitude of BIOS Firmware updates with down-clocking, I see Incoming...:D
#2
Vya Domus
For one, Igor mentions how the launch timings were much tighter than usual, with NVIDIA AIB partners having much less time than would be adequate to prepare and thoroughly test their designs. One of the reasons this apparently happened was that NVIDIA released the compatible driver stack much later than usual for AIB partners; this meant that their actual testing and QA for produced RTX 3080 graphics cards was mostly limited to power on and voltage stability testing, other than actual gaming/graphics workload testing, which might have allowed for some less-than-stellar chip samples to be employed on some of the companies'
Yeah, sure. I feel like this is the number one scapegoat as of late, "We just didn't get enough time". As if they needed time to know that some capacitors are better than others. Somehow they always have less and less time, I am assuming that time will become zero at some point. :laugh:

Anyway, I am still not convinced this is the culprit, but regardless, NVIDIA sure managed to piss off almost everyone one way or another, customers and partners, and it's all down to their choice of designing these things with monstrous power requirements. They don't really seem to care because they know people will still cater to their whims since they got most of the market share. And remember:

The more you buy the bigger the chances that they'll work.
#3
Xuper
oh this launch is mess..
#5
moproblems99
If this is actually the reason then this is unacceptable in any industry by any player. Let alone the market leader.

If they wanted to be secretive, then they should have launched Founders Edition only. Oh, except they would have had even less stock.
#6
LabRat 891
Gotta love EMI/RFI design oversights. From what I've read, it is the bane of every freshly college-educated EE and many a veteran EE. I bet somebody on the design teams knew that this would cause a problem and was promptly ignored after referencing datasheets claiming "It'll be fine!"
#7
Animalpak
3000 series looked already too good to be true...
#8
Amite
Don't think I will be buying any 3080s or NVDA stock anytime soon.
Of mice and men.
A big merger and a big GPU launch plus a pandemic what could go wrong ? LOL
#9
newtekie1
Semi-Retired Folder
Early adopters are beta testers these days.
#10
Sykobee
I mean, it's in the name - POSCAPS.

But it's really pretty poor that this has happened, and it's a poor rushed launch by Nvidia - when if the card was good enough to compete with RDNA2 coming soon, they could have waited a few more weeks to get it right and to get a stockpile for launch.
#11
KarymidoN
So they didn't have enough time because of what? It's not like AMD had already released BIGNAVI/RDNA2. It's not like Pascal owners were screaming for new graphics cards bc their cards were not powerful enough. Sure, Nvidia wanted to undercut the new consoles, but they don't have a competitor for the new consoles (not in price).
They rushed the launch, botched the market claims, creating hype without having enough stocked products to sell... Nvidia taking a page out of AMD's book (see recent release of Ryzen APUs and mobile chips that are always out of stock).
#12
john_
Funny that in the end consumers will be praising bots for avoiding this first batch of boards.
#14
windwhirl
I don't get why AIBs went cheap for this board. I mean, it's the second highest-tier GPU! You should never go cheap in that kind of product!
#15
Chomiq
If it's related to underdeveloped AIB partner designs, then why are FE users also reporting CTDs? Unless everyone is overclocking their brand new Ampere GPUs.
#16
Assimilator
Chomiq
If it's related to underdeveloped AIB partner designs, then why are FE users also reporting CTDs? Unless everyone is overclocking their brand new Ampere GPUs.
Too many unknowns to tell. Igor's speculation is just that, speculation - but somehow his "possible" gets turned into "likely" by TPU's clickbait editors. Once again, shameful yellow journalism on par with WCCFTech.
#17
Dave65
And people say only AMD cards have problems:shadedshu:
#18
Julhes
There will be the same problem with the RTX 3090. The arrangement of the capacitors and the type are the same.
Dave65
And people say only AMD cards have problems:shadedshu:


Oups.........
#19
Chrispy_
Yet more solid confirmation that Nvidia really rushed the whole 30-series launch.

It's uncharacteristic of Nvidia, so what do they know about RDNA2 that makes them in such a hurry to get this horse out of the gate before it's ready for prime time?
#20
Mysteoa
Vya Domus
Anyway, I am still not convinced this is the culprit, but regardless, NVIDIA sure managed to piss off almost everyone one way or another, customers and partners, and it's all down to their choice of designing these things with monstrous power requirements. They don't really seem to care because they know people will still cater to their whims since they got most of the market share. And remember:
Isn't it what they have always done? They are constantly trying to burn bridges, just so the issue is not their fault.
#21
_UV_
windwhirl
I don't get why AIBs went cheap for this board. I mean, it's the second highest-tier GPU! You should never go cheap in that kind of product!
Because they want money, and every cent saved in a process is money. I'll give another example: most mid- to high-end (not top-notch) AMD platform mobos since the Athlon era have been produced with cost savings compared to Intel designs in one way or another, be it cheaper caps or FETs, fewer integrated controllers such as onboard WiFi or dual LAN, a cheaper sound codec, etc...
#22
nguyen
Just tried to order an ASUS TUF 3090, but my local retailer told me they had 3 in stock and sold out pretty quick; next batch will be in November :D.
#23
bug
Tight launch timing, my ass. What happened to "if you don't have the time to do the work, don't release"?
#24
HugsNotDrugs
It's too bad there isn't an OEM that uses only premium parts. I'd be happy to pay for a higher quality product (rather than a higher marketing budget) if such an option existed.