
RTX 3080 Crash to Desktop Problems Likely Connected to AIB-Designed Capacitor Choice

And yet again ... the folks who have to be the first on the block to get the new shiny thing get hammered. Almost every new generation has problems, some big, some small ... some affect one brand, less often all cards in the series. Sometimes they are easy to fix (e.g. MSI's extremely aggressive adhesive on the tape holding the fans still during shipping); sometimes they are significant but fixable (e.g. EVGA's missing thermal pads on the 1xxx series); sometimes they require design changes (e.g. 1/3 of EVGA's 9xx series heat sink "missing" the GPU). Sometimes these affect just one AIB design ... sometimes they are series-wide, like AMD's inadequate 6-pin connector on the 480.

As the saying goes ... good things come to those who wait. If the PCBs are indeed faulty, they will be redesigned, and those who choose to wait won't have to deal with a first-stepping design issue. The alleged "cutting corners" by AIBs is simply not supported by history ... the AIB offerings, for the most part, have always outperformed the reference and FE designs. Yes, we have the deficient EVGA designs (9xx heat sink, 1xxx missing thermal pads, 2xxx Black series non-"A" GPU) which didn't measure up, but that's the exception rather than the rule. I have asked a few times: "What did MSI do differently that theirs is the only card to deliver more fps than the FE?" I did note that they had one of the lowest power limits ... perhaps the problem arises when that limit is exceeded? In any case, hopefully folks who were unable to snag one before they sold out will now cancel their orders, then sit and wait until the problem is defined, the affected cards are identified, and the issues are addressed in later offerings.
 

I'm pretty sure the fix is going to be an underclock, since the cards still perform above the promise on the box when underclocked. I don't think they would go back and retool their design unless they are going to make a new model that will cost more. Usually the easier fix, when acceptable, will be the path chosen. I hope they will make a better version. I would like to overclock since I find it fun to do even if it's not really needed. I think what you are saying about the power limit is the issue. Is this the highest power draw from a card to date? Or maybe it's a problem of the high power combined with the die shrink. From what I learned about CPUs, after a die shrink they don't need as much power to function because the chip becomes more power efficient. But efficiency flew out the window here.
 
Remember what Igor said: “By the way, you also have to praise a company here that recognized the whole thing from the start and didn’t even let it touch them, as the Asus TUF RTX 3080 Gaming consequently did without POSCAPs and only used MLCC groups. My compliments, it fits!” ASUS did a fantastic job. They knew what the problem was, or at least they predicted right. Their quality control caught the problem and they went the best-quality way at $50 more. Lesson learnt.
 
Nobody signed to be a beta tester, the advertising was "It just works!"
Saying it just works is the same as saying it barely works, exact same meaning. ;)
 
3000 series looked already too good to be true...

Oh it'll get to the point where it's worthwhile, I'm not too worried, as this is too-big-to-fail territory.

But it'll take a while, and time is on our side really. The more and longer Nvidia struggles, the more they will need to watch the AMD space. Lacking supply can also be an easy ticket to switch camps, at some point people do need a GPU even if the one available is not their first choice - which has even yet to be seen, mind.

If Nvidia needs a downclock or limit to peak clocks they're losing % against competition which might just nudge things in Navi's favor. Interesting times! I hope our resident reviewer is happy to revisit those FE's.... :D


You'd think folks would know better by now, but no. So it's well deserved really. Early adopting is great, as long as it's not me ;)

Saying it just works is the same as saying it barely works, exact same meaning. ;)

The power of emphasis in speech... haha

True. Finding the optimal balance between cost, quality and value to the end user can be a very serious challenge. Everyone wants to make money and as much as possible. In the case of video card AIBs, they want to make money but also boost their brand. Most actually care about making a quality product and hate it when things like the problems being faced currently happen.

You don't see the inside of big companies a lot do you...

I do... and yes 'they hate it'... until they get in the car and drive home. It's a 9 to 5 job, this hating of the work people have or haven't done, and the bottom line is just people screwing up and management not giving it enough mind to fix it. Or, management killing the workforce with too much work and/or too little time. The assumption that everyone can do their job properly is a bad one; the assumption should be 'double check everything or it will likely go wrong'. This is what you do when you release software or code, too. You make sure there is no room for error through well defined processes - and even thén, something minor might just slip through the cracks.

ALL of this is self-inflicted, conscious, well calculated risk management - even that last 1% that does get past and goes wrong. The bottom line is cost/benefit; it just doesn't always work out like people think it does. In the end, it is only and always the company producing something that is fully responsible. Nobody should ever have to find excuses for any company making mistakes. They're not mistakes. They were thoroughly looked at, and some people in suits together said 'We'll run with this', and poof, the consumer can start shoveling poop. Meanwhile, a healthy profit margin was already secured as 'the bottom line'...

Case in point here, because the only reason this is happening is because cards get pushed beyond or too close to the edge. That is directly, and only a cost/benefit scenario: performance per dollar. Even despite this capacitor detail, really, which kinda comes on top of it. The fact the line is thís thin, is telling in terms of overall product longevity, as well. That alongside with the heat of memory and several other decisions made with this 3080 really keeps me FAR away from it, so far.

It doesn't look good at all. It's a bit like cheap sports cars. Lots of HP for not a lot of cash... but your seat is shit, the tank is empty before you've reached the end of the street and after a year you're replacing half the engine.
 
Late to the party, but with these multi-billion transistor GPUs, I think they get pushed a little harder than they probably should be. I bet new drivers or firmware will just dial back the boost algorithm for the sake of stability. The cards can still push to the advertised boost on an easy task, like Luxmark Mirrorball, but you will rarely see it in games.
 

Consider ...

AMD did both ... the immediate fix on the 480 was to cut power delivery with BIOS and driver updates, but later on, vendors switched to 8-pin designs.
EVGA did with the 970 ... first they argued that they "designed it that way", but later they came out with a new design.
EVGA did again, with the malfunctioning 1060 - 1080s ... the first offer was a recall or a thermal pad kit you could install yourself ... later all cards came with thermal pads.

This is very likely.

I think that's automatic ... as above, AMD did the same thing with the 6-pin 480 fiasco ... but they followed with a move to 8-pin cards later on. Same with the EVGA mishaps ... I just don't see everyone sitting and leaving this alone ... at the next board meeting, there will be at least one person in the room saying "we need to take this step to distinguish ourselves above the others" ... but the reality is there will be one of those guys in every boardroom. I'm still curious as to why no one was able to beat the FE fps-wise ... while most of the other AIBs allowed for a greater wattage limit, MSI left theirs 20 watts BELOW the FE ... maybe MSI saw something no one else picked up?
 
Early adopters are beta testers these days.

Alas, that's pretty much true of any product these days and a lot of software. It's a pathetic situation, and Nvidia rushed the product out to try to get a lot of hype generated and garner quick sales before Big Navi came along. All to the customers' detriment.
 
No company shouts more about their work with partners, Devs and AIB.
The reference spec design they passed AIB was different to their own reference card's.
And they compressed development and testing time to near zero.
And they allowed such design variation in their development reference kit instead of both knowing that it needed specific voltage conditioning and informing AIB partners or limiting those AIB designs.

It's not all on Nvidia but they share the blame.

But if the AIBs had actually tested them fully they would have hit the issue; maybe they knew about it and thought fck it.
 
FYI, all models can CTD when approaching or just surpassing 2GHz...

 

At some point, people need to realize that Ampere just doesn't clock quite as well as Turing at ambient.
Also, by default the 3080s use the lower-bin GA102 dies.
 
That's a big fail for the "largest generational leap", though. Reminds me of this great video:
 

Samsung 8N is still a superior node to TSMC 12nm, which Nvidia used for Turing and which beat the living crap outta Navi 7nm :D. If you think Navi is as efficient as Turing, look at the laptop GPU segment, where mobile Navi is almost non-existent.

Samsung 8N is fine; the cards seem to run cooler even with increased power consumption compared to TSMC 12nm FFN.

I expect all these CTDs will be fixed with a newer driver; it's not like the cause of these CTDs is that mysterious anyway. As for SPCAP vs MLCC, it sounds like Asus did an excellent job with their TUF line, kudos to them, and I guess they can't make 3080/3090 cards fast enough. I asked my local retailer and they said they won't have the 3090 TUF in stock for at least 2 months ~_~.
 
It's a poor node no matter how you look at it; it runs cool just because the coolers are very high quality and huge. As the video points out, it was a poor choice for Nvidia. I wonder what the yields are on it.

It will also make a horrible node for any mobile GPU. I'm curious what Nvidia will do to come up with a reasonable SKU for laptops, because these gobble way too much power as they are.
 
FYI, all models can CTD when approaching or just surpassing 2GHz...


It's almost like computer silicon gets unstable when you clock it past its limits.
Almost like this has been true since silicon has been used in computers.
Almost like overclock instability related to silicon limits has nothing to do with capacitor choice.
Almost like this is a non-issue that has been blown way out of proportion.

As for those people who will say "but some people get over 2GHz": silicon lottery.
As for those people who will say "but MUH CLOCKS NVIDIA IS RIPPING ME OFF": NVIDIA never guaranteed you'd get over 2GHz boost, NVIDIA in fact never even guaranteed you'd get anything more than the rated base or boost clocks. Nobody does.
 
The AdoredTV video did not account for the fact that Nvidia has the whole Samsung 8N capacity to themselves; they would be able to produce many more Ampere chips with Samsung 8N than they would with TSMC 7nm+. Nvidia was a late customer for TSMC 7nm, so they wouldn't have been able to secure much capacity.

On the subject of thermals and noise:
3080 TUF has better thermals and noise than the 2080 Ti Strix
3080 Gaming X Trio the same, better than the 2080 Ti Trio
3080 Zotac Trinity, same thing

So far all reviewed samples of the 3080 show very good thermal and noise characteristics; the 3090 samples are hotter and louder, but that is to be expected.
Ampere has around 20% higher perf/watt than Turing. Yes, it is a little on the low side, but it is a compromise people have to accept to get better perf/dollar. I expect any AMD GPU owner would understand this :D
 

It will certainly be interesting to see how Samsung 8LPU matures for Nvidia. GA102 is still drawing >500W peaks @ ~20ms, so it's likely a number of factors, including the PSU (especially split rails). The transients of a 23b-transistor die, especially the lower bin tiers, are likely causing conniptions at board/motherboard/PSU level. The stock boost algorithm will likely need to be less aggressive and the max P-state lowered/locked. The above linked review focuses on temps as an arbiter of stability, for some reason, not power. Perhaps if it wasn't an open-air testbed...
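
To put the split-rail remark in numbers, here is a rough back-of-the-envelope sketch. It only applies I = P / V to the 500 W peak quoted above; the per-rail over-current (OCP) limits are hypothetical values chosen for illustration, and it pessimistically assumes the whole spike lands on a single 12 V rail (in practice some power comes through the PCIe slot, and OCP circuits have their own response times).

```python
# Rough check of a transient power spike against a 12 V rail's OCP limit.
# The 500 W figure is the peak quoted in the post; the rail limits below
# are hypothetical and not taken from any specific PSU.

RAIL_VOLTAGE = 12.0  # volts, nominal 12 V rail


def rail_current(power_watts: float, voltage: float = RAIL_VOLTAGE) -> float:
    """Current in amps needed to deliver power_watts at the given voltage (I = P / V)."""
    return power_watts / voltage


def trips_ocp(power_watts: float, ocp_limit_amps: float) -> bool:
    """True if the spike would exceed the rail's over-current protection limit."""
    return rail_current(power_watts) > ocp_limit_amps


if __name__ == "__main__":
    peak = 500.0  # watts, from the post above
    for limit in (30.0, 40.0, 50.0):  # hypothetical per-rail OCP limits
        needed = rail_current(peak)
        print(f"{limit:.0f} A rail: {needed:.1f} A needed, trips OCP: {trips_ocp(peak, limit)}")
```

Under these assumptions a 500 W spike needs roughly 41.7 A, which is why a split-rail unit with a modest per-rail limit is a plausible suspect while a single strong rail is less so.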

Ampere has around 20% higher perf/watt than Turing, yes it is a little on the low side but it is a compromise people have to accept to get a better perf/dollar
Only if you drink the Tu pricing koolaid. As for Samsung 8N all to themselves - if you define it that way, I guess...
 
I'm pretty sure the fix is going to be an under-clock, since the cards still perform above the promise on the box when underclocked. I don't think they would go back and retool their design unless they are going to make a new model that will cost more.

Personally, I think this occasion will be a good test of how each brand responds to its customers.
Real and responsible brands will offer a specific set of solutions or choices to their customers.

The low-end ones might simply bury their heads in the sand; an under-clock will be their only offering, or a refund if you are lucky.
 
This text does not make any sense. From now on, all of you, please use the AIBS or AIB acronym in its full form so that confusion is avoided.
a) AIB refers to 'non reference' graphics card designs.
b) An AIB supplier or AIB partner is a company that buys the AMD (or Nvidia) Graphics Processor Unit to put on a board and then brings a complete and usable Graphics Card, or AIB, to market.
You must be unaware of any sort of irony
But if the AIBs had actually tested them fully they would have hit the issue; maybe they knew about it and thought fck it.
The ifs are massive, but your idea of creating blame based on maybes doesn't sit right with me.

Nvidia rushed their own FE development, yet gave time to the AIBs.

Yeah right, it's a rushed launch. I'm sure blame will be thrown about, but I'm not buying, so my concern and care levels are minimal. I have an opinion, yes, but I have said it. Leave me out of the debate until you have something other than your opinion to discuss, because I don't give a shit what you think. I stated facts.
 
Thanks for making it easier for me to give an example.
Since this is about power delivery, it has to "match" the power requirement of the normal operating behaviour.
Since the testing utility is good, but doesn't test at the temperature ranges an overclocked case can rise to, we'll have to restrict ourselves to more moderate speeds than the utility would have us believe ... From there, I would play with either the fan curve or the voltage, or, if on the CPU, with LLC (you couldn't pick its temperature gradient if you didn't log everything up to this point). But basically I find it more exciting to bust cards using this method than to use them daily, lol.

You are welcome; this is the old-school OC playbook, a how-to for hacking a VGA card.
Let's return to today and the latest edge of GPU architecture.
Due to its high pricing, the RTX 3080 is now considered an investment.
NVIDIA did use additional tricks to protect its work (the product) so as to minimize the failure rate; it is extremely costly to handle a return to base and exchange for a graphics card worth 1000 Euro.
I would not be surprised if people later discover that even BIOS flashing on those cards is locked by a password.

I wrote too much in this topic; now I will simply take a seat at the back of the bus and wait to inspect the quality of product support that all major brands will deliver to their customers.

You must be unaware of any sort of irony
No school is good enough to teach us foreigners to detect sentiment in written text.
My advice to Americans: use neutral, clear text to describe the point you are trying to make.
TPU is read internationally; this is not a neighborhood of Dallas, Texas.
 
My advice: don't pull someone up for using AIBS and then rant at us, telling us we have to use the same abbreviation you just pulled someone up for.
You are not the English language police; you can tell me how to do nothing, sir.
And I'm English, not American.
 
Only if you drink the Tu pricing koolaid. As for Samsung 8N all to themselves - if you define it that way, I guess...

3080 is like 90% faster than 1080 Ti, selling at the same price; this is a very sizeable performance gain for cards just 2 generations apart. If you skipped Turing then Ampere is the logical upgrade from Pascal, which Jensen Huang specifically pointed out during his presentation :D.

Yeah Turing was known for its terrible perf/dollar, lucky for Nvidia that Navi was not that much better anyways...
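
For anyone who wants to redo the perf/dollar comparison with their own numbers, a minimal sketch follows. The 1.9 performance ratio and the equal price are simply the "like 90% faster ... at the same price" claim from the post above, not measured data, and the function name is made up for illustration.

```python
def relative_perf_per_dollar(perf_ratio: float, price_ratio: float) -> float:
    """Perf-per-dollar of the new card relative to the old one (1.0 means no change).

    perf_ratio:  new performance / old performance (1.9 means "90% faster")
    price_ratio: new price / old price (1.0 means "same price")
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return perf_ratio / price_ratio


if __name__ == "__main__":
    # Figures as claimed in the post: ~90% faster at the same price.
    print(relative_perf_per_dollar(1.9, 1.0))
```

Plugging in a different price ratio (say, street prices instead of launch MSRP) shows how quickly the advantage shrinks when the "same price" assumption does not hold.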
 
I just wonder how hard it would be to write a green team version of ATi Tray Tools with its built-in overclock error monitoring tool? Nvidia could even purchase the software wholesale from Mr. Ray Adams. Not that big of a deal. There are people who would enjoy breaking the cards for them. As a point of reference, try running 'vsync on' unless you want to break the solder joints too soon.
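
As a very small illustration of the logging half of such a tool (this is not how ATi Tray Tools works, and it does not detect rendering errors), the sketch below parses CSV output from `nvidia-smi` and flags samples at or above the ~2GHz range discussed in this thread. The query shown in the comment uses standard `nvidia-smi` fields; the function names, the threshold constant, and the sample lines are made up for illustration.

```python
import csv
import io

# Log can be captured separately with, for example:
#   nvidia-smi --query-gpu=timestamp,clocks.gr,power.draw,temperature.gpu \
#              --format=csv,noheader,nounits -l 1
# This script only parses such output; it does not call nvidia-smi itself.

CLOCK_THRESHOLD_MHZ = 2000  # the "~2GHz" mentioned in the thread; adjust as needed


def parse_samples(csv_text):
    """Parse 'timestamp, clock_mhz, power_w, temp_c' lines into a list of dicts."""
    samples = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        timestamp, clock, power, temp = (field.strip() for field in row)
        samples.append({
            "timestamp": timestamp,
            "clock_mhz": int(clock),
            "power_w": float(power),  # assumes the card reports a numeric power draw
            "temp_c": int(temp),
        })
    return samples


def samples_near_limit(samples, threshold=CLOCK_THRESHOLD_MHZ):
    """Return the samples whose graphics clock is at or above the threshold."""
    return [s for s in samples if s["clock_mhz"] >= threshold]


if __name__ == "__main__":
    # Made-up sample lines in the same shape as the query above.
    log = (
        "2020/09/27 12:00:00.000, 1905, 318.40, 67\n"
        "2020/09/27 12:00:01.000, 2010, 331.12, 68\n"
    )
    for sample in samples_near_limit(parse_samples(log)):
        print(sample)
```

Correlating the flagged timestamps with the moments a game crashes to desktop would at least show whether a given card misbehaves only in that clock range.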
 