
PGL Investigating GeForce RTX 4080 GPU Driver Crash, Following Esports Event Disruption

Lol, should have used enterprise GPU's with ECC memory :D

It sucks it might have cost them the win but shit happens, it's consumer hardware what were they expecting!?
 
Actually that’s exactly what they are saying. The delusion is high.
I have literally never seen that beyond obvious trolls.
 
If they were serious about stability, they would under-clock everything. Even then, as pointed out, solar flares. Run the game on a server processor with ECC memory. Get nVidia™ to enable ECC on the VRAM. Maybe even shield the cases or the building from high-energy particles.

But no, the selection of hardware is just another ad space.
 
Crashes happen.

But when money (potential) is involved..............stuff gets real!! ;)
 
Lol, should have used enterprise GPU's with ECC memory :D

It sucks it might have cost them the win but shit happens, it's consumer hardware what were they expecting!?

4090's can enable ECC. Not sure if it would have saved the crash in this instance, but it could have helped.
 
If they were serious about stability, they would under-clock everything. Even then, as pointed out, solar flares. Run the game on a server processor with ECC memory. Get nVidia™ to enable ECC on the VRAM. Maybe even shield the cases or the building from high-energy particles.
Like bit-flips from cosmic radiation during a time span of a few minutes would matter. :rolleyes:
Sounds fairly far fetched compared to other risks.

But what they could do is at least run the games on Linux (for the games that support it), which is a much more stable OS than Windows, not to mention that Nvidia's Linux driver is in all seriousness even more solid than its Windows counterpart. :cool:
 
Run the game on a server processor with ECC memory. Get nVidia™ to enable ECC on the VRAM

They wouldn't even necessarily need server/workstation processors since they're using Ryzen, and even with Intel they would just need to look for workstation-grade boards (which wouldn't be that much more expensive than the $600+ top-of-the-line motherboards they're probably using for looks and flair).

4090's can enable ECC. Not sure if it would have saved the crash in this instance, but it could have helped.

They're using 4080's, but 4090's come with ECC and ship with the error correction disabled!? That's so fucking stupid, they paid for the premium memory and simply disabled it out of spite? :wtf:

Like bit-flips from cosmic radiation during a time span of a few minutes would matter. :rolleyes:
Sounds fairly far fetched compared to other risks.

It's not just solar flares. In a live event environment like a stadium or whatever, there are many sources of electrical noise strong enough to push something that would otherwise be stable into a crash. Even alone at home, ECC is the better solution if you're looking for stability, and it's sad that, thanks mainly to Intel, this got mostly gatekept to workstations when it should be a standard feature.
 
It's not just solar flares. In a live event environment like a stadium or whatever, there are many sources of electrical noise strong enough to push something that would otherwise be stable into a crash.
Even if the entire audience brings 10 cell phones each, the distance is going to make the electromagnetic noise from these insignificant in this regard. Anything that does have an effect must be very close and have a strong field.

Even alone at home, ECC is the better solution if you're looking for stability, and it's sad that, thanks mainly to Intel, this got mostly gatekept to workstations when it should be a standard feature.
Don't get me wrong, I'm a big fan of ECC, and would strongly consider it for anything productive.
But the chances of ECC preventing a crash like this in this time frame are very low.

There is much that could be hardened on the software side though, including drivers, the OS and how drivers work in Windows, and of course the games themselves (speaking generally, not this case specifically).
 
Even if the entire audience brings 10 cell phones each, the distance is going to make the electromagnetic noise from these insignificant in this regard. Anything that does have an effect must be very close and have a strong field.

The biggest problem is not cell phones, it's all the lighting around the stage, for example.

Of course a computer can crash for any number of reasons, but if they're doing a live event with money involved they should be using enterprise workstation gear, not consumer-grade gaming stuff. Play stupid games, win stupid prizes.
 
Whenever my PC crashes, I always call Jensen Huang personally, and demand he tells me why it crashed. And that he brings his leather jacket and fixes it immediately! :rolleyes:
(sarcasm)

So a PC crashed during a tournament, this happens all the time…


Depends on whether the problem is reproducible or not. If it is, and only on this particular PC, then it's a hardware issue. The organizers must be professional enough to have an image for the software setup for all the tournament PCs, so configuration issues should be eliminated. And they probably have spares if one fails.
If the hardware is not at fault, then it could be either the driver or a bug in the OS.
Either way, if this is an obscure and hard-to-reproduce bug, then I doubt the dumps from the BSOD are going to result in anything useful.
To be objective, I don't disagree with what you said. A PC will crash for many reasons and in this case, it may be due to other factors that eventually caused the GPU to crash. I mean, they confirmed that the GPU crashed, so I am going along with the narrative. However, this is what I find very disturbing. Because when there is news of an AMD GPU crash, the blame will immediately be on AMD. So I'm not sure what the difference is here. In other words, AMD GPU/driver crash = AMD is bad, while Nvidia GPU/driver crash = must be other factors. See the hypocrisy here? Not directing this at you, but this is the general observation of people's reaction to this sort of news.

If they were serious about stability, they would under-clock everything. Even then, as pointed out, solar flares. Run the game on a server processor with ECC memory. Get nVidia™ to enable ECC on the VRAM. Maybe even shield the cases or the building from high-energy particles.

But no, the selection of hardware is just another ad space.
There is no perfection. You can plan and prepare as well as you can, but that does not mean things will go according to plan. Case in point: they "optimized" the PCs here, but there is really nothing you can do when you don't know what will go wrong.
 
Because when there is news of an AMD GPU crash, the blame will immediately be on AMD. So I'm not sure what the difference is here. In other words, AMD GPU/driver crash = AMD is bad, while Nvidia GPU/driver crash = must be other factors. See the hypocrisy here? Not directing this at you, but this is the general observation of people's reaction to this sort of news.
Please stop it with the straw man argument here. I've seen no one seriously making that argument, so there is no hypocrisy here.
It would be different if the article were phrased in a way to excuse Nvidia, but it isn't, in fact it only focuses on the possibility of the driver being responsible. So again, no hypocrisy.

There are three factors that could make the driver crash: the driver, the hardware or the OS. And this holds true for all GPU vendors. Notice I'm not listing the game, as a driver should handle a "misbehaving" application, so if a driver crashes due to a game bug, it's still a driver bug.

And I also still think this isn't newsworthy regardless, a PC crashed randomly during gaming, wow!
(If it happened during a special presentation, it would have been a little funny though.)
 
And I also still think this isn't newsworthy regardless, a PC crashed randomly during gaming, wow!
(If it happened during a special presentation, it would have been a little funny though.)

That's a classic and to be fair I don't even know if the drop in market value of Microsoft at the time isn't bigger than the prize money of this competition :D

But that's the thing, when there's money involved the stakes are much higher
 
Just highlights the difference between general computing, servers with failover, and life-critical computing for an airliner or chemotherapy machine.

The way consumer hardware and software is designed, bugs are an annoyance, not a mission critical event.
 
I feel like a PC crash outside of the player's control should not impact the results. It should require a rematch, or a game with a proper multiplayer checkpoint that can be restarted when >$1M is on the line.
 