
AMD Radeon "Navi" OpenCL Bug Makes it Unfit for SETI@Home

btarunr

A bug in the OpenCL compute API ICD (installable client driver) for the Radeon RX 5700-series "Navi" GPUs is causing them to return incorrect results for the distributed computing project SETI@Home. Since there are "many" Navi GPUs crunching the project and cross-validating each other's incorrect results, the large volume of incorrect results is able to beat the platform's algorithm and pass statistical validation, "polluting" the SETI@Home database. Some volunteers at the SETI@Home forums, where the issue is being discussed, advocate banning or limiting results from contributors using these GPUs until AMD comes out with a fix for its OpenCL driver. SETI@Home is a distributed computing project run by SETI (Search for Extraterrestrial Intelligence), tapping into volunteers' compute power to make sense of radio waves from space.



 
Thank goodness for the Scientific Method. :roll:
 
LOL
SETI@Home is fun and all, but this is a general problem in OpenCL. There's a suggestion that Navi has a bad FFT implementation.
So as of this moment Navi cards are unfit for almost all production compute systems... and rather pointless for development (even for students).

And this shows up basically a week after W5700 launch.

Fun stuff.
 
Is this project still going on?
Wow
 
Navi gonna find aliens this way.
 
That's a feature to stop non-gaming misuse.
 
LOL
SETI@Home is fun and all, but this is a general problem in OpenCL. There's a suggestion that Navi has a bad FFT implementation.
So as of this moment Navi cards are unfit for almost all production compute systems... and rather pointless for development (even for students).

And this shows up basically a week after W5700 launch.

Fun stuff.

Really?

Fourier Transform is one of the fundamentals for compute work. If AMD indeed screwed up its implementation at the hardware level, they would need a recall.
 
It's a special bug inserted by aliens so we can't find them :roll:
 
as usual amd is late to the party anyhow.

2080TIs already found space invaders.
 
Since there are "many" Navi GPUs crunching the project and cross-validating each other's incorrect results, the large volume of incorrect results is able to beat the platform's algorithm and pass statistical validation, "polluting" the SETI@Home database.

What? Why? That is by far the shittiest validation method I have ever heard of.
 
What? Why? That is by far the shittiest validation method I have ever heard of.
?
That's how science works. If most people on Earth do an experiment incorrectly, the bad result becomes statistically relevant (as in: not an obvious outlier).
There's no way to test this other than performing a different experiment on the same phenomenon.

In fact, that's why we're able to notice these issues in computational science.
There are different libraries that do equivalent math. And there are different CPUs and GPUs that we can compare.

If Navi was doing some computation incorrectly, but no other hardware was used, there would be no way to test for this error.
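
To make the point above concrete, here is a minimal sketch of agreement-based validation. It is not SETI@Home's or BOINC's actual validator: the function name `validate`, the `quorum` parameter, the host names, and the result values are all made up for illustration, and real validators would compare results within a tolerance rather than for exact equality.

```python
from collections import Counter

def validate(results, quorum=2):
    """Accept a value only if at least `quorum` hosts reported it.

    `results` maps a (hypothetical) host name to the value it returned.
    Returns the accepted value, or None if no value reached the quorum.
    """
    if not results:
        return None
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    return value if votes >= quorum else None

# Assumed values for illustration: 42.0 is the correct answer,
# 13.0 is what a faulty card returns.
same_hw = {"navi_a": 13.0, "navi_b": 13.0}
mixed_hw = {"navi_a": 13.0, "nvidia": 42.0}

print(validate(same_hw))   # two faulty hosts agree, so the wrong value is accepted
print(validate(mixed_hw))  # no agreement, so nothing is accepted
```

When both copies of a work unit land on hosts with the same fault, the wrong answer reaches the quorum. As soon as one copy goes to different hardware, the disagreement becomes visible and the work unit can be sent out again.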
 
Low quality post by PanicLake
?
That's how science works. If most people on Earth do an experiment incorrectly, the bad result becomes statistically relevant (as in: not an obvious outlier).
There's no way to test this other than performing a different experiment on the same phenomenon.

In fact, that's why we're able to notice these issues in computational science.
There are different libraries that do equivalent math. And there are different CPUs and GPUs that we can compare.

If Navi was doing some computation incorrectly, but no other hardware was used, there would be no way to test for this error.
hold the phone, something i can agree with you on? nahhh :p

The only way to test correctly is to use other hardware. IIRC they try not to send the validation to a similar system. They won't just discard the data, they'll save it and send it out again. I do agree they should suspend the 5700s for the time being.
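
A rough sketch of the "don't validate on a similar system" idea, purely for illustration. This is not how the SETI@Home scheduler is implemented; the function `pick_validator`, the `family` field, and the host records are assumptions made up for the example.

```python
def pick_validator(first_host, candidates):
    """Return the first candidate whose hardware family differs from
    the host that produced the original result, or None if there is none."""
    for host in candidates:
        if host["family"] != first_host["family"]:
            return host
    return None

# Hypothetical host records.
navi_1 = {"name": "host1", "family": "navi"}
navi_2 = {"name": "host2", "family": "navi"}
vega_1 = {"name": "host3", "family": "vega"}

# The second Navi host is skipped; the Vega host is chosen for validation.
print(pick_validator(navi_1, [navi_2, vega_1])["name"])
```

If only same-family hosts are available the function returns None, which in this sketch would mean holding the work unit back until a different kind of host shows up.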
 
Yep, it is like saying Trump is a "nice person" because many people voted for him.

The way it works is that you collect experimental and validation data within the same experiment. Afterwards, when you have a model, you test it against the validation data and not against the output of another model, which is pretty much what SETI is doing with these results.
 
They don't work at F@H either....
 
No surprise. OpenCL has been losing developer interest for a long time. Small community, few resources, buggy GPU drivers, etc.

This is the case for almost all "Open Standard" computation acceleration frameworks. Not a lot of researchers like to invest their money and human resources into such things, for fear of being ripped off by bigger fish, since everything published will be fair game to use. It is a damn shame though. OpenCL would have been a great alternative to CUDA.
 
No surprise. OpenCL has been losing developer interest for a long time. Small community, few resources, buggy GPU drivers, etc.

This is the case for almost all "Open Standard" computation acceleration frameworks. Not a lot of researchers like to invest their money and human resources into such things, for fear of being ripped off by bigger fish, since everything published will be fair game to use. It is a damn shame though. OpenCL would have been a great alternative to CUDA.

The issue is Navi, not other AMD cards. This problem has nothing to do with the OpenCL driver or anything, only Navi. Hold your breath, man. Read all the comments!!

I run an RX 5700 and have noticed this issue. The task runs to completion and returns blatantly incorrect results. The only times my RX 5700 GPU gets a valid result is when it is validated against another AMD RX 5700 series GPU (both get the wrong result). I've currently stopped my computer from accepting GPU work units (it took me way too long to realize something was wrong, sorry). I believe this is an issue with the Navi architecture and not necessarily solely with AMD's OpenCL driver, as I see older AMD GPUs still returning "correct" results.

Someone has to redo all the work units where the results came from Navi AMD GPUs (RX5700, RX 5700XT, RX 5500M, RX 5500), and ban all AMD Navi GPUs until a fix is found.

Interestingly, my RX5700 has not been causing issues with other projects, like Einstein@home, Milkyway@home, Collatz, etc. Something about Navi and OpenCL really does not like Seti@home.

If any of you need any testing or logs on an AMD RX5700, hit me up.

edit: Corrected OpenGL to OpenCL, thanks Keith Myers
 
Hopefully the Adrenalin Pro drivers for the new Radeon Pro W5700 aren't affected by this, because this would be bad for its launch.

They probably prioritized fixing the random crashes in the drivers first before concentrating on GPGPU stuff.
 
They don't work at F@H either....
Until this is solved, we can safely assume Navi doesn't work in most popular computation scenarios.
Of course this can be fixed in software. Let's hope there will not be any performance penalty, because what would that mean for all the Navi supercomputers ordered? :D
 
Until this is solved, we can safely assume Navi doesn't work in most popular computation scenarios.
Of course this can be fixed in software. Let's hope there will not be any performance penalty, because what would that mean for all the Navi supercomputers ordered? :D
It's working fine with projects like Einstein@home, Milkyway@home, Collatz, etc. I know Seti@home isn't working fine. I'm not sure about F@H. And which supercomputers have ordered Navi?
Vega is AMD's compute card atm. Arcturus is the upcoming compute card, which is more similar to Vega than Navi.
 
LMAO ouch:

Keith Myers from SETI@home forums said:
The new Navi 5700 and 5700XT are useless for compute currently. The drivers are not ready for compute. All projects that rely on AMD OpenCL drivers are producing nothing but garbage results and invalids. The AMD developers and the Khronos group are aware of the problem but not a peep from either of them about what the real problem is or when to expect a fix. In the meantime, I think those cards should be banned until the drivers are fixed for all projects.

and:

Keith Myers from SETI@home forums said:
Phoronix did testing and reviews of the RX 5700XT and could not get the card and drivers to pass the OpenCL parts of their standardized test suite.
 
Could be pretty much due to running maths on a consumer graphics card instead of a pro version. Many of the Vega chips were initially designed as a PRO card but failed certain quality guidelines.
 
It's working fine with projects like Einstein@home, Milkyway@home, Collatz, etc. I know Seti@home isn't working fine. I'm not sure about F@H. And which supercomputers have ordered Navi?
Vega is AMD's compute card atm. Arcturus is the upcoming compute card, which is more similar to Vega than Navi.
Computing is not about funky distributed projects.
This problem was noticed in one of them because gamers already started using Navi (card for scientists/engineers was just announced and isn't used yet).

A GPU doesn't have a "calculate Seti@home" instruction that fails while "calculate Einstein@home" works.
It makes errors in some math instruction that Einstein@home may not use. That's it.

As mentioned earlier: there's a possibility that FFT results are incorrect. FFT (Fast Fourier Transform) is a fundamental algorithm used for many problems. So the card is already almost useless for computing.
And another thing is about being reliable. It's obvious that AMD haven't properly tested this card, so there's really no reason to believe in other results. Everything will have to be tested by the clients... and there goes the "value".
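
For anyone wondering what "testing it yourself" could look like, here is a minimal sketch of cross-checking an FFT against a slow but simple direct DFT. It runs on the CPU only and does not reproduce the Navi problem; `broken_fft` is a deliberately faulty stand-in invented for the example, and the signal, tolerance, and function names are assumptions as well.

```python
import cmath

def naive_dft(x):
    """Direct O(n^2) DFT, used as the trusted reference."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def fft(x):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def broken_fft(x):
    """Hypothetical faulty implementation: one output bin is off by 1%."""
    out = fft(x)
    out[1] = out[1] * 1.01
    return out

def matches_reference(fft_fn, signal, tol=1e-9):
    """Compare an FFT implementation with the direct DFT, bin by bin."""
    got = fft_fn(signal)
    want = naive_dft(signal)
    return all(abs(g - w) <= tol for g, w in zip(got, want))

signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
print(matches_reference(fft, signal))         # correct FFT agrees with the reference
print(matches_reference(broken_fft, signal))  # faulty FFT is caught
```

The reference is independent of the implementation under test, which is the same idea as validating on different hardware: two copies of the same mistake would agree with each other, but not with the direct computation.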
 
Unacceptable, these cards are sold with a feature set as advertised. Failing the standards set forth as advertised is false advertising, and consumers of all types should receive the product they pay for.
 