ATI 7970 Possible hardware Failure - Can it be fixed?

Hey Team TP!

Long-time lurker here. I have used this forum a number of times to resolve issues and get other guidance, and now I am looking for a little help with my own particular issue.

Sorry in advance for the long post below; I just want to include as much info as possible to try and resolve a possible component failure on my HIS 7970 GPU.

So the symptom is that one of my two GPUs (I am trying to run CrossFire) is not displaying, or is failing to communicate with the screen or motherboard properly. This is a new build, and when I installed the OS (Win10) it showed an exclamation mark (!) next to the 2nd GPU; as soon as the driver tried to update, Win10 would BSOD with a watchdog error or something along those lines.

So I broke out my tool kit and fired up ATIFlash, which reads both cards and their VBIOS ROMs fine in DOS. I backed up the good one, downloaded a ROM from here and successfully flashed it, but there is still no display and it will BSOD if I boot into Win10.

If I remove power from the Slot 1 GPU card and try to boot the PC, it hangs at POST with code 2A and the Q-LEDs show a VGA fault. When I connect ROG Connect, my ASUS board shows VGA BIOS POST and hangs there waiting. The motherboard is on the latest BIOS as well, and I have cleared CMOS a number of times. As this is a watercooled setup, I had to drain the loop and move the GPUs around to prove out the motherboard slots.

All slots are OK: the working GPU boots up fine in each of them, and my PSU is 1200W, so I know my hardware is good other than this GPU.

Now the rookie mistake I made was that I bought the cards 2nd hand with EK waterblocks already on them and failed to test them prior to setting up my PC, so now comes the painful task of trying to figure out what is broken on the card.

So my question: ATIFlash can talk to the card and flash it, and the motherboard BIOS reads the card (though only when the Slot 1 card is used for primary display, running at x8 instead of x16). So what is preventing it from POSTing and displaying on its own, given that its BIOS can be flashed and it appears to hang when the PC BIOS polls the graphics card?

I moved it to another PC and still can't boot to a display with the suspect card. I am going through each step to see whether the card is toast or not, so I had to make up some PCI Express auxiliary power connectors for my trusty HP workstation so I could test the card there.



Now that the power is good, I tested the card in DOS; I can see it, and I flashed it with the BIOS from my working identical GPU card.





Rebooting and loading up Windows, it's not happy at all in Device Manager.



Fired up WinATIFlash; it doesn't really get many details from the card at all.



After a few minutes, stupid Windoz tried to update the ATI display driver and the PC locked up. I had to cold boot and remove the card to roll back the display driver to the Nvidia GPU.

I've also tried moving the VBIOS switch to position 2 and I get the same results: it will not display anything using the DVI cable.

Next I will pry off the waterblock and see if I can spot anything that could be causing it to not POST or pass its VGA test so that it is detected properly.

So what is left is baking it in the oven :upset: but I need to check if I have some flux around to do this.

I was hoping there was more I could do with ATIFlash in DOS mode to stress test the card, but it doesn't look like I can run something like MemTest on a GPU to find what's causing the hanging, etc.

Anyone know what I can try while in DOS?

PEACE
Kosti
 
Are the GPUs you are trying to CrossFire identical? If so, I'd try flashing the working BIOS from the working card to the faulty one. It's unlikely to do anything unless someone flashed the wrong BIOS on it in the past (I see this as unlikely), but it's worth a try. Back up the old VBIOS, as always.
 
An Nvidia Quadro card and an AMD? Might just be a driver issue; gotta pull the NV card and remove its drivers to be sure.

But given that that is the 2nd PC, man, it might be a duff card.
 
Why are you flashing a bios if it is a driver issue? :slap:
 
He bought the cards second hand, I think. If so, one could have the wrong BIOS on it: these were incredibly popular cards for bitcoin miners, who were known to flash highly tweaked mining BIOSes that are otherwise useless for gaming.

Either way, this really sounds like a hardware issue to me.
 
I get that he has two of the same and one isn't recognized, even after flashing, but trying to boot from it in another system with an Nvidia card, without pulling that card, is a bit silly; it causes driver conflicts.
I would try swapping the working card into the secondary slot and the other into the primary, after trying it in the primary by itself first in the original board.
 
I second this. Good idea. I forgot how Nvidia and AMD like to play mean with each other's drivers.
 
My question is, is the card actually hooked into a water loop? Or are you trying to cool it with a fan blowing on the water block and no coolant running through it? What are both cards?
 
Try and wipe the BIOS and then flash the new one on.
 
Hi All

Thanks for the comments and posts. I realise there is a lot to read and it is hard to take in the whole sequence of events, so let me summarize:

  • Cards were bought 2nd hand and are identical, with the same waterblocks. YES, I forgot to test them when I got them, and they sat around for a long time until just now
  • Only one card seems to be working fine
  • ATIFlash can talk to BOTH GPUs; I assume it only talks to the VBIOS chip (EPROM)
  • The build was made for a water loop, which I drained to split the cards in order to test and move them around in the original PC. Result: 1 GPU does not POST
  • The card was flashed with a downloaded ROM based on the card as identified by GPU-Z; I then grabbed the BIOS from the working card and used that to flash the VBIOS as well. No change: the card does not POST
  • Removed it completely from my new watercooled system and tested it in the HP workstation, which has an Nvidia card
  • The working GPU works fine in the HP workstation: it boots and POSTs with no display or driver problems. Yes, there is no water in the blocks; I only allowed it to boot and POST with fans on the waterblock, for no more than 1 min
  • The suspect GPU does NOT work in either system, even though it is partially detected in BIOS and ATIFlash can see its VBIOS
  • I believe the issue is not BIOS related; it's a component hardware issue preventing the card from finishing its own VBIOS POST check
So the points above should answer all the comments and clear up the confusion.

So I removed the waterblock and maybe I found the cause....



No bent pins in display ports




However, a closer look at the chokes shows signs of breakdown on the inductors.




Perhaps these had serious coil whine and maybe this is part of the problem. I am not sure if the cards were used for mining (maybe), but I am trying to find out how to fix it. I suspect the card is partially working since ATIFlash can access the VBIOS. One interesting note: with the waterblock completely removed I attempted to boot, but it refuses, so the WB must provide grounding to the shroud around the GPU core, hence no boot. I was hoping it was maybe a simple short cct, but now, after seeing the inductor, I suspect this is part of the problem.

I tested the coil and it's not an open cct so that's a good sign, but what else is causing it to not complete its POST?
 
Personally, the mining idea was a long shot; I just don't know what else to test. Very few miners used watercooling (and I'm a former miner, so I should know).
 
If I can find another one of those coils (Coiltronics ferrite core low profile surface mount inductors, 1007R3-R15) then maybe I can try replacing it and see if it boots. If anyone knows where I could source one, that would be great, or if anyone has a stuffed 7970 GPU that I could steal parts off...

Alternatively, I may need to try baking it in the oven and see if that helps, but I doubt it: since the VBIOS can be written to and 2 systems partially detect it, it's definitely related to something not passing in the VBIOS POST instructions.

Are there any programs out there at a DOS level that can perform some basic testing of a GPU? ATIFlash doesn't seem to have such options.

Cheers
Kostiz
 
Those Sharpie marks on the caps indicate someone was troubleshooting it on a bench; they went through and tested each cap (how, I don't know, as you can only test caps out of a board). As Caring 1 said, put the bad card in the first PCI-e slot (water cooling, I know) and the known good card in the second position. You'll need to do a blind flash; I have heard of people saving cards that way but have not done it myself. I think someone said that if it's a bad flash then you are SOL (Shit Out of Luck). If you're not comfortable doing it, or not experienced, then you need someone who is and has done it already.

"I tested the coil and it's not an open cct so that's a good sign, but what else is causing it to not complete its POST?"

Why are you wanting to replace the coil? If they were whining, it's a sign it was working....

"maybe but I am trying to find out how to fix it???" Sell it on fleabay "AS IS" and be done with it!! How do you know it's not a 7950?
 
bake it, you have nothing to lose at this point
385F for 10 min
inductors look fine, they are just a little dirty
 
The reason the inductor is a suspect: if you look at the coil, there's what appears to be a stain/shadow mark. It actually has cracks/chips in it, which doesn't show very well in the picture, and because of this discolouring/chipping it looks damaged, hence why I suspect it's part of the problem. This component is part of the VRM power supply filtering for the GPU, so it makes sense that if it's broken it could stop the GPU from POSTing/booting/displaying.

@jagger - those Sharpie marks are factory QA checks. Blind flashing is an option, but at this stage it's not warranted as I do not believe it's a corrupt BIOS issue. I said I suspected the card must have whined, but I've never heard it because the card has never worked in my possession.
 
if it was damaged it would not work at all; they either work or they don't (the little chip on the plastic casing means jackshit, I've seen them crushed open and they still work fine)
coil whine means nothing
odds are you flexed the board a little too much taking the waterblock off and broke a solder joint
bake it and it should be good to go
 
Reflowing it assumes it's a dead GPU, which it is NOT, so you'll probably kill it for sure then. That chip is fine, but you know it all. So fix the damn thing if you know those are QA checks; I've never seen Sharpie marks on a video card that was working. But what do I know, you're teaching me so much. I'ma sit here n learn :rolleyes:
 
steps for proper gpu-bakage
pre-heat an oven to 385F
scrub the card down with rubbing alcohol (and yes, I mean get an old toothbrush and CLEAN IT)
make yourself 6 tin-foil balls about the size of a gumball
set the card on a cookie sheet or pizza pan (make sure it's not all warped to shit or you will have a bad day) and place the balls under the card so it's elevated off the pan (gpu core up please)
put the card in the oven and bake for 8 to 10 min (I usually turn the temp up to 400 for the last 30 seconds or so). DO NOT WALK AWAY FROM IT!
turn the oven OFF, open the door and allow the card to partially cool before moving it
re-assemble
profit
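
For anyone whose oven is marked in Celsius, the temperatures in the steps above can be converted with a quick Python sketch. The function name is made up for illustration, and the conversion says nothing about whether these temperatures are safe for a given card.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Oven settings quoted in the baking steps: 385F for the bake, 400F at the end.
for temp_f in (385, 400):
    temp_c = fahrenheit_to_celsius(temp_f)
    print(f"{temp_f}F is about {temp_c:.0f}C")
```

So 385F works out to roughly 196C and 400F to roughly 204C.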
 
Hey Jagger

No need for that, I'm asking for help, not sarcasm - I get plenty of that from my wife... However, a simple Google search will show you that some of these cards out there are marked with Sharpies, so unless they have all been repaired by the same guy and sold all over the world..... like you said - u sit and learn :rolleyes:

EDIT - I found a brand new card taken apart here with Sharpie marks.. Mmmm

http://www.techpowerup.com/forums/threads/reference-radeon-r9-290x-taken-apart.192491/

@onemoar - great, thanks for the baking steps. I will try a few more things first, such as measuring voltages at some test points to see if there are any differences.

The difficulty is that with the cards out of the water loop, hooking them up without watercooling could cause further problems. If I can source original HSF shrouds to convert them back to air cooling (I bought these without the original HSF), then maybe I can rig up a test jig to isolate them for testing.
 
I 100% guarantee you that there is nothing wrong with those regs or chokes; when they go, they go BOOM!
As for cooling, just put the block back on, put some water in the block and power it up. You aren't gonna be running it that long. If it gets picked up in the device manager then you know it's ok and you can reconnect the loop to it.


You would have smoke and burnt PCB everywhere if a failed reg was causing your issue. You either have a bad solder joint or the core is simply toast. It showing up in the device manager as a basic display adapter means it's in `limp` mode; usually that means the core didn't pass its POST checks.
You gotta realize there is 10 to 15A of inrush current running through those VRMs when the card posts; if something was bad it would result in smoke and burned PCB,
and the load current is greater still, at 80 to 100A under full load.
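
To put those currents in perspective, here is a rough power estimate using P = V x I. This is only a sketch: the 1.2 V core voltage is an assumption picked for illustration, not a value measured on this card, and the names are made up.

```python
# Assumed GPU core voltage in volts. Illustrative only, not measured on this card.
ASSUMED_CORE_VOLTAGE = 1.2


def vrm_output_power(current_a: float, voltage_v: float = ASSUMED_CORE_VOLTAGE) -> float:
    """Power delivered at the VRM output in watts, using P = V * I."""
    return voltage_v * current_a


# Current figures quoted in the post above: inrush at POST and full load.
quoted_currents = (
    ("inrush, low end", 10),
    ("inrush, high end", 15),
    ("full load, low end", 80),
    ("full load, high end", 100),
)

for label, amps in quoted_currents:
    print(f"{label}: {amps} A -> {vrm_output_power(amps):.0f} W")
```

Under that assumed voltage, the quoted load currents correspond to something on the order of 100 W through the core VRM, which is why a genuinely failed regulator or choke tends to leave visible damage.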
 
"...one interesting note was with the waterblock completely removed I attempted to boot, but it refuses, therefore the WB must provide grounding to the shroud of the GPU's core, hence no boot. I was hoping it maybe was a simple short cct"
Maybe because it is. I could have sworn when I read this thread before there was a part where you explained finding that ghastly TIM overdose, and that you'd cleaned it up. But going back over it now, I'm not seeing where you said that. So, correct me if I'm wrong, but you didn't, did you? And you've eliminated the possibility of it being the cause of the issue how? It's so glaringly obvious to me that it is. The simple fact that removing the block made a difference is screaming this to you. You need to clean all that shit off THOROUGHLY. And I mean triple super clean. Then get in there with a sewing needle and get the rest of it. Then redo the TIM. Put the block back on and watch the card "magically" come back to life. That is, if there's any hope left at all. Which I bet there is.

Good luck!

Rookies....
 
I had a 7970 with the same leak and it works like a charm.
 
I cleaned up the TIM and added the card back into the loop. No change: the card is not completely recognised and it will not POST when set as primary.

I agree with onemoar that this is in some sort of LIMP mode, but I cannot find the cause... YET! I even tried adding a different waterblock just in case this one was causing issues like a short cct, but same result.

ATIFlash can read it in DOS, but when it is set as the primary card to boot (PEG) there is no display. The interesting part is that when it is in the HP test system and I set this 7970 as primary, as soon as it seems to go through its initial mainboard BIOS checks all fans go full speed, so this is another indication that the GPU is not happy and is triggering this LIMP mode to indicate that something isn't right.

Time to bake it
 