
470-780 VBIOS mod utility: fix old nvidia artifacts/code 43 by disabling bad parts


Theory
Do you remember, or still own, an old NVIDIA GTX 470-780 GPU that shows artifacts or fails to load the driver with "code 43"?
Many such errors are caused by damaged GPU↔VRAM connectivity.
For the Fermi and Kepler generations, in many of these cases the connection is broken on the GPU chip side.

I’ve created a VBIOS modification utility that disables the problematic memory channel.
This makes it possible to quickly fix some old NVIDIA cards: Fermi GTX 470 and newer, all Kepler GTX 6xx-7xx, and the GM107-based GTX 750 Ti.

It is ~10 years late)) but I hope that resurrected 770s/780s can still be useful. The GTX Titan 6GB is also supported.
Quadro/Kepler/Grid/Mobile cards of the same generations were not tested, but in theory, if they can be accessed by nvflash, they can be fixed too.

"Old NVIDIA artifacts" utility


Usually everything works out in about 15 minutes, without special knowledge:
① Run the utility for the first time to flash the first test VBIOS, then reboot.
② ③ ④ ⑤ Run the utility again and tell it whether the GPU works fine with the current VBIOS. It flashes another VBIOS variant and reboots again, and so on.
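
The flash/reboot/report cycle above can be pictured as a simple trial loop. This is only a sketch of the idea, not the utility's actual logic (its source is not public). The function name, the linear A-to-F order and the simulated fault on channel "D" are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def find_bad_channel(
    channels: Sequence[str],
    gpu_ok_without: Callable[[str], bool],
) -> Optional[str]:
    """Return the first channel whose removal makes the GPU work, else None.

    gpu_ok_without(ch) stands for one manual cycle: flash the variant that
    disables `ch`, reboot, and report whether the artifacts are gone.
    """
    for ch in channels:
        if gpu_ok_without(ch):
            return ch
    return None


# Simulated 384-bit card: six 64-bit channels, hypothetical fault on "D".
channels = ["A", "B", "C", "D", "E", "F"]
attempts = []


def simulated_check(ch: str) -> bool:
    attempts.append(ch)  # each call corresponds to one flash + reboot
    return ch == "D"


bad = find_bad_channel(channels, simulated_check)
print("disabled channel:", bad, "after", len(attempts), "reboots")
```

On a real card the check is you looking at the screen after each reboot, so the number of reboots depends on where the fault happens to be.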

Some GPUs have non-obvious quirks that are described in the user guide. Or you can just download the .zip with the utility.

Testing in the local community showed that many GPUs get fixed and some do not. There is also a ~5% chance of getting a black screen after the first reboot, and such a card can be hard to flash back (though there is not much to lose with an artifacting card). So, use the tool at your own risk.

If the modification succeeds after several reboots, a pair of GPU memory chips is disabled and the bus width is reduced by 64 bits. Fixed GPUs can be used in any computer.



Performance tests

For GPUs with a 128-bit bus the performance drop is significant, but for cards with a wide memory bus the difference is quite small.


The standard 3GB 384-bit 780 Ti GHz Edition achieves on average a Graphics score of 3700 in the TimeSpy benchmark. The fixed 780 Ti with a 320-bit bus and 2.5GB left gives a Graphics score of 3400, and an SLI pair of 2×780 Ti 2.5GB gives 6100.
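
The bus and memory numbers follow from simple proportions. Here is a minimal sketch, assuming VRAM is spread evenly over 64-bit channels (which matches the 3GB to 2.5GB example above). The function name and the 128-bit 2GB example card are illustrative, not taken from the utility.

```python
CHANNEL_BITS = 64  # one disabled channel removes 64 bits of bus width


def after_disabling_one_channel(bus_bits: int, vram_gb: float) -> tuple[int, float]:
    """Return (new bus width, new VRAM size) after one channel is disabled."""
    if bus_bits % CHANNEL_BITS != 0 or bus_bits <= CHANNEL_BITS:
        raise ValueError("bus width must be a multiple of 64 bits and wider than 64")
    new_bits = bus_bits - CHANNEL_BITS
    # Assumption: memory is distributed evenly across channels.
    return new_bits, vram_gb * new_bits / bus_bits


for label, bits, gb in [("780 Ti 3GB", 384, 3.0), ("128-bit 2GB card", 128, 2.0)]:
    new_bits, new_gb = after_disabling_one_channel(bits, gb)
    print(f"{label}: {bits}-bit/{gb}GB -> {new_bits}-bit/{new_gb}GB")

# TimeSpy Graphics scores quoted in the post.
stock, fixed, fixed_sli = 3700, 3400, 6100
drop = 1 - fixed / stock         # relative loss of the fixed card
sli_scaling = fixed_sli / fixed  # gain from the second fixed card
print(f"drop: {drop:.1%}, SLI scaling: {sli_scaling:.2f}x")
```

A 384-bit card loses one sixth of its bus and memory, while a 128-bit card loses half, which is why the narrow-bus cards suffer much more.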


The utility is small and free, but the source code is not public for now.
If you are just interested in what is modified in the VBIOS, it is only several bytes; you can see them in the
.zip with VBIOS variants for disabling different channels on Titan Kepler 6GB
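
If you want to see those several bytes yourself, a plain offset-by-offset comparison of two ROM images is enough. This is a minimal sketch with toy data; the byte values below are made up and are not real VBIOS content.

```python
def diff_bytes(original: bytes, modified: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, old_byte, new_byte) for every position that differs."""
    if len(original) != len(modified):
        raise ValueError("images differ in length; a plain offset diff does not apply")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(original, modified))
        if old != new
    ]


# Toy stand-ins for two ROM images (hypothetical contents).
original = bytes([0x55, 0xAA, 0x10, 0x20, 0x30, 0x40])
modified = bytes([0x55, 0xAA, 0x10, 0x21, 0x30, 0x4F])

for offset, old, new in diff_bytes(original, modified):
    print(f"0x{offset:06X}: {old:02X} -> {new:02X}")
```

For real files you would load both images with `pathlib.Path(...).read_bytes()` and pass them in; the file names are whatever you saved the original and the generated variant as.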
 
I saw this a few weeks ago. The only problem is that it doesn't support the GTX 980 Ti. I just wish the author could edit the file to support it, as I would try this again just to test the GTX 980 Ti for core issues while the E1 and F0 banks are temporarily disabled.
 
Good thing it says Fermi and Kepler in the title, eh?
 
I know it states that, but it falls short on other GTX series... It would be nice if it supported Maxwell.
 
About the 980 Ti: unfortunately, Maxwell2 GPUs are not supported. The "devinit script" that the utility edits is present in the same form in all cards from Maxwell1 to Turing.

However, only the Maxwell1 GTX 750 Ti boots fine after modding. When such an edited VBIOS is flashed onto Maxwell2 cards, they just show a black screen.
It seems that starting with Maxwell2 the "devinit script" part of the VBIOS is somehow verified.

If you are ready for a "flash the original VBIOS back" quest, you can even perform the experiment yourself.
Save your 980 Ti VBIOS, run the utility and use the "Open original VBIOS file" button to generate VBIOS variants.

It will create several VBIOS files in a subfolder (similar to this example .zip for GTX Titan 6GB). Then use the nvflash.exe from the details subdirectory to flash the DisableE variant; that nvflash.exe is able to flash a modified VBIOS on Maxwell2 GPUs.

Then reboot. I expect that the 980 Ti will not show any picture after such modding. And note: depending on the motherboard, flashing the original VBIOS back may be a quest of its own. The motherboard must be able to force booting from the built-in iGPU, with the monitor plugged into it, even while the broken 980 Ti is attached. Some advice and typical motherboard settings are described in the troubleshooting section of the utility guide.


Unrelated, but does OP know a way to unlock "disabled" CUDA Cores like Dell/Alienware this
The number of CUDA cores physically present on a die is identical for all chips of a given family (like GA104). However, for several reasons (including non-working cores) some of them are disabled, which is done by disabling a whole TPC. The disabling is done in two steps:

  1. Some TPCs are disabled at the factory, at the same moment the chip DeviceId is assigned. So, as far as I understand, the factory-disabled cores should be identical for all GPUs with the same first half of the device ID (10DE 24DD for the mentioned Alienware M15 R5). Also, each device ID typically corresponds to a physically visible GPU marking, like GA104-770-A1 for 10DE 24DD. This marking is effectively a part number that implies an identical DeviceId and an identical count of factory-disabled TPCs.
    Although the count of disabled TPCs is identical and each TPC has an identical number of CUDA cores, for some GPUs the performance may differ slightly depending on which TPCs are disabled! I verified that on the Pascal family of GPUs, but I think it may be the case for later GPUs too. The difference arises because, with the same total number of TPCs, the enabled ones may be distributed differently between GPCs: sometimes all GPCs have a nearly identical number of TPCs enabled, and sometimes some GPCs have all TPCs enabled while others have only a single active TPC. The performance difference is not critical, but a 2-4% gap was easy to notice when comparing the crypto-mining performance of identical P102-100 GPUs running at identical fixed clocks. I suppose it may be caused by overloading/underloading of some internal GPU buses.
  2. Some more TPCs (or even a whole GPC) can be disabled at the VBIOS level (for a single boot). This was the case for the "94.04.42.00.97" VBIOS; the subvendor ID, which is the second half of the device ID (1028 0A97), is set at the VBIOS level too.
    As far as I know, VBIOS-level disabling is quite rare for NVIDIA, and only such cores can be re-enabled.


I have had zero practice with laptops, so the text below is just theory that I have not verified myself.



Unfortunately I don't know a way to edit an existing 30xx series VBIOS and get it accepted by the GPU. But generally, flashing a VBIOS with the same device ID (10DE 24DD) should be possible (with a modded nvflash or a hardware programmer), and, say, 20% to 50% of laptop VBIOSes with the same IDs are compatible on average. The other 50-80% are not, so before experimenting with a laptop GPU you should be ready to flash the saved original VBIOS back with a hardware programmer. There may also be other laptop-specific issues I'm not familiar with, so as always, flashing non-factory VBIOSes is at your own risk.



So, if you see that one laptop with a 10DE 24DD XXXX XXXX GPU has fewer CUDA cores than another laptop with a 10DE 24DD XXXX XXXX GPU (XXXX may differ), you may try to flash another VBIOS onto the laptop with fewer CUDA cores. The M15 R5 is exactly that case: with the "94.04.42.00.97" VBIOS it showed 4608 CUDA cores https://www.techpowerup.com/img/YOLmOeQrWxb2A38y.jpg while the GA104 description on TechPowerUp says that GA104-770-A1 should have 5120 cores.
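
The gap between 5120 and 4608 cores can be converted into a TPC count. This is a rough sketch, assuming the usual Ampere GA10x layout of 128 CUDA cores per SM and 2 SMs per TPC. Those two constants are not stated in this thread, so treat them as assumptions.

```python
CORES_PER_SM = 128  # assumption: Ampere GA10x consumer layout
SMS_PER_TPC = 2     # assumption: Ampere GA10x consumer layout
CORES_PER_TPC = CORES_PER_SM * SMS_PER_TPC


def missing_tpcs(expected_cores: int, reported_cores: int) -> int:
    """Return how many whole TPCs account for the missing CUDA cores."""
    missing = expected_cores - reported_cores
    if missing < 0 or missing % CORES_PER_TPC != 0:
        raise ValueError("core gap is not a whole number of TPCs")
    return missing // CORES_PER_TPC


# Numbers from the post: GA104-770-A1 spec vs. the "94.04.42.00.97" VBIOS.
tpcs = missing_tpcs(5120, 4608)
print(f"{tpcs} TPCs ({tpcs * SMS_PER_TPC} SMs) appear to be disabled at VBIOS level")
```

Under these assumptions the gap is a whole number of TPCs, which fits the explanation that disabling happens per TPC.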


All available VBIOS variants for a given DeviceID can be found on TechPowerUp via manually created URLs: (this is a really useful feature, great thanks to TechPowerUp!)

Unfortunately, the VBIOS page doesn't provide information that would allow estimating the effective CUDA core count or the actual compatibility. So some researchers just flash them one by one to determine the actual behavior.
 
Thank you for making this program. Could you please explain the difference between the DisableA and DisableGpuA ROM files generated by Expert mode? Which of the two should be flashed to disable the A channel?
 
The DisableX variant disables only memory channel X. Most of the time that is what helps.

The DisableGpuX variant additionally disables a "GPC" (graphics processing cluster, which consists of shader units). The GPC is not directly related to the memory channel; its index simply corresponds to the letter X:
  • DisableGpuA - disables mem channel A and GPC 0
  • DisableGpuB - disables mem channel B and GPC 1
  • etc...
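
The naming scheme in the list above can be written down as a small decoder. This is a sketch only: the function is hypothetical, and it assumes the "etc..." continues the same pattern (letter C means GPC 2, and so on).

```python
from typing import Optional


def describe_variant(name: str) -> dict[str, Optional[object]]:
    """Decode a variant name such as 'DisableA' or 'DisableGpuB'."""
    # Check the longer prefix first so 'DisableGpuA' is not read as 'Disable' + 'GpuA'.
    for prefix, with_gpc in (("DisableGpu", True), ("Disable", False)):
        if name.startswith(prefix):
            letter = name[len(prefix):]
            break
    else:
        raise ValueError(f"not a variant name: {name!r}")
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError(f"expected a single channel letter in {name!r}")
    index = ord(letter) - ord("A")  # A -> 0, B -> 1, ...
    return {"channel": letter, "gpc": index if with_gpc else None}


for variant in ("DisableA", "DisableGpuA", "DisableGpuB"):
    print(variant, describe_variant(variant))
```

So DisableA and DisableGpuA both turn off channel A; the second one also turns off GPC 0.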

A broken GPC is much rarer than a broken memory channel, so support for disabling GPCs was not planned initially and is released as an incomplete/unsupported/last-resort feature.

However, one YouTube user ran into exactly that situation with a GTX 560: a broken GPC. Disabling the memory channel didn't fix that GPU, but additionally disabling a GPC did.
The artifacts were still present with the 192-bit memory bus and the full 336 shaders, and disappeared after a GPC was disabled as the next step suggested by the utility.

That GTX 560 has only 2 GPCs, and most of the shader blocks were located on the disabled GPC, so only 144 shaders are left. On GPUs with more GPCs the effect on shader count is less catastrophic.
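
The 336 to 144 drop can be broken down per shader block. This is a rough sketch, assuming 48 shaders per block (SM) as on GF114-class chips; that figure is not given in this thread and is an assumption here.

```python
SHADERS_PER_SM = 48  # assumption: GF114-class Fermi layout


def sm_split(total_shaders: int, remaining_shaders: int) -> tuple[int, int, int]:
    """Return (total SMs, remaining SMs, SMs lost with the disabled GPC)."""
    if total_shaders % SHADERS_PER_SM or remaining_shaders % SHADERS_PER_SM:
        raise ValueError("shader counts must be multiples of the SM size")
    total = total_shaders // SHADERS_PER_SM
    remaining = remaining_shaders // SHADERS_PER_SM
    return total, remaining, total - remaining


total, remaining, lost = sm_split(336, 144)
share_left = 144 / 336
print(f"{total} SMs -> {remaining} SMs, {lost} lost, {share_left:.0%} of shaders left")
```

With only two GPCs, losing the larger one removes more than half of the shader blocks, which matches the "most of the shader blocks" remark.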
 
Crippling the gpu...
 
How come people have GPUs with a broken memory channel?
 
Video cards are not supposed to work for decades. Their very hot and intensely used components eventually get damaged, not to mention that manufacturers love to spend as little as possible, so you get the worst possible components, which fail even faster. This leads to one or more CUs, memory lanes, PCI-E lanes, etc. becoming inaccessible and needing to be disabled.

GTX 700s and older cards are now 10+ years old. They are more than just archaic. No surprise some of them are not working properly.
 
A policy of deliberately planning or designing a product with a finite lifespan, so it will become obsolete or non-functional after a certain period. Planned obsolescence is often used to tempt the customer to purchase again. Cars, computers, and software are good examples of products with built-in obsolescence.

The above is the reality of modern standards. This is why everything from the last 10 to 20 years gets left behind and is not supported by repair services: it is not cost effective by any means. Only a hobbyist with the time to repair it themselves will do it, for a small fee that depends on how long the fix takes. These people can be found on Discord, depending on location. Anything below the GTX 900 series is only a waste of time, to be honest.
 
Interesting, my gtx 460 still works fine.
 
It saved my EVGA 780 Classified, I am very grateful, thank you very much!
 
Unrelated, but does OP know a way to unlock "disabled" CUDA Cores like Dell/Alienware this
As already said, and although this is another decade out of date: up until the GeForce 6 series, turning off parts was done pretty much at the software level, so you could "unlock" them with RivaTuner. And in some cases you could just flash one card to another (HD 6850 to 6870, anyone?).
Nowadays that is done deeper, so in most cases it isn't possible.

Oh, and my old 660 Ti is also still running.
 
Thanks man, this just saved my 770 from going in the bin. I have a question: can I use a newer driver, or is only 390/391 usable?
 
A standard GTX 770 and one with a modded VBIOS have the same requirements: you can use any driver up to the 470 branch. The 495 branch and newer are not compatible with Kepler GPUs.
 