470-780 VBIOS mod utility: fix old nvidia artifacts/code 43 by disabling bad parts

StViolenceDay · Mar 17, 2023

Theory
Do you own (or remember) an old NVIDIA GTX 470-780 GPU that shows artifacts or fails to load the driver with "code 43"?
A lot of such errors are caused by damaged GPU↔VRAM connectivity.
For the Fermi and Kepler generations, many of these cases are caused by a connectivity break on the GPU chip side.

I’ve created a VBIOS modification utility that disables a problematic memory channel.
This makes it possible to quickly fix some old NVIDIA cards: Fermi GTX 470 and up, all Kepler GTX 6xx-7xx, and GM107 (750 Ti).

It is ~10 years late)) but I hope that resurrected 770s/780s can still be useful. The GTX Titan 6GB is also supported.
Quadro/Kepler/Grid/Mobile cards of the same generations were not tested, but in theory, if they can be accessed by nvflash, they can be fixed too.

"Old NVIDIA artifacts" utility

Usually it takes about 15 minutes and no special knowledge:

① Run the utility for the first time to flash the first test VBIOS, then reboot.

② ③ ④ ⑤ Run the utility again and tell it whether the GPU works fine with the current VBIOS. It flashes another VBIOS variant and reboots again, and so on.
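The flash, reboot, report cycle above can be pictured as a simple trial loop. The sketch below is only a simulation under assumptions: the utility's real logic and the order in which it tries variants are not public, the function name is hypothetical, and `works_after_reboot` stands in for the user's answer after each reboot. The variant names follow the DisableX naming that appears later in the thread.

```python
from typing import Callable, Iterable, Optional


def find_working_variant(
    variants: Iterable[str],
    works_after_reboot: Callable[[str], bool],
) -> Optional[str]:
    """Try VBIOS variants one at a time; return the first one reported as working."""
    for variant in variants:
        # In real life: flash `variant`, reboot, then check for artifacts / code 43.
        if works_after_reboot(variant):
            return variant
    return None  # no variant helped


# Simulated 384-bit card (six channels, A-F) whose channel C is the damaged one.
variants = [f"Disable{ch}" for ch in "ABCDEF"]
print(find_working_variant(variants, lambda v: v == "DisableC"))
```

The point of the sketch is only that each reboot answers one yes/no question, which is why a fix typically takes several reboots.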

Some GPUs have non-obvious quirks, which are described in the user guide. Or you can just download the .zip with the utility.

Testing in the local community showed that many GPUs get fixed and some do not, but there is a ~5% chance of getting a black screen after the first reboot, and that can be hard to flash back (though there is not much to lose with an artifacting card). So use the tool at your own risk.

If the modification succeeds after several reboots, a pair of GPU memory chips is disabled and the bus width is reduced by 64 bits. Fixed GPUs can be used in any computer.
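As a rough illustration of what losing one 64-bit channel means, here is a small arithmetic sketch. It assumes VRAM is spread evenly across all channels (not true for every card), and the function name and example inputs are chosen for illustration only.

```python
from __future__ import annotations

CHANNEL_BITS = 64  # width of one memory channel, as stated in the post


def after_disabling_one_channel(bus_bits: int, vram_mb: int) -> tuple[int, int]:
    """Return (bus width in bits, VRAM in MB) left after one channel is disabled.

    Assumption: VRAM is distributed evenly across the channels.
    """
    if bus_bits % CHANNEL_BITS or bus_bits <= CHANNEL_BITS:
        raise ValueError("bus width must be a multiple of 64 and wider than one channel")
    channels = bus_bits // CHANNEL_BITS
    return bus_bits - CHANNEL_BITS, vram_mb * (channels - 1) // channels


for name, bus, vram in [("384-bit 3GB", 384, 3072),
                        ("384-bit 6GB", 384, 6144),
                        ("128-bit 2GB", 128, 2048)]:
    print(name, "->", after_disabling_one_channel(bus, vram))
```

A 384-bit card keeps five of its six channels, while a 128-bit card loses half of its bus and memory, which is why narrow-bus cards are hit much harder.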

Performance tests
For GPUs with a 128-bit bus the performance drop is significant, but for cards with a wide memory bus the difference is quite small.

A standard 3GB 384-bit 780 Ti GHz Edition achieves a 3700 Graphics score on average in the TimeSpy benchmark. A fixed 780 Ti with a 320-bit bus and 2.5GB left gives a 3400 Graphics score, and an SLI pair of 2×780 Ti 2.5GB gives a 6100 Graphics score in TimeSpy.
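To put those scores in proportion, the ratios can be worked out directly. The sketch uses only the numbers quoted above; the variable names are mine.

```python
stock_score = 3700  # 780 Ti, 384-bit, 3GB
fixed_score = 3400  # 780 Ti, 320-bit, 2.5GB
sli_score = 6100    # 2x fixed 780 Ti in SLI

bandwidth_ratio = 320 / 384              # share of the bus width that is left
score_ratio = fixed_score / stock_score  # share of the Graphics score that is left
sli_scaling = sli_score / fixed_score    # gain from adding the second card

print(f"bus width kept: {bandwidth_ratio:.1%}")
print(f"score kept:     {score_ratio:.1%}")
print(f"SLI scaling:    {sli_scaling:.2f}x")
```

So the card keeps roughly 92% of its score with roughly 83% of its bus width, and the second card in SLI adds about 79%.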

The utility is small and free, but the source code is not public for now.
If you are just curious what is modified in the VBIOS: it is only several bytes, and you can see them in the .zip with VBIOS variants for disabling different channels on Titan Kepler 6GB.

Edwired · Apr 4, 2023

I saw this a few weeks ago. The only problem is that it doesn't support the GTX 980 Ti. I just wish the author could edit the file to support it, as I would try this again just to test the GTX 980 Ti for core issues while the E1 and F0 banks are temporarily disabled.

Toothless · Apr 4, 2023

Edwired said:
I saw this a few weeks ago. The only problem is that it doesn't support the GTX 980 Ti. I just wish the author could edit the file to support it, as I would try this again just to test the GTX 980 Ti for core issues while the E1 and F0 banks are temporarily disabled.

Good thing it says Fermi and Kepler in the title, eh?

Edwired · Apr 4, 2023

Toothless said:
Good thing it says Fermi and Kepler in the title, eh?

I know it states that, but it falls short on other GTX series... It would be nice if it had support for Maxwell.

2wick_10029 · Apr 4, 2023

Unrelated, but does OP know a way to unlock "disabled" CUDA cores, like on this Dell/Alienware?

https://www.reddit.com/r/Alienware/comments/nxigoa

StViolenceDay · Apr 4, 2023

Edwired said:
I saw this a few weeks ago. The only problem is that it doesn't support the GTX 980 Ti. I just wish the author could edit the file to support it, as I would try this again just to test the GTX 980 Ti for core issues while the E1 and F0 banks are temporarily disabled.

About the 980 Ti: unfortunately, Maxwell2 GPUs are not supported. The "devinit script" that the utility edits is present in the same form in all cards from Maxwell1 to Turing.

However, only the GTX 750 Ti (Maxwell1) boots fine after modding. When such an edited VBIOS is flashed onto Maxwell2 cards, they just show a black screen.
It seems that starting with Maxwell2 the "devinit script" part of the VBIOS is somehow verified.

If you are ready for a "flash the original VBIOS back" quest, you can even run this experiment yourself.
Save your 980 Ti VBIOS, run the utility, and use the "Open original VBIOS file" button to generate VBIOS variants.

It will create several VBIOS files in a subfolder (similar to this example .zip for GTX Titan 6GB). Then use the nvflash.exe from the details subdirectory to flash the DisableE variant; that nvflash.exe is capable of flashing a modified VBIOS onto Maxwell2 GPUs.

Then reboot. I expect that the 980 Ti will not show any picture after such modding. And note: depending on the motherboard, flashing the original VBIOS back may be a quest of its own, because the motherboard must be able to force booting from the built-in iGPU, with the monitor plugged into it, even while the buggy 980 Ti is installed. Some advice and typical motherboard settings are described in the troubleshooting section of the utility guide.

vietanh2901 said:
Unrelated, but does OP know a way to unlock "disabled" CUDA cores, like on this Dell/Alienware?

The number of CUDA cores physically present on a die is identical for all chips of a given family (such as GA104). However, for several reasons (including non-working cores) some of them are disabled, and this is done by disabling a whole TPC. The disabling is done in two steps:

First, some TPCs are disabled at the factory, at the same moment the chip's device ID is assigned. So, as far as I understand, the cores disabled at the factory should be identical for all GPUs that share the same first half of the device ID (10DE 24DD for the mentioned Alienware M15 R5). Also, each device ID typically corresponds to a physically visible GPU marking, like GA104-770-A1 for 10DE 24DD. This marking is effectively a part number, with an identical device ID and an identical count of factory-disabled TPCs.
Although the count of disabled TPCs is identical and each TPC has an identical number of CUDA cores, for some GPUs the performance may differ slightly depending on which TPCs are disabled! I verified that on the Pascal family of GPUs, but I think it may be the case for later GPUs too. The difference arises because, with an identical total number of TPCs, the enabled ones may be distributed differently between GPCs: sometimes the GPCs have nearly identical numbers of TPCs enabled, and sometimes some GPCs have all TPCs enabled while others have only a single TPC active. The performance difference is not critical, but a 2-4% gap was easy to notice when comparing the crypto-mining performance of identical P102-100 GPUs running at identical fixed clocks. I suppose it may be caused by overloading/underloading of some internal GPU buses.
Second, some more TPCs (or even a whole GPC) can be disabled at the VBIOS level (for a single boot). This was the case for the "94.04.42.00.97" VBIOS; the subvendor ID, which is the second half of the device ID (1028 0A97), is set at the VBIOS level too.
As far as I know, VBIOS-level disabling is quite rare for NVIDIA, and only such cores can be re-enabled.
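The point about identical TPC counts with different distribution can be made concrete with a toy comparison. Everything in this sketch is illustrative: the 6 GPCs × 5 TPCs layout and the total of 25 enabled TPCs are my assumptions for a P102-100-class chip, and the two layouts are invented, since no real fuse maps were posted.

```python
TPCS_PER_GPC = 5  # assumption: a full GP102-class die has 6 GPCs x 5 TPCs


def summarize(enabled_per_gpc):
    """Return (total enabled TPCs, gap between the fullest and emptiest GPC)."""
    if any(not 0 <= n <= TPCS_PER_GPC for n in enabled_per_gpc):
        raise ValueError("each GPC can have between 0 and 5 TPCs enabled")
    return sum(enabled_per_gpc), max(enabled_per_gpc) - min(enabled_per_gpc)


# Two hypothetical chips with the same part number and the same TPC total.
balanced = [5, 4, 4, 4, 4, 4]
lopsided = [5, 5, 5, 5, 4, 1]

for name, layout in [("balanced", balanced), ("lopsided", lopsided)]:
    total, gap = summarize(layout)
    print(f"{name}: {total} TPCs enabled, gap between GPCs = {gap}")
```

Both layouts report the same core count to software, yet the second one leaves a single GPC nearly empty, which is the kind of imbalance the 2-4% difference is attributed to.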

I have had zero practice with laptops, so the text below is just theory, not confirmed by my own experience.

Unfortunately I don't know a way to edit an existing 30xx series VBIOS and get it accepted by the GPU. But generally, flashing a VBIOS with the same device ID (10DE 24DD) should be possible (with a modded nvflash or a hardware programmer), and, say, 20% to 50% of laptop VBIOSes with the same IDs are compatible on average. The other 50-80% are not, so before experimenting with a laptop GPU you should be ready to flash the saved original VBIOS back with a hardware programmer. There may also be other laptop-specific issues I'm not familiar with, so, as always, flashing non-factory VBIOSes is at your own risk.

So, if you see that some laptop with a 10DE 24DD XXXX XXXX GPU has fewer CUDA cores than another laptop with a 10DE 24DD XXXX XXXX GPU (XXXX may differ), you may try to flash another VBIOS onto the laptop with fewer CUDA cores. The M15 R5 is exactly that case: with the "94.04.42.00.97" VBIOS it showed 4608 CUDA cores https://www.techpowerup.com/img/YOLmOeQrWxb2A38y.jpg while the GA104 description on Techpowerup says that GA104-770-A1 should have 5120 cores.
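The gap between 5120 and 4608 cores can be converted into TPCs with a little arithmetic. The per-SM and per-TPC figures below are assumptions taken from public Ampere specifications, not from this thread, and the function name is mine.

```python
CORES_PER_SM = 128  # assumption: consumer Ampere (GA10x) SM size
SMS_PER_TPC = 2     # assumption: two SMs per TPC on GA10x


def tpcs_for(cuda_cores: int) -> int:
    """Convert a CUDA core count into a number of enabled TPCs."""
    cores_per_tpc = CORES_PER_SM * SMS_PER_TPC
    if cuda_cores % cores_per_tpc:
        raise ValueError("core count is not a whole number of TPCs")
    return cuda_cores // cores_per_tpc


expected = tpcs_for(5120)  # GA104-770-A1 per the Techpowerup description
observed = tpcs_for(4608)  # reported with the "94.04.42.00.97" VBIOS
print(f"expected {expected} TPCs, observed {observed}, "
      f"disabled at VBIOS level: {expected - observed}")
```

Under these assumptions the missing 512 cores correspond to two TPCs disabled on top of the factory configuration.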

All available VBIOS variants for a specified device ID can be searched on Techpowerup with manually created URLs (this is a really useful feature, great thanks to Techpowerup!):

NVIDIA VBIOSes officially published by Techpowerup https://www.techpowerup.com/vgabios/?did=10DE-24DD--
NVIDIA VBIOSes published by users https://www.techpowerup.com/vgabios/?architecture=Uploads&did=10DE-24DD--
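For other device IDs, the two URLs above can be built mechanically. This sketch simply reproduces the URL pattern shown in the post; the function name and dictionary keys are mine, and leaving the two trailing fields of `did` empty is just how the example URLs look, not a documented API.

```python
BASE = "https://www.techpowerup.com/vgabios/"


def vbios_search_urls(vendor_id: str, device_id: str) -> dict:
    """Build the two search URLs shown above for a vendor/device ID pair."""
    # Same shape as in the post: VENDOR-DEVICE followed by two empty fields.
    did = f"{vendor_id.upper()}-{device_id.upper()}--"
    return {
        "official": f"{BASE}?did={did}",
        "user_uploads": f"{BASE}?architecture=Uploads&did={did}",
    }


for kind, url in vbios_search_urls("10de", "24dd").items():
    print(kind, url)
```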

Unfortunately, the VBIOS page doesn't provide info that would allow estimating the effective CUDA core count or the effective compatibility. So some researchers just flash them one by one to determine the actual behavior.

ribcage · May 17, 2023

Thank you for making this program. Please explain what is the difference between DisableA and DisableGpuA rom files generated by the Expert mode? Which of these two should be flashed to disable the A channel?

StViolenceDay · May 17, 2023

The DisableX variant disables only memory channel X. Most of the time that is what helps.

The DisableGpuX variant additionally disables a "GPC" (graphics processing cluster, which consists of shader units). The GPC is not directly related to the memory channel; its index simply corresponds to the letter X:

DisableGpuA - disables mem channel A and GPC 0
DisableGpuB - disables mem channel B and GPC 1
etc...
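The naming scheme above (letter for the channel, letter position for the GPC index) can be expressed as a small parser. This is only a sketch of the convention as described in the post; the `Variant` type and `parse_variant` function are hypothetical names, not part of the utility.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    channel: str        # memory channel letter that gets disabled
    gpc: Optional[int]  # GPC index that gets disabled, or None


_PATTERN = re.compile(r"^Disable(Gpu)?([A-Z])$")


def parse_variant(name: str) -> Variant:
    """Decode a variant name such as 'DisableA' or 'DisableGpuB'."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized variant name: {name!r}")
    also_gpc, letter = match.groups()
    # A -> GPC 0, B -> GPC 1, and so on, only for the DisableGpuX form.
    gpc = ord(letter) - ord("A") if also_gpc else None
    return Variant(channel=letter, gpc=gpc)


for name in ["DisableA", "DisableGpuA", "DisableGpuB"]:
    print(name, "->", parse_variant(name))
```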

A broken GPC is much rarer than a broken memory channel, so support for disabling GPCs was not planned initially and is released as an incomplete/unsupported/last-resort feature.

However, one YouTube user ran into exactly that situation with a GTX 560: a broken GPC. Disabling a memory channel didn't fix that GPU's issue, but additionally disabling a GPC did.
The artifacts were still present with the 192-bit memory bus and the full 336 shaders, and they disappeared after a GPC was disabled as the next step suggested by the utility, about 30 seconds further into the video.

That GTX 560 has only 2 GPCs, and most of the shader blocks were located on the disabled GPC, so only 144 shaders were left. On GPUs with more GPCs the effect on shader count is less catastrophic.
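The shader numbers for that GTX 560 can be broken down per SM. The figure of 48 shaders per SM is my assumption from public specifications for this GPU class; the 336 and 144 counts come from the post.

```python
CORES_PER_SM = 48  # assumption: SM size for this Fermi chip, from public specs

total_shaders = 336  # GTX 560 before the GPC was disabled
left_shaders = 144   # after the GPC was disabled

disabled_shaders = total_shaders - left_shaders
sms_disabled = disabled_shaders // CORES_PER_SM
sms_left = left_shaders // CORES_PER_SM

print(f"disabled GPC held {disabled_shaders} shaders ({sms_disabled} SMs)")
print(f"remaining GPC holds {left_shaders} shaders ({sms_left} SMs)")
print(f"shaders kept: {left_shaders / total_shaders:.1%}")
```

Under that assumption the card's seven SMs were split four and three between its two GPCs, and the larger group was on the broken one.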

eidairaman1 · May 25, 2023

Crippling the gpu...

chrcoluk · May 25, 2023

How come people have gpu's with broken memory channel?

Macro Device · May 25, 2023

chrcoluk said:
How come people have gpu's with broken memory channel?

Video cards are not supposed to work for decades. They eventually run into damage to their very hot and intensely used components, not to mention how manufacturers love to spend as little as possible so you get the worst possible components which get out of order even faster. This leads to one or multiple CUs, memory lanes, PCI-E lanes, etc being not accessible, thus needing to be disabled.

GTX 700s and older cards are now 10+ years old. They are more than just archaic. No surprise some of them are not working properly.

Edwired · May 25, 2023

A policy of deliberately planning or designing a product with a finite lifespan, so it will become obsolete or non-functional after a certain period. Planned obsolescence is often used to tempt the customer to purchase again. Cars, computers, and software are good examples of products with built-in obsolescence.

The above is the reality of modern standards. This is why everything from the last 10 to 20 years gets left behind and is not supported by repair services: it is not cost effective by any means. Only a hobbyist with the time to repair it themselves will do it, for a small fee that depends on how long the fix takes. These people can be found on Discord, depending on location. Anything below the GTX 900 series is only a waste of time, to be honest.

chrcoluk · May 26, 2023

Beginner Micro Device said:
Video cards are not supposed to work for decades. They eventually run into damage to their very hot and intensely used components, not to mention how manufacturers love to spend as little as possible so you get the worst possible components which get out of order even faster. This leads to one or multiple CUs, memory lanes, PCI-E lanes, etc being not accessible, thus needing to be disabled.

GTX 700s and older cards are now 10+ years old. They are more than just archaic. No surprise some of them are not working properly.

Interesting, my gtx 460 still works fine.

Macro Device · May 26, 2023

chrcoluk said:
Interesting, my gtx 460 still works fine.

Sheer luck to be honest. Most of 400 series are now long gone.

ribcage · Jul 16, 2023

Fixed an artifacting GTX 580. Used the VariantA.

xvnqa · Nov 23, 2023

It saved my EVGA 780 Classified, I am very grateful, thank you very much!

lordmogul · Nov 23, 2023

vietanh2901 said:
Unrelated, but does OP know a way to unlock "disabled" CUDA cores, like on this Dell/Alienware?
https://www.reddit.com/r/Alienware/comments/nxigoa

As they already said, and while another decade out of date, up until the Geforce 6 series turning off parts was done pretty much on a software level, so you could "unlock" them with rivatuner. And in some cases just flash a card over (HD 6850 to 6870 anyone?)
Nowadays that is done deeper, so in most cases it isn't possible.

Oh, and my old 660 Ti is also still running.

totalmentemati · Jan 18, 2024

Thanks man, this just saved my 770 from going down the bin. I have a question, can I use a newer driver, or is only 390/391 usable?

StViolenceDay · Feb 8, 2024

A standard GTX 770 and one with a modded VBIOS have the same requirements: you can use any driver up to the 470 branch. The 495 and newer branches are not compatible with Kepler GPUs.
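As a quick check of that rule, a driver version string can be compared against the 495 cutoff. This is a minimal sketch: the function name is mine, and treating every version below 495 as usable is an assumption, since the post only names the 470 branch and "495 and newer".

```python
FIRST_UNSUPPORTED_BRANCH = 495  # from the post: 495 and newer drop Kepler


def kepler_can_use(driver_version: str) -> bool:
    """Return True if the driver's major version is below the 495 cutoff."""
    major = int(driver_version.split(".")[0])
    # Assumption: anything numbered below 495 is treated as usable.
    return major < FIRST_UNSUPPORTED_BRANCH


for version in ["391.35", "470.05", "496.13"]:
    print(version, kepler_can_use(version))
```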
