
B580 vs RX 7600 vs RTX 4060 in Pytorch/Tensorflow (AI) benchmarks?

Tia

Does anyone have benchmark comparisons of how well these three cards perform on various models?
 
Why am I asking about this in particular?
 
I need to replace my GeForce 1060 3GB with something that isn't almost a decade old, because I regularly run into the VRAM limit.

But I can't decide.
Team green: Good driver performance, CUDA, most AI models work out of the box, but less than ideal Linux support for gaming (Wayland has been troublesome) and I don't like their market dominance.
Team red: Open-source Linux drivers (better Wayland support), but worse than team green in terms of performance.
Team blue: New competition that I'd like to support. Tensor cores and ray tracing seem better than the competition around the same price point, but graphical performance is a lot worse on Linux compared to Windows at the moment.

Those are just a bunch of things off the top of my head.

I kinda need a tiebreaker, and since I'm planning to use PyTorch & other ML frameworks like Burn, I decided to make the call on which performs best.
 
I kinda need a tiebreaker, and since I'm planning to use PyTorch & other ML frameworks like Burn, I decided to make the call on which performs best.
Nvidia will be the best for performance and highest cost with some tinkering needed with the Linux driver. AMD will be slower and lower cost without the tinkering needed with the Linux driver. Intel is the new kid on the block, and I would wait to see if the performance and Linux drivers are better than AMD, so I would wait until they are proven.

Most of the AI platforms do have benchmarks with different hardware configurations, so I would go over them and then make a decision based on cost and driver tinkering. I would also look at the size of the models you want to work with and see how much VRAM you are going to need to run them, since more VRAM will add additional cost.
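
For a rough idea of the VRAM side, you can do a back-of-the-envelope estimate from the parameter count and precision. Quick sketch below; the parameter counts are just example sizes, and it covers the weights only (activations, KV cache and framework overhead come on top):

```python
# Back-of-the-envelope VRAM estimate for model weights only.
# Parameter counts are just examples; activations, KV cache
# and framework overhead are not included.
def weights_gib(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param / 2**30

for name, params in [("7B model", 7.0), ("13B model", 13.0)]:
    for label, bpp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        print(f"{name} @ {label}: ~{weights_gib(params, bpp):.1f} GiB")
```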
 
What exact models are you planning to run? Just general stuff?

Anyhow, the 4060 should be a bit faster since Nvidia's tensor cores are way faster than RDNA's WMMA instructions, but not by a large margin, since both have similar memory bandwidth and the 7600 actually has more compute hardware. All in all they should be trading blows more often than not.
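
If you want to sanity-check that yourself, a dumb fp16 matmul loop in PyTorch is enough to compare tensor core vs WMMA throughput. Rough sketch, not a rigorous benchmark (ROCm builds expose the GPU through the same "cuda" device):

```python
import time
import torch

# Crude fp16 matmul throughput probe; works on CUDA and ROCm builds
# (both report the GPU through the "cuda" device).
device = "cuda" if torch.cuda.is_available() else "cpu"
n = 4096
a = torch.randn(n, n, device=device, dtype=torch.float16)
b = torch.randn(n, n, device=device, dtype=torch.float16)

for _ in range(3):                      # warm-up
    a @ b
if device == "cuda":
    torch.cuda.synchronize()

iters = 20
start = time.perf_counter()
for _ in range(iters):
    a @ b
if device == "cuda":
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

tflops = 2 * n ** 3 * iters / elapsed / 1e12
print(f"~{tflops:.1f} TFLOPS fp16 matmul on {device}")
```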

For the B580, I haven't seen many tests yet. For LLM stuff I've only seen people using the Vulkan backend, which is way slower than Intel's IPEX. If it's of any use to you, it's faster than an A770, but still slower than a 4060 using this backend.
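
For reference, the IPEX/XPU route in PyTorch looks roughly like this. Just a sketch: it assumes a working intel-extension-for-pytorch install, and the exact calls have shifted between releases:

```python
import torch
import intel_extension_for_pytorch as ipex  # Intel's PyTorch extension

# Rough sketch of running a model on an Arc GPU via the "xpu" device.
# Assumes intel-extension-for-pytorch is installed and the GPU is visible;
# exact APIs may differ between IPEX releases.
model = torch.nn.Linear(1024, 1024).to("xpu").half()
model = ipex.optimize(model, dtype=torch.float16)

x = torch.randn(64, 1024, device="xpu", dtype=torch.float16)
with torch.no_grad():
    y = model(x)
print(y.shape, y.device)
```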

Btw, why not go for a 3060? 12GB of vram, with faster memory than your listed options, and should be cheaper than all of those without any issues framework-wise.
On the other hand, if you could pony up a bit more cash, a 4060Ti 16gb would be a great pick.
Team green: Good driver performance, CUDA, most AI models work out of the box, but less than ideal Linux support for gaming (Wayland has been troublesome) and I don't like their market dominance.
FWIW their Wayland support has come a long way and it's way better now. That's not from my own experience (I'm still on Xorg), but from other acquaintances.

Nvidia will be the best for performance and highest cost with some tinkering needed with the Linux driver. AMD will be slower and lower cost without the tinkering needed with the Linux driver. Intel is the new kid on the block, and I would wait to see if the performance and Linux drivers are better than AMD, so I would wait until they are proven.
I would say there's no tinkering with nvidia's driver whatsoever. Just install the open source modules with your package manager and be done.
For AMD you'd have a lot more headaches with ROCm when compared to CUDA.
Intel seems to be a bit easier than ROCm, but not as easy as CUDA. As mentioned by OP, its performance in games on linux is worse than on windows, but for compute it seems to be okay-ish.
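
Whichever way you go, it's worth checking what your PyTorch build actually sees before blaming the hardware. Something like this (the xpu attribute only exists on Intel-enabled builds, hence the hasattr guard):

```python
import torch

# Quick check of which accelerator backend this PyTorch build can see.
# ROCm builds report through the same torch.cuda API (torch.version.hip is set),
# while the xpu attribute only exists on Intel-enabled builds.
print("CUDA/ROCm available:", torch.cuda.is_available())
print("HIP (ROCm) version: ", torch.version.hip)
print("XPU available:      ",
      hasattr(torch, "xpu") and torch.xpu.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```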
 
Does anyone have benchmark comparisons of how well these three cards perform on various models?
Easy: 4060 wins because CUDA.
 
Benchmarks, anyone?
You still did not specify what kind of models you're planning on running.
As an example, for something like SD you can take a look here:

Finding results from LLMs should be doable by a quick glance at reddit.
 
Benchmarks, anyone?
What kind of benchmarks do you need? If it runs on CUDA, it gets 0 TFLOPS on non-Nvidia hardware.
 
What kind of benchmarks do you need? If it runs on CUDA, it gets 0 TFLOPS on non-Nvidia hardware.


There are certainly things in CUDA that will not run on competitive hardware, but a lot will. I'm working on trying F@H through this and seeing what it does to performance.
 
I think the best cheap option for AI would be a 4060 Ti 16GB because of the VRAM, but be sure your motherboard has PCIe 4.0. Going cheaper, I would pick a 3060 12GB or some 3080 12GB. You can get away with 8GB for some tasks, but you are going to be VRAM limited most of the time, so I wouldn't go with less than 12GB. I don't have any experience with AMD or Intel for AI workloads.
 


There are certainly things in CUDA that will not run on competitive hardware, but a lot will. I'm working on trying F@H through this and seeing what it does to performance.

Where we are now:


The code has been rolled back to the pre-AMD state and I've been working furiously on improving the codebase. I’ve been writing the improved PTX parser I always wanted and laid the groundwork for the rebuild. Currently, some very simple synthetic GPU test programs can be run on an AMD GPU, but we are not yet at the point where ZLUDA can support a full application.
 
4060 ti 16gb
These are so horribly overpriced everywhere that it's almost easier to get a used 3080 Ti 20 GB and give zero damn about limitations. You also get all the gaming edge one could possibly need today. If going AMD, I'd never in my right mind pick anything lower than a 7800 XT for this task, because these run outta VRAM much faster than NVIDIA GPUs with the same amount of it.

Intel GPUs will be a very major PITA for the OP. Almost nothing works as it should. Wanna be an alpha tester, go ahead, I'm not your mum but let me warn you, it will be one hell of a ride.
 
3080 Ti 20 GB
Those are really rare; chances are that a used 3090 is cheaper than the above.
A 4060 Ti where I live is also priced reasonably, just a tad more expensive than the regular 4060, but pricing on used goods here isn't that great anyway.
 
chances are that a used 3090 is cheaper than the above.
Never seen that. And I monitor the aftermarket on a weekly basis. It's either cheaper than a 3090 or doesn't exist.
 
These are so horribly overpriced everywhere that it's almost easier to get a used 3080 Ti 20 GB and give zero damn about limitations. You also get all the gaming edge one could possibly need today. If going AMD, I'd never in my right mind pick anything lower than a 7800 XT for this task, because these run outta VRAM much faster than NVIDIA GPUs with the same amount of it.

Intel GPUs will be a very major PITA for the OP. Almost nothing works as it should. Wanna be an alpha tester, go ahead, I'm not your mum but let me warn you, it will be one hell of a ride.
If he wants a brand new GPU, I don't see a better option than the 4060 Ti 16 GB. I don't like it myself, but it's the only option with 16 GB and it has decent power consumption too. A 3080 Ti 20 GB would for sure be a better option, but I have no idea about its price and you need a good PSU... and also for AI stuff I like lower power consumption.
I am doing some work atm with Whisper and resemble-enhance, and they can fit in 8 GB if using the turbo version. A 12 GB GPU can be enough for light AI, but yeah, you will be limited in many other AI workloads.
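
If you want to know whether a given workload actually fits in 8 GB, PyTorch can report the peak allocation directly. Sketch below (CUDA/ROCm builds; the model is just a placeholder, not Whisper itself):

```python
import torch

# Measure peak VRAM of a workload (CUDA/ROCm builds).
# The model below is a placeholder; swap in your real pipeline.
torch.cuda.reset_peak_memory_stats()

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
).cuda().half()
x = torch.randn(256, 4096, device="cuda", dtype=torch.float16)
with torch.no_grad():
    model(x)

print(f"Peak VRAM: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```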
 
Never seen that. And I monitor the aftermarket on a weekly basis. It's either cheaper than a 3090 or doesn't exist.

Those markets are/can be vastly different.
 
Where I live you can find gadgets/computer components at 65-70% of the market price, either new in an unopened box without warranty, or lightly used for gaming with 2 years of warranty left. I used to buy new in box without warranty and never had a problem, but it's a risk I'm willing to take.
 
I decided to make the call on which performs best.
Is "best" just highest compute performance? How about your developer time and skills? Unless you are an ML expert I strongly suggest you go CUDA, even if the hardware costs more. You will have working code in record time and you won't encounter unknown/random bugs and issues.
 
What kind of benchmarks do you need? If it runs on CUDA, it gets 0 TFLOPS on non-Nvidia hardware.
Different kinds; to be less specific, I am looking for a geometric mean across different models, but to keep it simple, the average tokens/s for the following two models:
Is "best" just highest compute performance? How about your developer time and skills? Unless you are an ML expert I strongly suggest you go CUDA, even if the hardware costs more. You will have working code in record time and you won't encounter unknown/random bugs and issues.
Certainly no expert, but I grasp the fundamental statistics behind what AI really is, I have a tendency to do premature optimization, and for what it's worth, I also have experience with Vulkan and its various extensions (I am not a stranger to verbose/low-level APIs).

"developer time and skills" From the way you're phrasing it sounds like there are major hurdles from using non-nvidia hardware with AI libraries.

I need to put into perspective how big these hurdles are, as I do have experience with tinkering, so it might not be a big problem for me. Do you have some examples of hurdles, by chance? I am asking as I am on a tight budget.
 
@Tia Try implementing something relatively simple using CUDA, ROCm and OpenCL. You'll see what hurdles @W1zzard is talking about.
Seriously, do it. Especially on a tight budget, it's better to measure twice and cut once ;)
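
To be fair, at the framework level the portable case is trivial; a toy loop like the one below runs unchanged on CUDA, ROCm (same "cuda" device) or CPU. The hurdles @W1zzard means show up the moment you drop below that: custom CUDA kernels, vendor-only libraries, quantization tooling and so on.

```python
import torch

# Device-agnostic toy training loop: runs as-is on CUDA, ROCm or CPU.
# The pain starts below this level (custom kernels, vendor-specific libs).
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(128, 10).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(32, 128, device=device)
y = torch.randint(0, 10, (32,), device=device)

for step in range(5):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: loss = {loss.item():.3f}")
```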
 