Wednesday, February 28th 2024

AMD Readying Feature-enriched ROCm 6.1

Feb 28th, 2024 13:47

The latest version of AMD's open-source GPU compute stack, ROCm, is due for launch soon according to a Phoronix article—chief author, Michael Larabel, has been poring over Team Red's public GitHub repositories over the past couple of days. AMD ROCm version 6.0 was released last December—bringing official support for the AMD Instinct MI300A/MI300X, alongside PyTorch improvements, expanded AI libraries, and many other upgrades and optimizations. The v6.0 milestone placed Team Red in a more competitive position next to NVIDIA's very mature CUDA software layer. A mid-February 2024 update added support for Radeon PRO W7800 and RX 7900 GRE GPUs, as well as ONNX Runtime.

Larabel believes that "ROCm 6.1" is due for imminent release, based on the increased activity he has tracked on publicly visible developer platforms: "For MIOpen 3.1 with ROCm 6.1 there's been many additions including new solvers, an AI-based parameter prediction model for the conv_hip_igemm_group_fwd_xdlops solver, numerous fixes, and other updates. AMD MIGraphX will see an important update with ROCm 6.1. For the next ROCm release, MIGraphX 2.9 brings FP8 support, support for more operators, documentation examples for Whisper / Llama-2 / Stable Diffusion 2.1, new ONNX examples, BLAS auto-tuning for GEMMs, and initial code for MIGraphX running on Microsoft Windows." The changelogs and documentation updates also point to several HIPIFY improvements for ROCm 6.1, including the addition of CUDA 12.3.2 support.

Sources: Phoronix, Wccftech


7 Comments on AMD Readying Feature-enriched ROCm 6.1

ecomorph

As someone with an AMD GPU (7800XT) that has recently dabbled in stable diffusion, if you care about AI, do yourself a favor get an nvidia. Most of the software is developed/tested with nvidia cards and if you go the AMD route, you're going to be spending hours upon hours troubleshooting and researching arguments to add to have basic functionality run, let alone run fast. For example, say you want to run stable-diffusion-webui, you can't use AUTOMATIC1111, you have to use a fork that uses directml. You want to use ROCm? Well, not all AMD cards are supported. You want to use onnx to optimize the models and have them run faster? Good luck debugging the string of errors you'll get from it. AMD is also not great at offering guides for how to run things properly and when they do, they're quickly outdated and you have to rely on no-name youtubers to guide you through it, like for example:

ZoneDymo

In reply to ecomorph:

Ok and on Nvidia's side "it just works" ? or got no experience with that yet?

I know from Wendel from Level 1 Tech that AMD is pretty solid for stable diffusion, hell according to him it was even a tad more accurate (whatever that means).

ecomorph

In reply to ZoneDymo:

At least for stable diffusion webui, it does. I tested with a 3060 12GB and got 10it/s, while with 7800xt I get around 4it/s. Substantially less fiddling with args to get things working (first thing you get with AMD is 'your GPU doesn't have CUDA, use this flag to only use the CPU'). You can compare benchmark data here: vladmandic.github.io/sd-extension-system-info/pages/benchmark.html

ZoneDymo

In reply to ecomorph:

Thanks, but sadly I can't make heads or tails of that benchmark comparison; the data per row seems all over the place.

Firedrops

In reply to ecomorph:

Have 5700xt, 3060, and 2070 systems. Can confirm both SD and LLMs work super easily and fast in Windows on Nvidia, but AMD is terrible. Every few months there'll be updates that brick AMD systems and you need to wait weeks for one of the ~4 devs with AMD to release a fix, and then do full re-installs. Even with all the manual optimizations for AMD, it's about 3 orders of magnitude slower than Nvidia.

progste

All this focus on AI is honestly silly but it's nice to see more Linux support!

Minus Infinity

In reply to Firedrops:

Alas PC users are plebs to AMD, they don't really care. I own all AMD stuff, but I'm not doing any mission critical work anymore. If I were still working I'd be using quadro for my workstation.
