
AMD Radeon RX 9070 XT Could Get a 32 GB GDDR6 Upgrade

With the 5090, NV has released a 32GB VRAM consumer GPU, so of course AMD is going to do the same (wasn't it the same story with 24GB VRAM consumer GPUs?). The difference is that the 9070 (XT) is based on a 256-bit chip with GDDR6 at ~640 GB/s, vs the 5090's 512-bit GDDR7 at 1792 GB/s. Still fast enough.
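Those bandwidth figures fall straight out of bus width times per-pin data rate; a quick sketch (the 20 and 28 Gbps per-pin rates are assumed typical values for GDDR6 and GDDR7 here, not confirmed specs):

```python
# Peak memory bandwidth = bus width (bits) x per-pin data rate (Gbit/s) / 8.
def mem_bandwidth_gbs(bus_bits: int, pin_gbps: float) -> float:
    """Returns peak bandwidth in GB/s."""
    return bus_bits * pin_gbps / 8

print(mem_bandwidth_gbs(256, 20))  # 640.0 GB/s, 9070 XT class (256-bit GDDR6)
print(mem_bandwidth_gbs(512, 28))  # 1792.0 GB/s, 5090 class (512-bit GDDR7)
```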

AFAIK, only modded games may require more than 24GB VRAM in 4K right now, but 32GB is nice for fully offloading/hosting big-ish LLMs locally.

Regarding the CUDA/ML stack: indeed, I think of AMD GPUs only in terms of running/inferencing LLMs, not training/finetuning. I've read training is still possible and has supposedly gotten easier over the last few years, but CUDA is tier-agnostic and supports consumer, workstation, and enterprise cards alike. To close this gap, UDNA (U for unified) will replace RDNA at some point.

The 5090's idle power consumption unfortunately increased to 30W (4090: 22W), but that's still not too bad considering there are now 16 2GB memory modules instead of 12 (a purely linear increase would predict 22W × 16/12 ≈ 29.3W). Video playback scales worse than linearly, though: 54W on the 5090 vs 26W on the 4090.
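The linear-scaling estimate can be written out explicitly (a naive model only: it pins all the idle increase on module count, which real cards won't exactly follow):

```python
# Naive model: memory-related idle power scales linearly with module count.
# 4090: ~22 W idle with 12 GDDR6X modules; 5090 carries 16 GDDR7 modules.
def scaled_idle_watts(base_w: float, base_modules: int, new_modules: int) -> float:
    return base_w / base_modules * new_modules

print(round(scaled_idle_watts(22, 12, 16), 2))  # 29.33, close to the measured ~30 W
```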

For me to consider this RDNA4 32GB GPU (in no particular order):
  • DLSS 2-like upscaling quality improvement
  • Fix HDMI 2.1 48Gbit/s (FRL), aka HDMI 2.1a, on Linux
  • Back to good power scaling like in RDNA2
  • Low idle power consumption; at worst, a linear increase with the amount of VRAM compared to the 16GB model
  • Just like the 5090, the 9070 (XT) 32GB must also be a consumer GPU, so that the price increase stays minimal
So, AMD, is it 48GB VRAM consumer GPUs for the UDNA arch after RDNA4, then? That would allow fully offloading `Llama-3.3-70B-Instruct-Q4_K_M.gguf` (42.5GB) (by then we'll have a different, more capable 70B LLM, of course), or allow for much higher context.
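A back-of-envelope check of that fit (the ~4.85 effective bits/weight for Q4_K_M and the 2 GB runtime overhead are rough assumptions on my part):

```python
# Rough GGUF size: parameters (billions) x effective bits per weight / 8 -> GB.
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    return params_b * bits_per_weight / 8

def fits_in_vram(model_gb: float, vram_gb: float, overhead_gb: float = 2.0) -> bool:
    # Reserve a couple of GB for KV cache and runtime buffers (rough guess).
    return model_gb + overhead_gb <= vram_gb

size = gguf_size_gb(70, 4.85)      # ~42.4 GB, near the quoted 42.5 GB file
print(fits_in_vram(size, 48))      # True: fits fully in 48 GB
print(fits_in_vram(size, 32))      # False: must spill layers to RAM
```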


Yes, 24GB VRAM can't fit, e.g., the 27GB `Qwen2.5-32B-Instruct-Q6_K.gguf` SOTA LLM. The .gguf format allows offloading the remaining LLM layers to RAM, but those run much slower. Tokens-per-second rises steeply the more layers are offloaded to the GPU; I did some testing:
[attachment 384684: offloading benchmark results]
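The steep curve near full offload can be sketched with a simple two-rate model (the 40 and 4 tok/s rates below are made-up illustrative numbers, not measurements):

```python
# Two-rate model of partial GPU offload: per-token time is the sum of the
# GPU-resident and CPU-resident fractions at their respective speeds.
def tok_per_sec(gpu_frac: float, gpu_tps: float = 40.0, cpu_tps: float = 4.0) -> float:
    return 1.0 / (gpu_frac / gpu_tps + (1.0 - gpu_frac) / cpu_tps)

# Throughput stays CPU-dominated until nearly everything is on the GPU:
for f in (0.0, 0.5, 0.9, 1.0):
    print(f, round(tok_per_sec(f), 1))  # 4.0, 7.3, 21.1, 40.0
```

Half the layers on the GPU gets you nowhere near half the speed, because the slow CPU-side fraction dominates the per-token time.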
I'm not sure they will do 48GB on consumer just yet, as that would give the W7800 and W7900 workstation GPUs some competition, but we shall see.

Right now this model gives me the best performance on a 24GB VRAM GPU:

[screenshot]


Doing about 28 tok / sec

[screenshot]


Looks like this rumor is false.

[screenshot]
 
AFAIK, only modded games may require more than 24GB VRAM in 4K right now, but 32GB are nice for fully offloading/hosting big-ish LLMs locally.

Yeah spot on.
Regarding VRAM and games: I think it's hard to get games to use above 16GB, but one exception, similar to what you said about mods, is VRChat. That game is quite unusual in that you're seeing user-uploaded Unity assets, so optimization is horrible: a tragedy-of-the-commons situation where not enough individuals optimize their assets, and a single avatar can be as bad as 500MB of VRAM. Go into a populated room and there's no real upper bound on how much VRAM you'd like! And since it's a VR game, DisplayPort 2.0 is a must for future-proofing, because current-gen VR headsets already saturate what DP 1.4 can do.
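The DP 1.4 point can be sanity-checked with raw video bandwidth (the headset-class 4K120 10-bit figures are illustrative assumptions, and blanking overhead is ignored):

```python
# Uncompressed video bandwidth in Gbit/s: width x height x refresh x bits/pixel.
def video_gbps(width: int, height: int, hz: int, bpp: int = 30) -> float:
    return width * height * hz * bpp / 1e9

DP14_PAYLOAD = 25.92                # Gbit/s usable on DP 1.4 (HBR3 after 8b/10b)
need = video_gbps(3840, 2160, 120)  # headset-class 4K120 at 10-bit color
print(round(need, 1))               # 29.9 Gbit/s
print(need > DP14_PAYLOAD)          # True: DP 1.4 needs compression; DP 2.x does not
```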

Regarding VRAM and AI: if you train a LoRA for SDXL, you'll already cross the 16GB boundary, and SD 3 and Flux are going to be worse. Training speed isn't really an issue here, just VRAM.
Inference, as you say, doesn't need nearly as much VRAM, so the 9070 XT will do fine even with less of it.

But as a competitor to, say, the 5080: the 5080 just doesn't have enough VRAM. I think 24GB is enough and 32GB is a bonus, but 16GB just isn't enough for these admittedly obscure tasks.
 
Not sure the cards need it; it's mostly to serve the AI crowd and grab more money, leaving fewer GPUs available for gamers.
 
A smart move if they manage to offer that 32GB iteration as a workstation GPU: it would take demand off the gaming series from whoever (non-gamers) needs more VRAM for apps, and they can sell it at higher profit margins, which should help AMD keep the gaming GPUs at normal pricing.
 
Look at the update, guys. There's no 32 GB gamer card, but maybe a Radeon Pro workstation card coming later.
 
Not sure the cards need it; it's mostly to serve the AI crowd and grab more money, leaving fewer GPUs available for gamers.
It won't as it's not going to be a 4K card.

The 4080 isn't for 4K either, more like 1440p240.
 
It won't as it's not going to be a 4K card.

The 4080 isn't for 4K either, more like 1440p240.
I agree at 4k you really want a 4090 or a 5090.

The 7900 XTX / 4080 tier is better at 1440p or 1440 ultrawide resolutions. 4K60 is doable at this tier, but personally I prefer my fps up in the 100-144 range without upscaling or frame generation.
 
I like the 32GB version and would like GPU water-cooling support in a single-slot design. I'd have no issue going for this over the Nvidia FE.
 
unless you play at 4K native + ultra settings + RT + AA

which can and will easily use over 16GB in many games
Will be interesting to see if a 9070 XT can even do that at acceptable framerates.

In any case, looks like the rumor was false.
 
It won't as it's not going to be a 4K card.

The 4080 isn't for 4K either, more like 1440p240.

no one is going to game on that thing
 
News article said:
Update 20:55 UTC: AMD's Frank Azor on X debunked rumors of the 32 GB SKU coming to gamers. So, this will not happen. Instead, we could be looking at a prosumer-oriented AMD Radeon Pro GPU with 32 GB of memory.
Rumor false? You mean false until it's out in 6-12 months? AMD wants people to buy their much more expensive prosumer SKUs while NV is offering 32GB in the consumer space? AMD never misses an opportunity to miss an opportunity.
 
That's not what is written. In fact it clearly says no 9070 XT 32GB, which does not rule out a 9070 XTX 32GB, a 9075 XT 32GB, or any other naming. It's just corpo speak...

No higher SKU announced, though. Not even a faint rumor from CN forums, and AMD themselves have already admitted nothing above Navi 48 was developed, with the 9070 XT having the full configuration already. But I'll let you hit the hopium as much as you want :D

Rumor false? You mean false until it's out in 6-12 months? AMD want ppl to buy their much more expensive prosumer SKUs when NV is offering 32GB in the consumer space? AMD never misses an opportunity to miss an opportunity.

AMD has done prosumer once: Vega Frontier, and it was a complete disaster. You'll never guess who once had one :rolleyes:

But not really, not in this case. This GPU just doesn't have the performance chops for 32 GB at anything, and LLMs would only run faster because they are almost always VRAM capacity bottlenecked.
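The VRAM-bottleneck point has a bandwidth side too: a dense LLM streams all its weights once per generated token, so an upper bound on single-stream tok/s is bandwidth divided by model size (a rough rule of thumb that ignores compute and KV-cache traffic):

```python
# Upper bound on single-stream tok/s for a dense LLM:
# every weight is read once per token, so tps <= bandwidth / model_size.
def max_tok_per_sec(bandwidth_gbs: float, model_gb: float) -> float:
    return bandwidth_gbs / model_gb

print(round(max_tok_per_sec(640, 27), 1))   # ~23.7 t/s: 27 GB model at 640 GB/s
print(round(max_tok_per_sec(1792, 27), 1))  # ~66.4 t/s at 5090-class bandwidth
```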
 
No higher SKU announced, though. Not even a faint rumor from CN forums, and AMD themselves has already admitted nothing above Navi 48 was developed, with the 9070 XT having the full configuration already. But I'll let you hit the hopium as much as you want :D



AMD has done prosumer once: Vega Frontier, and it was a complete disaster. You'll never guess who once had one :rolleyes:

But not really, not in this case. This GPU just doesn't have the performance chops for 32 GB at anything, and LLMs would only run faster because they are almost always VRAM capacity bottlenecked.
Probably 9070 WS though :)

 
At least AMD is officially responding to the rumors. A rumor mill left untamed will run wild.

Of course the best way to deal with all this is to RELEASE THE DAMN CARDS ALREADY!
 
Yeah, a Radeon Pro version will 100% have 32 GB, that much is expected. I also expect NV to launch an RTX 6000 Blackwell Generation with 64 gigs too.
I know they announced a 96GB version of Blackwell for workstation cards, so a 64GB model is probably coming as well.

 
I know they announced a 96GB version of Blackwell for workstation cards so a 64GB model probably coming aswell.


Ooh, the 3 GB GDDR7 chips are ready. I guess the GDDR6 ones won't release after all, or maybe they'll see some limited use for AMD, which is still using the old standard in its cards?
 
David McAfee confirmed no 32GB 9070 XT. Just saw it.
 
Ooo, 3 GB G7 chips are ready. The G6 ones I guess will not release after all, or maybe might see some limited use for AMD that's still using the old standard in their cards?
Other than going HBM way back, AMD generally doesn't go with expensive top-end memory, so they will probably stay with current memory. We'll have to wait and see what they do with the UDNA generation.
 
[...]
For me to consider this RDNA4 32GB GPU (in no particular order):
  • DLSS 2-like upscaling quality improvement
  • Fix HDMI 2.1 48Gbit/s (FRL), aka HDMI 2.1a, on Linux
  • Back to good power scaling like in RDNA2
  • Low idle power consumption; at worst, a linear increase with the amount of VRAM compared to the 16GB model
  • Just like the 5090, the 9070 (XT) 32GB must also be a consumer GPU, so that the price increase stays minimal
So, AMD, is it 48GB VRAM consumer GPUs for the UDNA arch after RDNA4, then? That would allow fully offloading `Llama-3.3-70B-Instruct-Q4_K_M.gguf` (42.5GB) (by then we'll have a different, more capable 70B LLM, of course), or allow for much higher context.
[...]
I must add that when AMD releases a 48GB VRAM consumer GPU (the release may be accelerated by AI / LLM self-hosting being a thing now, but it's 2-3 years / next gen at the earliest), I'd like them to use GDDR7 by that point, otherwise the speeds may be too slow (the ~30% speed increase from GDDR7 is worth it, and by then GDDR7 should be cheaper too).

Oh, please don't tell me, I'm painfully aware of how slow things can get when you run LLMs outside of a GPU :D
May I ask you to post here?
I thought so. I guess I don't want to create a new topic just to post my benchmark results of RAM vs VRAM offloading speeds :)
I'm going to post there soon.

This is not what is written, in fact it clearly says no 9070XT 32Gb which does not state not 9070 XTX 32Gb or 9075XT 32Gb or any other naming. It is just corpo speech.....
Makes sense. The same name would only confuse end consumers (though hiding Zen 2 under Zen 3 CPU names already does that, to name just one example). Naming dilemma: same performance but different VRAM amounts, especially 16GB vs 32GB, may deserve not just an added "AI" in the name (they already add "AI" to product names) but maybe a (slightly) different name altogether. (Though NV has a 4060 Ti 8GB and a 4060 Ti 16GB; let's see if they repeat that end-consumer confusion for the GeForce 50 series.)

I'm not sure they will do 48GB on consumer just yet as that will give the W7800 and W7900 workstation gpu's some competition but we shall see.

[...]
Indeed, but it would be interesting to know what percentage of users buy workstation GPUs solely for the VRAM amount vs actually needing workstation features. I don't expect an RDNA4 48GB consumer GPU ever, but with AI / LLM self-hosting being a thing now (it has only just started, and many more people may demand cheap, high-VRAM consumer GPUs), maybe next generation / in 2-3 years.
 
Will be interesting to see if a 9070Xt can even do that at acceptable framerates.

In any case, looks like the rumor was false.
My 7900 XT can, depending on the game, at 60fps; the 9070 XT is stronger, so it will as well.

Also, a bit of future-proofing in regards to VRAM doesn't hurt anyone.
 