
NVIDIA's GeForce RTX 5080 SUPER Gains 24 GB GDDR7, Keeps 10,752 CUDA Cores

Remember, the 4080 IMO did not sell well. Then the 4080 was changed to the 4080 Super and its price was changed to equal the MSRP of the 7900 XTX.
Same deal this time, but the price will have to be seriously cut because the RX 9070 XT is good enough, and because nobody is buying the 5080, it would have to sell at retail for more than $200 off the 5080's MSRP… not going to happen.
Funny how nobody knew about tariffs when the 5080 was released, but NVIDIA put them on anyway; the money just goes to NVIDIA, not the government… lol.
 
NVIDIA and retailers profit the most... Retailers are having a blast right now
 
Lol, that sounds like bloatware. I thought those kinds of games were well optimized.
If the game detects excess, unused VRAM, it will allocate extra assets into that VRAM. This can be implemented in the game engine to improve streaming performance and reduce pop-in.

Works both ways: if there's not enough VRAM, the game engine can omit assets that need VRAM (mainly textures; look up Warhammer 40K Space Marine 2...), but most games will swap to system RAM instead.

It doesn't need 18 GB, but if that much is available, it will use it to reduce texture pop-in and load times, and keep textures cached for future use in other areas, making the experience smoother.
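To illustrate that behaviour, here's a minimal sketch of an opportunistic texture cache: the engine keeps extra assets resident only while spare VRAM exists, otherwise it streams them on demand. This is not taken from any real engine; all names, sizes, and the 2 GB headroom are made-up assumptions.

```python
# Minimal sketch (hypothetical engine): spare VRAM used as an opportunistic texture cache.
RESERVED_GB = 2.0  # headroom the engine never touches (OS, other apps) - assumed value

def plan_residency(textures, total_vram_gb, required_gb):
    """textures: list of (name, size_gb, priority), highest priority first.
    required_gb: VRAM the current scene strictly needs."""
    budget = total_vram_gb - RESERVED_GB - required_gb
    cached = []
    for name, size_gb, _prio in textures:
        if size_gb <= budget:          # spare VRAM available -> keep the asset resident
            cached.append(name)
            budget -= size_gb
        # else: leave it in system RAM / on disk and stream it on demand (pop-in risk)
    return cached

# The same scene caches far more on a 24 GB card than on a 12 GB card,
# even though the scene itself "needs" much less.
scene_need = 7.5
extras = [("city_block_hi", 3.0, 1), ("npc_pack", 2.5, 2), ("distant_terrain", 4.0, 3)]
print(plan_residency(extras, 24.0, scene_need))  # caches everything it can
print(plan_residency(extras, 12.0, scene_need))  # caches little or nothing
```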
 
Result: A 2025 survey found 68% of GPU buyers regretted purchases within 6 months, citing mismatched expectations vs. marketing hype
  • Declining performance gains
  • Generational stagnation
  • Fake innovation (e.g., "fake pixels," rebranded tech)
  • Artifact-ridden interpolated frames (overhyped)
  • Noisy, blurry, performance-crushing "ray tracing"
  • Diminishing returns on overclocking
  • Thermal and power draw challenges
  • Memory bandwidth bottlenecks
  • Architectural inefficiencies
  • Thermal throttling under real-world conditions
  • Component lottery (inconsistent silicon/memory quality within same SKU)
  • Exponential stratification of product lines
  • Two tiers of halo products as misleading reference points
  • Inflated prices and exorbitant pricing/scalping
  • Premium pricing for marginal gains
  • "Early adopter tax" mentality
  • Regional price discrepancies and availability issues
  • "Limited edition" hardware to drive FOMO sales
  • Mid-generation refreshes disguised as EOLs
  • Quiet removal of flaws in revised silent SKUs
  • Hallucinated benchmarks (unrealistic metrics)
  • Post-truth reviews (vendor-captured influencers)
  • Misleading marketing jargon (e.g., "AI-enhanced shadows")
  • Artificial benchmarks for press demos only
  • Hyperbolic naming schemes (e.g., "Super Ultra Mega Edition")
  • Unfulfilled performance promises
  • Beta-quality features launched as "next-gen" tech
  • Bloated control panels and software suites
  • Subscription-based software add-ons for hardware
  • Gatekeeping features behind paywalls or premium models
  • Feature gating via software/firmware
  • Day-one firmware updates to "unlock" promised specs
  • Proprietary connectors/coolers to kill modding/upgrading
  • Overreliance on driver-level hacks and fixes
  • Driver instability and inconsistent support
  • Delayed open-source driver releases
  • Minimal real-world gains in flagship products
  • Engineered scarcity
  • Planned obsolescence & redundancy
  • Disposable design (ignoring repairability/durability)
  • E-waste and greenwashing of unsustainable practices
  • Artificially constrained hardware memory
  • Upselling practices and buyer’s remorse
  • "Paper launch" frustrations (announced but unavailable)
  • Firmware/driver lock-ins
  • Pre-installed bloatware and spyware-like telemetry
  • Cross-generation incompatibility as a sales strategy
  • Regressive design choices (removing useful features)
  • Corporate greed over customer needs
  • Opaque supply chains (ethics, sourcing)
  • Vendor-captured "community feedback" loops
  • Faked ecosystem lock-in tactics
  • GPU mining-era legacy distorting pricing norms
  • Ignoring niche user needs for mass-market appeal
 
Okay, I think most people are probably not gonna like this, but...
If they pull this off and keep it at the same price as the 5080, I think it would be alright.
Chances are it would cost $1,200+ though, so oh well.
 
Yeah it is! 16GB cards @ high resolution are outdated imho. I didn't know D4 demanded that much @ 1440p, but it sure feels good to know you're solid for a few more years.
It just allocates as much as it can. It runs perfectly on 12 GB cards at 1440p max settings.
 
You may be right @ 1440p, but at 4K with max settings it was anything but perfect. My apologies, I don't consider 1440p a high resolution; in my humble opinion, 1440p should be entry level. It's the perfect resolution for everything in today's games.


*Above statement is highly controversial - in 2024, I made that statement and folks all over the globe came for me.

Fellas and ladies! Remember that there is the RX 9070 XT <3.
A wonderful card indeed for 1440p and below. Unless you want the challenge of running limited settings @ 4K.
 
Unfortunately the 4090 is memory-bandwidth starved... It has ~52.4% more CUDA cores than the 5080 but is only ~20% faster! Hence why the 5090 got a 512-bit bus, yet it is still somewhat limited.

Imo the 4090 & 5090 are both hugely held back by L2 cache: the 4090 only has 72 MB (instead of 96 MB for a full AD102) and the 5090 only has 96 MB (instead of 128 MB for a full GB202).

Nvidia are charging a premium for those GPUs but don't even bother giving them the full L2 cache! It's very disappointing, honestly.
The 5090 ends up being almost 80% faster when it comes to LLMs due to the extra memory bandwidth provided by that bigger bus + GDDR7 memory.
Blackwell was really designed as a compute GPU.
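For anyone who wants to check the ratios quoted above, here's a quick back-of-the-envelope calculation. The core counts, bus widths, and per-pin data rates are assumptions taken from NVIDIA's published spec sheets; the bandwidth ratio only tells you about bandwidth-bound workloads like LLM token generation, not compute-bound ones.

```python
# Sanity check of the quoted ratios, using spec-sheet numbers (assumed, not measured).
def bandwidth_gbs(bus_bits, gbps_per_pin):
    # GB/s = (bus width / 8 bits per byte) * per-pin data rate in Gbps
    return bus_bits / 8 * gbps_per_pin

cards = {
    # name: (CUDA cores, bus width in bits, per-pin data rate in Gbps)
    "RTX 4090": (16384, 384, 21.0),   # GDDR6X
    "RTX 5080": (10752, 256, 30.0),   # GDDR7
    "RTX 5090": (21760, 512, 28.0),   # GDDR7
}

cores_4090, cores_5080 = cards["RTX 4090"][0], cards["RTX 5080"][0]
print(f"4090 vs 5080 CUDA cores: +{cores_4090 / cores_5080 - 1:.1%}")          # ~ +52.4%

bw = {n: bandwidth_gbs(b, r) for n, (_, b, r) in cards.items()}
print({n: f"{v:.0f} GB/s" for n, v in bw.items()})                              # 1008 / 960 / 1792
print(f"5090 vs 5080 bandwidth: +{bw['RTX 5090'] / bw['RTX 5080'] - 1:.1%}")    # ~ +87%
```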

Unfortunately AMD seem to lack the expertise and knowledge of how to use GDDR7 and its 3 GB modules, so 16 GB is their limit.
I don't think that's the case; rather, they did not want to spend more to increase their margins, nor face availability issues like Nvidia has.
They could also easily do a clamshell configuration with those memory modules, and in fact have done so with their Pro 9700 offering and its 32 GB.
 
AI LLM self-hosting / inferencing:
GDDR7 makes it possible for a 5080 (Super) to have almost the same VRAM bandwidth as the 4090 while being cheaper, and more dies can be produced from one wafer because of the much smaller chip size:
4090: 384-bit, 608.5 mm², 1008 GB/s
5080: 256-bit, 378 mm², 960 GB/s
Because the 5080 (the whole 50 series) uses the same node, and so has roughly the same power efficiency, and the architecture hasn't changed much either, it's slower at prompt processing than the 4090; but for me this is fine as long as the amount of input text (aka context size) isn't huge.
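On the "more dies per wafer" point, a rough estimate can be made with the standard dies-per-wafer approximation for a 300 mm wafer. This ignores yield, scribe lines, and edge exclusion, so treat the numbers as ballpark only:

```python
import math

# Standard rough approximation: dies ~ pi*(d/2)^2 / A  -  pi*d / sqrt(2*A)
# (ignores yield and scribe lines; d = wafer diameter, A = die area)
def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    r = wafer_diameter_mm / 2
    return (math.pi * r**2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

ad102 = dies_per_wafer(608.5)   # 4090 die area from the post
gb203 = dies_per_wafer(378.0)   # 5080 die area from the post
print(f"AD102: ~{ad102:.0f} dies/wafer, GB203: ~{gb203:.0f} dies/wafer "
      f"(~{gb203 / ad102 - 1:.0%} more candidate dies per wafer)")
```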

That being said, unfortunately 24 GB of VRAM is not enough to fully fit the SOTA model Qwen3-32B-GGUF at 6-bit (26.9 GB) or 8-bit (34.8 GB), let alone with any decent/full context. Offloading to system RAM decreases the tokens/words-per-second speed massively.
Qwen3 are thinking models (thinking can be disabled, but then the performance decreases a lot) and they generate many thinking tokens, which require a bigger context (= more VRAM needed), so it's very important to fit the LLM fully into VRAM for maximum speed, without the system-RAM bottleneck. With Qwen3-14B-GGUF, I observed that a minimum of approximately 10,000-11,000 context is required even for rather simple 1-3 sentence queries because of the thinking tokens, but the performance is impressive.

A 5090 (Ti) (Super) with 48 GB of VRAM, using 3 GB GDDR7 modules instead of the current 2 GB ones, would fit it perfectly (32 GB VRAM / 2 GB × 3 GB = 48 GB).
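As a rough way to sanity-check whether a given card fits a model plus its thinking-heavy context, here's a small sketch. Only the weight sizes (26.9 GB / 34.8 GB) and the ~11,000-token context come from the posts above; the KV-cache cost per 1k tokens and the runtime overhead are made-up placeholders that vary by model, quantization, and backend.

```python
# Rough VRAM-fit estimate. Weight sizes and context are from the post;
# kv_gb_per_1k and overhead_gb are assumed placeholders (model/backend dependent).
def vram_needed(weights_gb, ctx_tokens, kv_gb_per_1k=0.12, overhead_gb=1.5):
    return weights_gb + ctx_tokens / 1000 * kv_gb_per_1k + overhead_gb

WEIGHTS = {"Qwen3-32B q6": 26.9, "Qwen3-32B q8": 34.8}
CTX = 11000  # roughly the minimum context observed for thinking-mode queries

for name, w in WEIGHTS.items():
    need = vram_needed(w, CTX)
    for vram in (24, 32, 48):
        verdict = "fits fully in VRAM" if need <= vram else "spills to system RAM"
        print(f"{name}: needs ~{need:.1f} GB -> {vram} GB card: {verdict}")

# The 48 GB figure itself: same 16-module layout, 3 GB modules instead of 2 GB ones.
print(32 // 2 * 3, "GB")  # 48
```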
 