Thursday, May 18th 2023

NVIDIA Explains GeForce RTX 40 Series VRAM Functionality

NVIDIA receives a lot of questions about graphics memory, also known as the frame buffer, video memory, or "VRAM". With the unveiling of our new GeForce RTX 4060 Family of graphics cards, we wanted to share some insights so gamers can make the best buying decisions for their gaming needs.

What Is VRAM?
VRAM is high-speed memory located on your graphics card.

It's one component of a larger memory subsystem that helps make sure your GPU has access to the data it needs to smoothly process and display images. In this article, we'll describe memory subsystem innovations in our latest generation Ada Lovelace GPU architecture, as well as how the speed and size of GPU cache and VRAM impacts performance and the gameplay experience.
GeForce RTX 40 Series Graphics Cards Memory Subsystem: Improving Performance & Efficiency
Modern games are graphical showcases, and their install sizes can now exceed 100 GB. Accessing this massive amount of data happens at different speeds, determined by the specifications of the GPU, and to some extent your system's other components. On GeForce RTX 40 Series graphics cards, new innovations accelerate the process for smooth gaming and faster frame rates, helping you avoid texture stream-in or other hiccups.

The Importance Of Cache
GPUs include high-speed memory caches located close to the GPU's processing cores; these caches store data that is likely to be needed. If the GPU can recall the data from the caches, rather than requesting it from the VRAM (further away) or system RAM (even further away), the data will be accessed and processed faster, increasing performance and gameplay fluidity, and reducing power consumption.

GeForce GPUs feature a Level 1 (L1) cache (the closest and fastest cache) in each Streaming Multiprocessor (SM), up to twelve of which can be found in each GeForce RTX 40 Series Graphics Processing Cluster (GPC). This is followed by a larger, shared Level 2 (L2) cache that can still be accessed quickly, with minimal latency.

Each successive cache level incurs a greater latency hit, with the tradeoff being greater capacity. When designing our GeForce RTX 40 Series GPUs, we found a single, large L2 cache to be faster and more efficient than the alternatives, such as a small L2 cache paired with a large but slower-to-access L3 cache.
Prior-generation GeForce GPUs had much smaller L2 caches, resulting in lower performance and efficiency compared to today's GeForce RTX 40 Series GPUs.
During use, the GPU first searches for data in the L1 data cache within the SM, and if the data is found in L1 there's no need to access the L2 data cache. If data is not found in L1, it's called a "cache miss", and the search continues into the L2 cache. If data is found in L2, that's called an L2 "cache hit" (see the "H" indicators in the above diagram), and data is provided to the L1 and then to the processing cores.

If data is not found in the L2 cache, an L2 "cache miss", the GPU now tries to obtain the data from the VRAM. You can see a number of L2 cache misses in the above diagram depicting our prior-architecture memory subsystem, causing a number of VRAM accesses.

If the data is missing from the VRAM, the GPU requests it from your system's memory. If the data is not in system memory, it can typically be loaded into system memory from a storage device like an SSD or hard drive. The data is then copied into VRAM, L2, L1, and ultimately fed to the processing cores. Note that different hardware- and software-based strategies exist to keep the most useful and most reused data present in the caches.
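To make that lookup order concrete, the minimal sketch below models a request walking the hierarchy just described. It is purely illustrative: the level order matches the text, but the latency figures are placeholder values, not Ada specifications.

```python
# Illustrative walk of the memory hierarchy described above: L1 -> L2 -> VRAM -> system RAM.
# Latency figures are made-up relative units, not Ada hardware specifications.

LEVELS = [
    ("L1", 1),        # per-SM cache: closest and fastest
    ("L2", 10),       # large shared L2 cache
    ("VRAM", 100),    # on-board graphics memory
    ("SYSRAM", 400),  # system memory, reached over PCIe
]

def fetch(address, resident):
    """Search each level in order; return the level that held the data and the
    accumulated latency. On a full miss, the data would be streamed from storage
    and copied into every level on the way back to the processing cores."""
    latency = 0
    for name, cost in LEVELS:
        latency += cost
        if address in resident.get(name, set()):
            return name, latency                  # hit: stop searching here
        # miss: continue to the next, larger and slower, level
    for name, _ in LEVELS:
        resident.setdefault(name, set()).add(address)  # fill each level on the way in
    return "STORAGE", latency + 10_000

resident = {"L2": {0xBEEF}}
print(fetch(0xBEEF, resident))  # ('L2', 11)       -- missed L1, hit the L2 cache
print(fetch(0xF00D, resident))  # ('STORAGE', 10511) -- missed everywhere; now cached
```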

Each additional data read or write operation through the memory hierarchy slows performance and uses more power, so by increasing our cache hit rate we increase frame rates and efficiency.
Compared to prior generation GPUs with a 128-bit memory interface, the memory subsystem of the new NVIDIA Ada Lovelace architecture increases the size of the L2 cache by 16X, greatly increasing the cache hit rate. In the examples above, representing 128-bit GPUs from Ada and prior generation architectures, the hit rate is much higher with Ada. In addition, the L2 cache bandwidth in Ada GPUs has been significantly increased versus prior GPUs. This allows more data to be transferred between the cores and the L2 cache as quickly as possible.

As shown in the diagram below, NVIDIA engineers tested the RTX 4060 Ti with its 32 MB L2 cache against a special test version of the RTX 4060 Ti using only a 2 MB L2, representing the L2 cache size of previous-generation 128-bit GPUs (where 512 KB of L2 cache was tied to each 32-bit memory controller).

In testing with a variety of games and synthetic benchmarks, the 32 MB L2 cache reduced memory bus traffic by just over 50% on average compared to the 2 MB L2 cache. See the reduced VRAM accesses in the Ada Memory Subsystem diagram above.

This 50% traffic reduction allows the GPU to use its memory bandwidth 2X more efficiently. As a result, in this scenario, isolating for memory performance, an Ada GPU with 288 GB/sec of peak memory bandwidth would perform similarly to an Ampere GPU with 554 GB/sec of peak memory bandwidth. Across an array of games and synthetic tests, the greatly increased hit rates improve frame rates by up to 34%.
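As a rough back-of-the-envelope check of that comparison (a simplification assuming performance is limited purely by VRAM traffic, not NVIDIA's published methodology):

```python
# Simplified check of the "2X more efficient" figure quoted above. Assumes performance
# is limited purely by VRAM traffic and ignores latency effects.

ada_peak_bw  = 288.0  # GB/s, peak memory bandwidth cited for the 128-bit Ada example
traffic_kept = 0.5    # ~50% of memory-bus traffic remains once the 32 MB L2 absorbs the rest

effective_bw = ada_peak_bw / traffic_kept
print(f"Effective bandwidth: ~{effective_bw:.0f} GB/s")  # ~576 GB/s, in the same ballpark
                                                         # as the 554 GB/s Ampere comparison
```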
Memory Bus Width Is One Aspect Of A Memory Subsystem
Historically, memory bus width has been used as an important metric for determining the speed and performance class of a new GPU. However, the bus width by itself is not a sufficient indicator of memory subsystem performance. Instead, it's helpful to understand the broader memory subsystem design and its overall impact on gaming performance.

Due to the advances in the Ada architecture, including new RT and Tensor Cores, higher clock speeds, the new OFA Engine, and Ada's DLSS 3 capabilities, the GeForce RTX 4060 Ti is faster than the previous-generation, 256-bit GeForce RTX 3060 Ti and RTX 2060 SUPER graphics cards, all while using less power.
Altogether, the tech specs deliver a great 60-class GPU with high performance for 1080p gamers, who account for the majority of Steam users.
The Amount of VRAM Is Dependent On GPU Architecture
Gamers often wonder why a graphics card has a certain amount of VRAM. Current-generation GDDR6X and GDDR6 memory is supplied in densities of 8 Gb (1 GB of data) and 16 Gb (2 GB of data) per chip. Each chip uses two separate 16-bit channels to connect to a single 32-bit Ada memory controller. So a 128-bit GPU can support 4 memory chips, and a 384-bit GPU can support 12 chips (calculated as bus width divided by 32). Higher-capacity chips cost more to make, so a balance is required to optimize prices.

On our new 128-bit memory bus GeForce RTX 4060 Ti GPUs, the 8 GB model uses four 16Gb GDDR6 memory chips, and the 16 GB model uses eight 16Gb chips. Mixing densities isn't possible, preventing the creation of a 12 GB model, for example. That's also why the GeForce RTX 4060 Ti has an option with more memory (16 GB) than the GeForce RTX 4070 Ti and 4070, which have 192-bit memory interfaces and therefore 12 GB of VRAM.

Our 60-class GPUs have been carefully crafted to deliver the optimum combination of performance, price, and power efficiency, which is why we chose a 128-bit memory interface. In short, at a given bus width, a higher-capacity model must double the memory amount; intermediate capacities aren't possible.
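The capacity rules above reduce to a short calculation: chip count equals bus width divided by 32, and total VRAM equals chip count times per-chip density, doubled when two chips share a controller, as on the 16 GB RTX 4060 Ti. A minimal sketch using the 8 Gb and 16 Gb densities mentioned earlier:

```python
# Possible VRAM capacities implied by the rules above:
#   controllers = bus_width / 32, with one or two chips per 32-bit controller,
#   per-chip density of 8 Gb (1 GB) or 16 Gb (2 GB), and no mixing of densities.

def vram_options(bus_width_bits):
    controllers = bus_width_bits // 32
    options = []
    for density_gb in (1, 2):            # 8 Gb and 16 Gb chips
        for chips_per_ctrl in (1, 2):    # one chip per controller vs. two (doubled capacity)
            chips = controllers * chips_per_ctrl
            options.append((chips, density_gb, chips * density_gb))
    return options

for bus in (128, 192, 256, 384):
    configs = ", ".join(f"{c} x {d} GB = {t} GB" for c, d, t in vram_options(bus))
    print(f"{bus}-bit: {configs}")

# 128-bit: 4 x 1 GB = 4 GB, 8 x 1 GB = 8 GB, 4 x 2 GB = 8 GB, 8 x 2 GB = 16 GB
# 192-bit: 6 x 1 GB = 6 GB, 12 x 1 GB = 12 GB, 6 x 2 GB = 12 GB, 12 x 2 GB = 24 GB
```

Running it reproduces the 8 GB and 16 GB options of the 128-bit RTX 4060 Ti and the 12 GB of the 192-bit RTX 4070 and 4070 Ti.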

Do On Screen Display (OSD) Tools Report VRAM Usage Accurately?
Gamers often cite the "VRAM usage" metric in On Screen Display performance measurement tools. But this number isn't entirely accurate, as all games and game engines work differently. In the majority of cases, a game will allocate VRAM for itself, saying to your system, 'I want it in case I need it'. But just because it's holding the VRAM doesn't mean it actually needs all of it. In fact, games will often request more memory if it's available.

Due to the way memory works, it's impossible to know precisely what's being actively used unless you're the game's developer with access to development tools. Some games offer a guide in the options menu, but even that isn't always accurate. The amount of VRAM that is actually needed will vary in real time depending on the scene and what the player is seeing.
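For reference, the kind of number OSD tools display can be read through NVIDIA's NVML interface; the minimal sketch below uses the pynvml Python bindings (installed separately, for example via the nvidia-ml-py package). As described above, the "used" figure is memory allocated on the device, not the amount a game actively needs at any given moment.

```python
# Queries per-device memory figures through NVML, one interface monitoring tools can use.
# The "used" value is memory allocated on the GPU, which is not the same as the amount
# of VRAM a running game actively needs from frame to frame.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU in the system
info = pynvml.nvmlDeviceGetMemoryInfo(handle)

print(f"Total VRAM:     {info.total / 1024**2:.0f} MiB")
print(f"Allocated VRAM: {info.used / 1024**2:.0f} MiB")
print(f"Free VRAM:      {info.free / 1024**2:.0f} MiB")

pynvml.nvmlShutdown()
```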

Furthermore, the behavior of games can vary when VRAM is genuinely used to its max. In some, memory is purged, causing a noticeable performance hitch while the current scene is reloaded into memory. In others, only select data will be loaded and unloaded, with no visible impact. And in some cases, new assets may load in more slowly as they're now being brought in from system RAM.

For gamers, playing is the only way to truly ascertain a game's behavior. In addition, gamers can look at "1% low" framerate measurements, which can help analyze the actual gaming experience. The 1% Low metric - found in the performance overlay and logs of the free NVIDIA FrameView app, as well as other popular measurement tools - measures the average of the slowest 1% of frames over a certain time period.
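As an illustration of that definition, the sketch below computes an average FPS and a 1% low from a hypothetical frame-time log; FrameView and other tools may differ in the exact details of their calculation.

```python
# Computes average FPS and a "1% low" figure from a list of frame times in milliseconds,
# following the description above: the average of the slowest 1% of frames in the sample.

def average_fps(frame_times_ms):
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)

def one_percent_low(frame_times_ms):
    slowest = sorted(frame_times_ms, reverse=True)   # longest frame times first
    count = max(1, len(slowest) // 100)              # the slowest 1% of frames
    worst_avg_ms = sum(slowest[:count]) / count
    return 1000.0 / worst_avg_ms                     # convert back to frames per second

# Hypothetical log: mostly ~8 ms frames (~125 fps) with a few 40 ms hitches mixed in.
frame_times = [8.0] * 990 + [40.0] * 10
print(f"Average FPS: {average_fps(frame_times):.1f}")      # ~120 fps
print(f"1% low FPS:  {one_percent_low(frame_times):.1f}")  # 25 fps -- the hitches dominate here
```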

Automate Setting Selection With GeForce Experience & Download The Latest Patches
Recently, some new games have released patches to better manage memory usage without hampering visual quality. Make sure to get the latest patches for new launches, as they commonly fix bugs and optimize performance shortly after launch.

Additionally, GeForce Experience supports most new games, offering optimized settings for each supported GeForce GPU and VRAM configuration, giving gamers the best possible experience by balancing performance and image quality. If you're unfamiliar with game option lingo and just want to enjoy your games from the second you load them, GeForce Experience can automatically tune game settings for a great experience each time.

NVIDIA Technologies Can Help Developers Reduce VRAM Usage
Games are richer and more detailed than ever before, necessitating those 100 GB+ installs. To help developers optimize memory usage, NVIDIA freely provides several developer tools and SDKs for optimizing games across all GPUs, platforms, and memory configurations.

Some Applications Can Use More VRAM
Beyond gaming, GeForce RTX graphics cards are used around the world for 3D animation, video editing, motion graphics, photography, graphic design, architectural visualization, STEM, broadcasting, and AI. Some of the applications used in these industries may benefit from additional VRAM, for example when editing 4K or 8K timelines in Premiere, or crafting a massive architectural scene in D5 Render.

On the gaming side, high resolutions also generally require an increase in VRAM. Occasionally, a game may launch with an optional extra large texture pack and allocate more VRAM. And there are a handful of games which run best at the "High" preset on the 4060 Ti (8 GB), and maxed-out "Ultra" settings on the 4060 Ti (16 GB). In most games, both versions of the GeForce RTX 4060 Ti (8 GB and 16 GB) can play at max settings and will deliver the same performance.
The benefit of the PC platform is its openness, configurability and upgradability, which is why we're offering the two memory configurations for the GeForce RTX 4060 Ti; if you want that extra VRAM, it will be available in July.

A GPU For Every Gamer
Following the launch of the GeForce RTX 4060 Family, there'll be optimized graphics cards for each of the three major game resolutions. However you play, all GeForce RTX 40 Series GPUs will deliver a best-in-class experience, with leading power efficiency, supported by a massive range of game-enhancing technologies, including NVIDIA DLSS 3, NVIDIA Reflex, NVIDIA G-SYNC, NVIDIA Broadcast, and RTX Remix.
For the latest news about all the new games and apps that leverage the full capabilities of GeForce RTX graphics cards, stay tuned to GeForce.com.
Source: NVIDIA Blog

139 Comments on NVIDIA Explains GeForce RTX 40 Series VRAM Functionality

#76
Dr. Dro
yannus1: They always fake bash. I'll always remember when they said that Nvidia didn't send them a sample to censor them, while displaying a looping RTX advertisement. The same here: they say "oh no, it doesn't have enough VRAM" but always end with a conclusion like "but they have wonderful DLSS and RTX". This is a common technique of appearing to oppose someone when in reality you're trying to promote their interests.
...which, they do. NVIDIA offers wonderful, concise, well-supported features, and AMD often does not, or they are not good or popular enough to set the industry standard every time. There's no grand conspiracy here. In my opinion, Hardware Unboxed is a trustworthy source and they are generally unbiased, willing to point out strengths and weaknesses regardless of brand or product they are reviewing. Like they said on their 8 vs. 16 GB comparison video, AMD adding more VRAM to their cards isn't done out of kindness of their hearts, but because they had to pitch something to offer gamers.

It is true that their workstation segment is practically moribund (Radeon Pro is and has always been a bit of a mess, their support for most production applications is poor to non-existent especially if an app is designed with CUDA in mind - OpenCL sucks) and their high VRAM models offer 32 GB to those who need to work with extra large data sets, so giving an RX 6800/XT 16 GB isn't as big of deal to them as it is to Nvidia, who wants to ensure that their overpriced enterprise RTX A-series sell. This ensures that "hobbyist-level" creative professionals purchase at a minimum RTX 3090/3090 Ti or 4090 hardware, or supported professional models such as the RTX A4000 instead of a 3070/4070 and calling it a day.
#77
sLowEnd
Dr. Dro...which, they do. NVIDIA offers wonderful, concise, well-supported features, and AMD often does not, or they are not good or popular enough to set the industry standard every time. There's no grand conspiracy here. In my opinion, Hardware Unboxed is a trustworthy source and they are generally unbiased, willing to point out strengths and weaknesses regardless of brand or product they are reviewing. Like they said on their 8 vs. 16 GB comparison video, AMD adding more VRAM to their cards isn't done out of kindness of their hearts, but because they had to pitch something to offer gamers.

It is true that their workstation segment is practically moribund (Radeon Pro is and has always been a bit of a mess, their support for most production applications is poor to non-existent especially if an app is designed with CUDA in mind - OpenCL sucks) and their high VRAM models offer 32 GB to those who need to work with extra large data sets, so giving an RX 6800/XT 16 GB isn't as big of deal to them as it is to Nvidia, who wants to ensure that their overpriced enterprise RTX A-series sell. This ensures that "hobbyist-level" creative professionals purchase at a minimum RTX 3090/3090 Ti or 4090 hardware, or supported professional models such as the RTX A4000 instead of a 3070/4070 and calling it a day.
Nvidia's big success is in CUDA, but they've had some failed standards too. G-Sync (in its original form with the monitor modules) isn't nearly as prevalent as FreeSync. I haven't heard a peep about GPU-accelerated PhysX in years either.
#78
Dr. Dro
sLowEndNvidia's big success is in CUDA, but they've had some failed standards too. G-Sync (in its original form with the monitor modules) isn't nearly as prevalent as FreeSync. I haven't heard a peep about GPU-accelerated PhysX in years either.
The only reason FreeSync took off was cost. Hardware G-Sync is still technically the best, but the added cost and the fact monitors have been steadily improving and panels themselves handling ranges better, it just makes it a very unattractive proposition. This is true even today, see: Alienware AW3423DW (G-Sync Ultimate model) vs. AW3423DWF (same panel without the G-Sync Ultimate module)
#79
N3utro
The Amount of VRAM Is Dependent On GPU Architecture
Gamers often wonder why a graphics card has a certain amount of VRAM. Current-generation GDDR6X and GDDR6 memory is supplied in densities of 8 Gb (1 GB of data) and 16 Gb (2 GB of data) per chip. Each chip uses two separate 16-bit channels to connect to a single 32-bit Ada memory controller. So a 128-bit GPU can support 4 memory chips, and a 384-bit GPU can support 12 chips (calculated as bus width divided by 32). Higher-capacity chips cost more to make, so a balance is required to optimize prices.

On our new 128-bit memory bus GeForce RTX 4060 Ti GPUs, the 8 GB model uses four 16Gb GDDR6 memory chips, and the 16 GB model uses eight 16Gb chips. Mixing densities isn't possible, preventing the creation of a 12 GB model, for example. That's also why the GeForce RTX 4060 Ti has an option with more memory (16 GB) than the GeForce RTX 4070 Ti and 4070, which have 192-bit memory interfaces and therefore 12 GB of VRAM.
Is it me or are they contradicting themselves in this? They say 128bit can support 4 memory chips then say the 4060 ti uses 8 chips?

4060 ti = 128 bit => 128/32 = 4 chips => 8GB 4060 ti uses 4 x 2GB memory chips and the 16GB version uses 4 x 4GB memory chips.

4070 = 192 bit => 192/32 = 6 chips => 6 x 2GB memory chips. Which means they could launch a 6 x 4GB = 24GB 4070 if they wanted.
#80
Dr. Dro
N3utroIs it me or are they contradicting themselves in this?

4060 ti = 128 bit => 128/32 = 4 chips => 8GB 4060 ti uses 4 x 2GB memory chips and the 16GB version uses 4 x 4GB memory chips.

4070 = 192 bit => 192/32 = 6 chips => 6 x 2GB memory chips. Which means they could launch a 6 x 4GB = 24GB 4070 if they wanted.
No, they cannot, as there are no 32 Gbit GDDR6X modules available in the market. 24 GB would be achievable by using 12 16 Gbit chips in a clamshell configuration, and that's just too expensive for a gaming card of this price.
#81
sLowEnd
Dr. DroThe only reason FreeSync took off was cost. Hardware G-Sync is still technically the best, but the added cost and the fact monitors have been steadily improving and panels themselves handling ranges better, it just makes it a very unattractive proposition. This is true even today, see: Alienware AW3423DW (G-Sync Ultimate model) vs. AW3423DWF (same panel without the G-Sync Ultimate module)
Cost is a big consideration when looking to set widely adopted standards :laugh:
#82
N3utro
Dr. DroNo, they cannot, as there are no 32 Gbit GDDR6X modules available in the market. 24 GB would be achievable by using 12 16 Gbit chips in a clamshell configuration, and that's just too expensive for a gaming card of this price.
How can they reach 16 GB of vram on a 4060 ti with a max of 4 chips on a 128 bit bus with only 16Gb memory chips then? I dont get it.
#83
Dr. Dro
N3utroHow can they reach 16 GB of vram on a 4060 ti with a max of 4 chips on a 128 bit bus with only 16Gb memory chips then? I dont get it.
By using 8 chips in a clamshell configuration and raising cost accordingly - that is why the 16 GB version both consumes more energy (higher TDP) and costs $100 more.

For example, the RTX 3080 Ti and the RTX 3090 are both 384-bit cards, but the 3080 Ti has 12 chips installed and the 3090 has 24 (with two attached to each channel and installed on both sides of the PCB).
#84
N3utro
Dr. DroBy using 8 chips in a clamshell configuration and raising cost accordingly - that is why the 16 GB version both consumes more energy (higher TDP) and costs $100 more.

For example, the RTX 3080 Ti and the RTX 3090 are both 384-bit cards, but the 3080 Ti has 12 chips installed and the 3090 has 24 (with two attached to each channel and installed on both sides of the PCB).
I see thank you professor! :D

But by this logic, a "clamshelled" 4060 ti for $100 increase means "clamshelling" each original single chip cost around $25 right?

So a "clamshelled" 4070 would be a 6 x 25$ increase by this process, which would mean $750 for a 24GB 4070 vs $600 for a 12GB 4070 and $950 for a 24GB 4070 ti vs $800 for a 12GB 4070 ti.

Seeing that the 16GB 4080 is $1200 i dont see how a $800 4070 ti would be unreasonable.

In other words: if they did it for the 4060 ti, why not for the 4070 and 4070 ti? I would have probably bought a 24GB 4070 for $750 instead of my $600 12GB 4070 because over time the card will hold more value as games vram requirements will increase. Same reason why 6GB gtx 1060 is worth much more now than a 4GB one.
#85
HisDivineOrder
The biggest takeaway for me from this is that Nvidia is actually aware of and concerned about the constant talk about VRAM. I actually am shocked. I thought they were above it all. To go to the trouble of making a new 16GB variant of the 4060 Ti plus a whole web page about how it's not them, it's you, is way more than I expected as a reaction. Perhaps next gen they'll temporarily boost memory up to merely "acceptable" levels, unlike recently when they always do "intolerably low" for every market segment.

The best part was when they said how the 4070 Ti had to have 12GB, but the 4060 Ti gets 8 or 16GB. But- but- but Nvidia. 24GB exists, you know. They could have doubled memory at every level. They could have. They just didn't want to. Given the markup and the record profits they're reporting, they could have easily absorbed the cost.

But they didn't want to do that. Now that the jig is up, they're in full-on damage control. Jensen took a minor paycut on the lowest part of his compensation package and he is angry that people are angry because they aren't buying anything they shove out the door. That's why he's mad. He had them millions earmarked for his latest yacht. Him and Bobby Kotick are grumpy because they're having to delay their competition to be the first to get a Bezos-level superyacht.
#86
Dr. Dro
N3utroI see thank you professor! :D

But by this logic, a "clamshelled" 4060 ti for $100 increase means "clamshelling" each original single chip cost around $25 right?

So a "clamshelled" 4070 would be a 6 x 25$ increase by this process, which would mean $750 for a 24GB 4070 vs $600 for a 12GB 4070 and $950 for a 24GB 4070 ti vs $800 for a 12GB 4070 ti.

Seeing that the 16GB 4080 is $1200 i dont see how a $800 4070 ti would be unreasonable.

In other words: if they did it for the 4060 ti, why not for the 4070 and 4070 ti? I would have probably bought a 24GB 4070 for $750 instead of my $600 12GB 4070 because over time the card will hold more value as games vram requirements will increase. Same reason why 6GB gtx 1060 is worth much more now than a 4GB one.
It is hard to estimate the exact cost because Nvidia signs tailored supply contracts ahead of time, so the prices that they pay on each unit may be lower or higher than the average cost of each unit in the regular bulk market. But rest assured, it's less than $25 per chip, significantly less. $25 or so was rumored to be the cost of single GDDR6X chips when they were brand new 3 years ago, and that's why there was a very large difference in price going from the 3080 to the 3090.

The reason they don't add more memory to consumer-grade graphics cards is a business one, they don't want businesses to buy them and want to sell their enterprise products instead. Nvidia's MSRP price for the RTX 4090 is $1600 USD, but its equivalent professional card (RTX 6000 Ada Generation) with 48 GB costs a cool $6800.

There is also another concern that is an undesirable for gaming cards is that this memory isn't free in terms of power consumption. The original non-Ti RTX 3090 is the biggest example of that, I will link you this thread where I was discussing it a few weeks ago:

www.techpowerup.com/forums/threads/amd-radeon-rx-7600-early-sample-offers-rx-6750-xt-performance-at-175w-rumor.307792/page-4#post-5005655
#87
londiste
Chrispy_That sounds like of lot of PR/marketing copium and damage control. The article completely sidesteps the issue of the VRAM having to hold the texture assets. Sure, larger cache means shader functions have fewer cache hits, but that's like 2% of what VRAM is actually used for by games and game delelopers.
Half/most of that is explaining the relevance of cache and why the memory bus sizes have been going down. VRAM size problems are part of this due to available memory chip sizes.

Btw, this is not a Nvidia thing. AMD did the large cache thing with RDNA2 and reduced cache sizes in RDNA3 - looking at RDNA2, RDNA3 and Ada they are trying to hone in on the sweetspot of cache size and performance benefit. AMD will have the same choices in front of them and there will be some cool marketing to accompany it.
#88
fevgatos
Vayra86Well I have to give Nvidia some credit for their honesty. I mean this is like a coming out, even if they don't realize it, they are confirming more suspicion than they've removed. They do that especially in the lines where they say even a 4060ti will benefit from 16GB for higher IQ settings. They know they can't fight facts.

Imagine buying an 8GB card with all this information. You'd be pretty damn stupid. Especially if you know RT also demands added VRAM - the very feature Nvidia itself pushes.
I've been very vocal against the latest "8gb not enough" clickbait drama caused by youtubers, I have to admit I was talking about the now almost 3 year old 3070 and 3060ti. For those cards, 8gb was fine. Yes you have to drop textures in a couple of games to high - but again - we are talking about 3 year old midrange cards, that's normal.

But a brand new 4060 having 8gb is just...uhm....let's just say...suboptimal. It might be fine today even for ultra textures due to the extra cache (not sure - waiting for reviews), but man, 3 years down the road they will age much much worse than the ampere 8gb cards did. The 4060 and the 4060ti should have a single model with 12gb and call it a day. I don't know wth is the leatherman doing, it really doesn't make sense to release 8gb, and I can't explain that by just "greediness".
#89
Broken Processor
Or you know just give more vram instead and you wouldn't have to spend money on marketing.
Nvidia decided that they and their shareholders didn't want the COVID/mining gravy train to end.

Let's not mince words: this is the most rip-off series launch Nvidia has ever done; they moved cards down a silicon tier again and doubled the prices.
#90
fevgatos
Dr. DroYeah, stretching out another couple hundred bucks for the 7900 XT if possible seems generally sensible to me. I'm personally not sold on DLSS 3, I would maybe be more lenient with it if Nvidia didn't willingly withhold it from us 30 series owners, but I already tend to keep traditional DLSS off whenever possible, so frame generation couldn't possibly sway me either way.



RT is of questionable value, but frame generation is going to make or break these lower-end cards. Nvidia is fully accounting its frame generation technology into the general performance uplift and they strongly encourage you to enable it regardless of impact on image quality. Regarding Ada's lowest segments (such as 4050 mobile), you are essentially expected to use DLSS3 FG to achieve playable frame rates. Sucks to be you if the game you want to play doesn't support it, mail your dev requesting it or just don't be poor I guess.
FG is great - not when you are GPU bound - but when you are CPU bound. Hogwarts is almost unplayable without it, no matter what CPU and GPU you have. FG really shines on that game.
#91
Dr. Dro
fevgatosFG is great - not when you are GPU bound - but when you are CPU bound. Hogwarts is almost unplayable without it, no matter what CPU and GPU you have. FG really shines on that game.
Shortcut to performance regardless... I'm giving it a hard pass :laugh:
#92
fevgatos
Dr. DroShortcut to performance regardless... I'm giving it a hard pass :laugh:
Well there is no other option, you either don't play the game or activate FG. I tried overclocking my 12900k to 5.6ghz all core at 1.64 volts, it was melting at 114c but hogwarts was not budging, certain areas dropped me below 60. That was on a fully tuned 12900k with manual ram. It's just one of those games...
#93
gffermari
Dr. DroThe reason they don't add more memory to consumer-grade graphics cards is a business one, they don't want businesses to buy them and want to sell their enterprise products instead. Nvidia's MSRP price for the RTX 4090 is $1600 USD, but its equivalent professional card (RTX 6000 Ada Generation) with 48 GB costs a cool $6800.
Let's write it again.
nVidia CANNOT put large amount of vram on their consumer cards because the latter have millions of uses far from gaming.
That's why putting some more vram is a different model, A series, Quadro and cost 3 to 6 times more.

nVidia is extremely careful where they put additional vram. For example 1080Ti or 2080Ti with 11GB of VRAM perform ages better today compared to identically performant 3070 etc. Yes but the consumer paid 1000$+ for the 2080Ti....and you remember the gold Titan RTX with 24 GB of VRAM and nearly 2.5 times the price of the 2080Ti.

Yes, they put 12GB in the 3060. The card is quite slow for the prosumer/professional, so the added vram does not make it attractive.
Yes, they put 16GB in the 4060Ti which seem to perform 10-15% faster than the 3060Ti, so the same issue again.

The VRAM is extremely valuable in nVidia cards, so they will do everything to not put enough, unless you pay for a bigger model.
It's worth more to them to develop a software trick that compensates for the need for loads of VRAM than to add more VRAM to their cards.
And that's what is coming next.
#94
londiste
NostrasYou jest right?
There's a difference between adding extra cache in an attempt to give your cards an edge versus adding extra cache in an effort to save some money by skimping on the VRAM/bus width.
When AMD does it, it is to give your cards an edge.
When Nvidia does it, it is to skimp on VRAM/bus width.
Got it.
Chrispy_xx60 class has always represented "the sweet spot" and for gamers, the sweet spot moved on from 1080p60 a long time ago.
IMO, the sweet spot has been 1440p high refresh with VRR for years now. You don't need to always get >144 fps but an average of ~90fps with 1% lows of over 60 is a good place to be.
I think you are wrong on this one. 1080p is still the most mainstream monitor resolution and 1440p high refresh is still a very very heavy use case.
Games are still getting heavier on the GPU - outside the VRAM thing - and the resolutions and refresh rates are not moving on as much.
For enthusiasts yes, 2160p@120 and even above has become a thing but it also basically requires cards at price points that were not a thing a few generations back.
#95
Vayra86
londisteHalf/most of that is explaining the relevance of cache and why the memory bus sizes have been going down. VRAM size problems are part of this due to available memory chip sizes.

Btw, this is not a Nvidia thing. AMD did the large cache thing with RDNA2 and reduced cache sizes in RDNA3 - looking at RDNA2, RDNA3 and Ada they are trying to hone in on the sweetspot of cache size and performance benefit. AMD will have the same choices in front of them and there will be some cool marketing to accompany it.
And yet they offer an RDNA3 line up that offers 20-24GB starting at the level of Nvidia's 12GB.

It matters a lot how these choices are timed... Nvidia has been dropping VRAM since Turing already. AMD hasnt honed in on anything just yet, but DO use larger cache. I dont entirely believe Nvidia needs to cut it down as it does 'for gaming' nor that AMD feels like it faces the same issues wrt profitability.

Keep in mind the consoles are a major factor where RDNA finds alignment. AMD isnt going to paint itself in a corner and we already see how they dont have those typical Nvidia struggles wrt game stability, esp when VRAM is involved.

Pricing then. I think the pricing (of RDNA2-3; not of Ada) is fine IF you get a piece of hardware that can happily run shit for >5 years. But if the expiry date has been pulled forward to 3 years like we see on Ampere midrange today... thats a big box of nope to me.
#96
Dr. Dro
fevgatosWell there is no other option, you either don't play the game or activate FG. I tried overclocking my 12900k to 5.6ghz all core at 1.64 volts, it was melting at 114c but hogwarts was not budging, certain areas dropped me below 60. That was on a fully tuned 12900k with manual ram. It's just one of those games...
In that case it's the game itself, mate. A 12900K is a monster of a CPU, it should be murdering every game out there for the next 5 years at a minimum. In those cases I just lower settings or better yet, don't play the game at all until it's either fixed or 75% off :laugh:
Vayra86And yet they offer an RDNA3 line up that offers 20-24GB starting at the level of Nvidia's 12GB.
Primarily because their workstation cards are confined to an extremely specific niche and the market share for them is very small. Nvidia owns the visualization and creative professional market. They don't stand much to lose by releasing Radeon cards with high VRAM, but even then, the gaming variants usually have only half of the VRAM of the Radeon Pro GPUs. Nvidia just usually shaves a little extra.
londisteI think you are wrong on this one. 1080p is still the most mainstream monitor resolution and 1440p high refresh is still a very very heavy use case.
Games are still getting heavier on the GPU - outside the VRAM thing - and the resolutions and refresh rates are not moving on as much.
For enthusiasts yes, 2160p@120 and even above has become a thing but it also basically requires cards at price points that were not a thing a few generations back.
I've been using 1080p 60 Hz myself. A few months ago I grabbed one of Samsung's quantum dot Frame TVs when my Sony X900F kicked it. Told you guys the story I believe, infested with ants. It looks fantastic and is quite comfortable to look at (despite being smaller than I'd like it to be), and ultimately I think this is what people value more on a monitor instead of raw Hz, resolution or whatever. Something that's enjoyable to look at.

I've been wanting to purchase a high-end LG OLED, and that is indeed going to be my next big tech purchase (I'm just skipping this generation of GPUs entirely) but honestly, no rush. As long as ultra settings are achievable, 4K 60 is fine, btw. No need for more.
#97
Wirko
Nvidia, and others too, should at least start experimenting with two-bits-per-cell RAM. Am I joking? No.
#98
Chrispy_
gffermariLet's write it again.
nVidia CANNOT put large amount of vram on their consumer cards because the latter have millions of uses far from gaming.
That's why putting some more vram is a different model, A series, Quadro and cost 3 to 6 times more.

nVidia is extremely careful where they put additional vram. For example 1080Ti or 2080Ti with 11GB of VRAM perform ages better today compared to identically performant 3070 etc. Yes but the consumer paid 1000$+ for the 2080Ti....and you remember the gold Titan RTX with 24 GB of VRAM and nearly 2.5 times the price of the 2080Ti.

Yes, they put 12GB in the 3060. The card is quite slow for the prosumer/professional, so the added vram does not make it attractive.
Yes, they put 16GB in the 4060Ti which seem to perform 10-15% faster than the 3060Ti, so the same issue again.

The VRAM is extremely valuable in nVidia cards, so they will do everything to not put enough, unless you pay for a bigger model.
It's worth developing a software trick to compensate the necessity of loads of vram than adding more vram in their cards.
And that's what is coming next.
As someone who buys Quadro cards for VRAM, the VRAM issue is present on Quadros too.

What Nvidia need to do is double the RAM across the whole product range, GeForce and Quadro. If that means developing dual-rank GDDR6 controllers, then that's what they need to do, but their stagnation in VRAM capacity for the last 5 years is hurting both enterprise and consumer markets alike.
#99
shadad
let me explain to Nvidia my spent functionality:
increase VRAM and lower RTX 40 series price or stay on shelf.
#100
Nostras
londisteWhen AMD does it, it is to give your cards an edge.
When Nvidia does it, it is to skimp on VRAM/bus width.
Got it.
What? If Nvidia had released a 4060 Ti with 12GB of VRAM, Nvidia could've used it to gain an "edge" over the competition/last-gen.
It's kind of the same thing as AMD releasing a 6700 with 6GB of VRAM and then saying "but we increased cache!!!"

But they did not.

AMD does some stupid shit but they have no blame here.
Maybe we can dogpile on them when we know more about pricing of the 7600.