Monday, January 26th 2015

NVIDIA Responds to GTX 970 Memory Allocation 'Bug' Controversy

The GeForce GTX 970 memory allocation bug discovery, made around last Friday, wrecked the weekends of some NVIDIA engineers, who composed a response to what they say is a non-issue. A bug was discovered in the way the GeForce GTX 970 allocates its 4 GB of video memory, giving some power users the impression that the GPU isn't addressing the last 500-700 MB of its memory. NVIDIA, in its response, explained that the GPU is fully capable of addressing its 4 GB, but does so in an unusual way. Without further ado, the statement.
The GeForce GTX 970 is equipped with 4GB of dedicated graphics memory. However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.

We understand there have been some questions about how the GTX 970 will perform when it accesses the 0.5GB memory segment. The best way to test that is to look at game performance. Compare a GTX 980 to a 970 on a game that uses less than 3.5GB. Then turn up the settings so the game needs more than 3.5GB and compare 980 and 970 performance again.

Here's an example of some performance data:
<div class="table-wrapper"><table class="tputbl hilight" cellspacing="0" cellpadding="3"><caption>GTX 970 vs. GTX 980 Memory-Intensive Performance Data </caption><tr><th scope="col">&nbsp;</th><th scope="col">GeForce <br /> GTX 980</th><th scope="col">GeForce <br /> GTX 970</th></tr><tr><th scope="row">Shadow of Mordor</th><td align="right"></td><td align="right"></td></tr><tr class="alt"><th scope="row">&lt;3.5GB setting = 2688x1512 Very High</th><td align="right">72 fps</td><td align="right">60 fps</td></tr><tr><th scope="row">&gt;3.5GB setting = 3456x1944</th><td align="right">55 fps (-24%)</td><td align="right">45 fps (-25%)</td></tr><tr class="alt"><th scope="row">Battlefield 4</th><td align="right"></td><td align="right"></td></tr><tr><th scope="row">&lt;3.5GB setting = 3840x2160 2xMSAA</th><td align="right">36 fps</td><td align="right">30 fps</td></tr><tr class="alt"><th scope="row">&gt;3.5GB setting = 3840x2160 135% res</th><td align="right">19 fps (-47%)</td><td align="right">15 fps (-50%)</td></tr><tr><th scope="row">Call of Duty: Advanced Warfare</th><td align="right"></td><td align="right"></td></tr><tr class="alt"><th scope="row">&lt;3.5GB setting = 3840x2160 FSMAA T2x, Supersampling off</th><td align="right">82 fps</td><td align="right">71 fps</td></tr><tr><th scope="row">&gt;3.5GB setting = 3840x2160 FSMAA T2x, Supersampling on</th><td align="right">48 fps (-41%)</td><td align="right">40 fps (-44%)</td></tr></table></div>
On Shadow of Mordor, performance drops about 24% on GTX 980 and 25% on GTX 970, a 1% difference. On Battlefield 4, the drop is 47% on GTX 980 and 50% on GTX 970, a 3% difference. On CoD: AW, the drop is 41% on GTX 980 and 44% on GTX 970, a 3% difference. As you can see, there is very little change in the performance of the GTX 970 relative to GTX 980 on these games when it is using the 0.5GB segment.
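The percentages quoted in the statement can be re-derived from the frame rates in the table. The snippet below is only a rough sketch: the `RESULTS` dictionary and the `percent_drop` and `summarize` helpers are names made up for this illustration, and the only inputs are the fps figures NVIDIA published above.

```python
# Frame rates (fps) copied from the table above, as
# (setting below 3.5GB, setting above 3.5GB). Names are illustrative only.
RESULTS = {
    "Shadow of Mordor": {"GTX 980": (72, 55), "GTX 970": (60, 45)},
    "Battlefield 4": {"GTX 980": (36, 19), "GTX 970": (30, 15)},
    "Call of Duty: Advanced Warfare": {"GTX 980": (82, 48), "GTX 970": (71, 40)},
}


def percent_drop(before, after):
    """Relative fps loss, in percent, when moving to the heavier setting."""
    return (before - after) / before * 100


def summarize(results):
    """Return (game, GTX 980 drop, GTX 970 drop) rounded to whole percent."""
    rows = []
    for game, cards in results.items():
        drop_980 = round(percent_drop(*cards["GTX 980"]))
        drop_970 = round(percent_drop(*cards["GTX 970"]))
        rows.append((game, drop_980, drop_970))
    return rows


if __name__ == "__main__":
    for game, d980, d970 in summarize(RESULTS):
        gap = d970 - d980  # gap between the two cards, in percentage points
        print(f"{game}: GTX 980 -{d980}%, GTX 970 -{d970}%, gap {gap} points")
```

Note that the "difference" figures in the statement are gaps in percentage points between the two cards' drops, not relative changes in frame rate.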
Source: The TechReport

92 Comments on NVIDIA Responds to GTX 970 Memory Allocation 'Bug' Controversy

#51
Steevo
www.dsogaming.com/news/nvidia-gtx-970-owners-report-unusual-vram-behavior-unable-to-efficiently-allocate-more-than-3-5gb/


Looks to be one of the original sources, and like many others have noted, when modding games such as Skyrim you run out of memory long before GPU power. Wonder how well GTA5 will do with the lack of more vmem for Nvidia users again?


It also describes many of the users' issues with stuttering as the required textures are pulled out of the slower memory: forums.evga.com/Games-stuttering-with-GTX-970-and-Vsync-m2222444.aspx and forums.geforce.com/default/topic/777475/geforce-900-series/gtx-970-frame-hitching-/


Current user fix? Run at 30 FPS, just like a console!!!!
#52
64K
How exactly is the GTX 970 a failure in real world performance in games?
#53
newtekie1
Semi-Retired Folder
GhostRyder said: Good vid.
I agree. And it gives a good idea of why they decided to do it this way from an engineering standpoint. That 0.5GB is still much faster than having to access system RAM. So it acts as a final buffer before having to start offloading to system RAM. It does in fact make the card faster in situations where you exceed 3.5GB of memory used.

The wrong ROP and L2 amounts do suck; that is something nVidia should address. (Give me a free game and I'll be happy. :))
#54
Steevo
64K said: How exactly is the GTX 970 a failure in real world performance in games?
For most stock games that use less than 3.5GB of vmem it's great. As soon as you step over that, it starts to stutter even though it has the GPU horsepower to run it. So the issue is all the people who bought a 4GB card to run modded Skyrim, and other games that use vmem to its full, who can't.

If you don't understand that it's OK, but there are some who mod games and buy hardware to support it.
#55
Sasqui
newtekie1 said: It does in fact make the card faster in situations where you exceed 3.5GB of memory used.
Faster than paging, yes. Sorry, I find the whole thing kind of amusing. Truth be told, if I could trade my 290x for a 270 I'd still consider it.
#56
Xzibit
Sasqui said: Faster than paging, yes. Sorry, I find the whole thing kind of amusing. Truth be told, if I could trade my 290x for a 270 I'd still consider it.
Still haven't gotten your msg yet



Heh. I've been meaning to replace an aging 480 with a 270/X, then decided to wait for the 960, which disappointed. Now I'll wait to see what the 370 has to offer. Don't want to spend too much on it because I'm still up in the air about either keeping the PC as an HTPC or giving it away.
#57
newtekie1
Semi-Retired Folder
Steevo said: For most stock games that use less than 3.5GB of vmem it's great. As soon as you step over that, it starts to stutter even though it has the GPU horsepower to run it. So the issue is all the people who bought a 4GB card to run modded Skyrim, and other games that use vmem to its full, who can't.

If you don't understand that it's OK, but there are some who mod games and buy hardware to support it.
I play modded Skyrim, it often goes over 3.5GB memory usage on my 970, and the stuttering that people say is so horrible just simply isn't. It isn't nearly as bad as when the GPU has to access system RAM, because that 0.5GB is still way faster than accessing system RAM.

And think about it, people playing these games that go over 3.5GB haven't been complaining about stuttering. The only reason we even noticed this problem was because some people noticed that some programs were saying they were only using 3.5GB and no more when they knew they should be using more. It wasn't because they were experiencing stuttering.
#58
Casecutter
In the PCPer video at 8:30 they say "It makes sense if you have to disable one of the L2s for binning purposes, which it was... all for binning purposes..." and the other guy then says "Yes, it was all for binning purposes."

I think what we need to know is whether that L2 "cut" is something done only because of the 3 SMs being disabled. Or is it that, more often than not, one L2 came out bonkers/defective and they could only make the chip go as fast as it does by disabling it?

If we knew the truth either way... If one of the L2s is "unutilized" due to the lower SM count, and they burn one off so that all 970s provide the same level of performance, then fine, and PR/marketing might have just goofed. But if the SM count has nothing to do with the L2 being defective, then I would think someone thought they could pull the wool over folks' eyes. Trying not to say they had to burn off the L2 is messing with the published specs.
#59
trenter
It seems they didn't just lie about the memory, 970 only has 56 ROPS and 1.75 megs of L2 cache.
HumanSmoke said: How auspicious. 2 posts in and you've already called out two staff members.
:shadedshu::shadedshu::shadedshu:
Only one member, the one that seems to be infatuated with the nvidia corporation.
#60
Eroticus
Some surprising news came from PCPerspective today. After a long debate and hundreds of reports of a slower memory buffer on the GTX 970, NVIDIA officially admitted that there was a miscommunication between its marketing and engineering teams.

NVIDIA GeForce GTX 970 3.5 GB memory issue

The GM204 diagram below was made by NVIDIA’s Jonah Alben (SVP of GPU engineering) specifically to explain the differences between the GTX 970 and GTX 980 GPUs. What was not known until today, and was falsely advertised by NVIDIA, is that the GTX 970 only has 56 ROPs and a smaller L2 cache than the GTX 980. Updated specs clarify that the 970 has one out of eight L2 modules disabled, and as a result the total L2 cache is not 2048 KB, but 1792 KB. That probably wouldn’t change anything on its own; however, this particular L2 module is directly connected to the 0.5 GB DRAM module.

To put this as simply as possible: the GeForce GTX 970 has two memory pools: 3.5 GB running at full speed, and 0.5 GB only used when the 3.5 GB pool is exhausted. However, the second pool runs at 1/7th the speed of the main pool.

So technically, until you deplete the memory available in the first pool, you will be using a 3.5 GB buffer with a 224-bit interface.
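As a back-of-the-envelope check on the 224-bit and 1/7th figures, peak memory bandwidth can be estimated as bus width times per-pin data rate divided by 8. The sketch below assumes the 7Gbps memory speed quoted from the Reviewer's Guide elsewhere in this thread and one 32-bit channel per 512MB segment; the function and constant names are made up for the illustration, and the results are theoretical peaks rather than measured throughput.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s: bits per transfer * Gbps per pin / 8."""
    return bus_width_bits * data_rate_gbps / 8


DATA_RATE_GBPS = 7   # assumed: 7Gbps GDDR5, as quoted in the Reviewer's Guide
CHANNEL_BITS = 32    # assumed: one 32-bit channel per 512MB segment

# 3.5 GB pool: seven channels striped together, i.e. a 224-bit interface.
fast_pool = peak_bandwidth_gbs(7 * CHANNEL_BITS, DATA_RATE_GBPS)
# 0.5 GB pool: the single remaining 32-bit channel.
slow_pool = peak_bandwidth_gbs(1 * CHANNEL_BITS, DATA_RATE_GBPS)
# Full 256-bit figure, for comparison with the advertised spec.
full_bus = peak_bandwidth_gbs(8 * CHANNEL_BITS, DATA_RATE_GBPS)

if __name__ == "__main__":
    print(f"3.5 GB pool: {fast_pool:.0f} GB/s")
    print(f"0.5 GB pool: {slow_pool:.0f} GB/s")
    print(f"256-bit bus: {full_bus:.0f} GB/s")
    print(f"slow/fast ratio: 1/{fast_pool / slow_pool:.0f}")
```

Under these assumptions the fast pool peaks at 196 GB/s and the slow pool at 28 GB/s, which is where the 1/7th ratio comes from.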

Ryan Shrout explains:

In a GTX 980, each block of L2 / ROPs directly communicates through a 32-bit portion of the GM204 memory interface and then to a 512MB section of on-board memory. When designing the GTX 970, NVIDIA used a new capability of Maxwell to implement the system in an improved fashion that would not have been possible with Kepler or previous architectures. Maxwell’s configurability allowed NVIDIA to disable a portion of the L2 cache and ROP units while using a “buddy interface” to continue to light up and use all of the memory controller segments. Now, the SMMs use a single L2 interface to communicate with both banks of DRAM (on the far right) which does create a new concern. (…)

And since the vast majority of gaming situations occur well under the 3.5GB memory size this determination makes perfect sense. It is those instances where memory above 3.5GB needs to be accessed where things get more interesting.

Let’s be blunt here: access to the 0.5GB of memory, on its own and in a vacuum, would occur at 1/7th of the speed of the 3.5GB pool of memory. If you look at the Nai benchmarks (EDIT: picture here) floating around, this is what you are seeing.



NVIDIA GeForce GTX 970 Corrected Specifications
Specification   GeForce GTX 970   GeForce GTX 970 ‘Corrected’
GPU             28nm GM204-200    28nm GM204-200
CUDA Cores      1664              1664
TMUs            104               104
ROPs            64                56
L2 Cache        2048 KB           1792 KB
Memory Bus      256-bit           256-bit
Memory Size     4GB               4GB (3.5GB + 0.5GB)
TDP             145W              145W
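The corrected ROP, L2 and fast-memory figures all follow from disabling one of eight L2/ROP blocks. Here is a small sketch of that arithmetic, assuming the full chip's resources are split evenly across the eight blocks (an assumption that matches the 2048 KB to 1792 KB figure above); `scaled_specs` and the constants are hypothetical names used only here.

```python
# Full GM204 (GTX 980) figures from the table and the quoted article.
FULL_L2_KB = 2048
FULL_ROPS = 64
UNITS = 8            # L2/ROP blocks on the full chip
MB_PER_UNIT = 512    # memory segment attached to each block


def scaled_specs(enabled_units):
    """Resources left when only `enabled_units` L2/ROP blocks are active.

    Assumes an even split of L2 and ROPs across the blocks.
    """
    return {
        "l2_kb": FULL_L2_KB // UNITS * enabled_units,
        "rops": FULL_ROPS // UNITS * enabled_units,
        "fast_pool_gb": MB_PER_UNIT * enabled_units / 1024,
    }


if __name__ == "__main__":
    print("GTX 980:", scaled_specs(8))
    print("GTX 970:", scaled_specs(7))
```

With seven blocks enabled this gives 1792 KB of L2, 56 ROPs and a 3.5 GB full-speed pool, matching the 'Corrected' column.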
Check this video from PCPerspective:


Source: PCPerspective



-----------

NOT GOING TO BE FIXED =[ TO ALL 970 OWNERS UPDATE UR BOX WITH PEN OR SOMETHING....
#61
Xzibit
newtekie1 said: I play modded Skyrim, it often goes over 3.5GB memory usage on my 970, and the stuttering that people say is so horrible just simply isn't. It isn't nearly as bad as when the GPU has to access system RAM, because that 0.5GB is still way faster than accessing system RAM.

And think about it, people playing these games that go over 3.5GB haven't been complaining about stuttering. The only reason we even noticed this problem was because some people noticed that some programs were saying they were only using 3.5GB and no more when they knew they should be using more. It wasn't because they were experiencing stuttering.
The vast majority overlook things. Take the recent G-Sync 30fps flickering issue. It was even in the PCPer review graphs, and they didn't bother looking at it until well after production of the ASUS ROG Swift models, several months after release. Ryan didn't catch it and Allyn didn't either; I believe it was one of his friends. Even after the modules were updated to minimize the issue, Blur Busters and other forums had users complaining about it, but not until it hits certain people in the face is it widely exposed. Heck, I'm sure people still don't know about it or just forgot because it's something they deem won't affect them.

It comes down to how individuals are affected by it. Like this current debate, we have people saying no big deal and others saying it's horrible. I'd rather have the information out there whether it affects me or not. The more information we know, the better informed decisions one can make.
#62
trenter
the54thvoid said: Kinda what Anandtech are saying too.

Nvidia has egg on its face for 'ahem' lying about its card, no doubt, but the performance of it isn't an issue. Each reviewer looking at it in turn (PCper, Anand, Hexus) has the same conclusion, which is threefold:
1) Nvidia have slipped up and undoubtedly their PR and engineering sections have 'misled' the public somewhat. (IMO, I don't believe it was innocent but hey)
2) The real performance impact isn't there. The card, according to all sites so far, is still great.
3) People are trying and failing so far to find a real world gaming example that kills the card's performance, outside of a load that would do that anyway based on its SMM units etc.

FWIW, IMO, Nvidia knew fine well what they were releasing and probably expected no fallout from it, due to the fact it has no impact on real scenarios. But techy people like to dig and found an anomaly. Now NV have to explain it and it's hard to make this one sound like a genuine 'miss'. Even if it was a genuine lapse, it's very hard to sell to us, the public.

But hey, this ugly truth (bad move NV, but still a great card) won't stop people throwing those ignorance stones.
Ignorance stones? Only a fanboy in denial, or a complete moron would take nvidia's explanation as truth with zero skepticism. Performance at the time of review isn't the problem, the problem is that people were sold a gpu expecting the specifications of that gpu to be the same four months later as they were at release. Also, people don't just buy a gpu to play only the games on market up to the point of release, they buy them with future performance in mind. You don't think there are some people that may have decided to skip the 970 if they had known there could be problems addressing the full 4gb of ram in the future, especially when console ports are using up to 3.5gb at 1080p now? What about the people that were worried about that 256 bit memory bus? Nvidia pointed out to reviewers that their improved L2 cache would keep memory requests to a minimum, therefore allowing more "achieved" memory bandwidth even with a smaller memory interface. The point is they lied about the specs, and it's almost impossible to believe that this is some kind of big misunderstanding between the engineering and marketing team that they just happened to realize after 970 users discovered it.
#63
HTC
newtekie1 said: It isn't. The video is either fake or he has something else going on causing the issue. Just look at his memory usage in the second part. His system memory usage is under 6GB in the first half and over 13GB in the second...
I did say more testing was required and emphasised the literally big "If".
#64
Rahmat Sofyan
???
if the specs for the gtx 970 are wrong how come gpu z shows 64 rops
from geforce forums
GPU-Z reads card bios, so the bios has deliberately had this wrong info.
true or not?
#65
HumanSmoke
Xzibit said: Performance-wise it's not visually apparent with the majority of games, but if the so-called "Next Gen PS4-XB1 ports" games ever get here with "DX12", the issues will be more apparent to the majority. At least that's how I see it. Most games are catching up to DX10+ and the new PS4-XB1 games are being ported with texture packs that are coming in at 3GB at VHQ @ 1080p. Who knows, by then Nvidia might also have a 1070 that doesn't have these issues.
Well, if 4GB is the new black now, it is almost certain that the requirement in the enthusiast segment will go higher with the headroom available to consoles. With 4Gbit chips now in production, it seems likely that 256-bit/8GB could well be the next stepping stone - at least as far as Nvidia is concerned until HBM gen2 arrives. Not sure how AMD gets around the 4GB limitation for HBM gen1 though.
vRAM capacity in the lower segments has always been more about marketing than real-world gain - you still need the GPU power to fully utilize the framebuffer, otherwise (technically) you could release a 256-bit card with 16GB of vRAM ( 16 chips @ 4Gbit with dual 16-bit I/O - the same reduced I/O that allows a FirePro W9100 to carry 16GB) - might be marketable, but it sure won't be a balanced design.
Xzibit said: I go back to my displeasure of both camps minimizing the offerings, and the 970 looks like it was more of a just good enough to replace the 780s. 280->285, 760->960. As consumers we are going to keep getting screwed and it seems more and more of the majority are willing to spread cheeks and take it and brag about what a wonderful experience it was.
Well, both vendors are constrained by the process node, transistor density, die size, and power budget. Any gains made on GPUs using the same 28nm process aren't going to be significant compared to moving to a new process. Technically, both vendors could go for broke and churn out 650mm^2 GPUs, but the pricing to recoup costs, lower yields, and limited market would be a killer, and of course a quantum leap in single GPU performance basically starts killing the market for dual cards and multi-card SLI/CFX, unless the software evolves at a similar (or faster) rate. It also doesn't address far larger and more lucrative markets: the low power mobile sector, and shoehorning the latest "must have" features into the mainstream products.
#66
Steevo
newtekie1 said: I play modded Skyrim, it often goes over 3.5GB memory usage on my 970, and the stuttering that people say is so horrible just simply isn't. It isn't nearly as bad as when the GPU has to access system RAM, because that 0.5GB is still way faster than accessing system RAM.

And think about it, people playing these games that go over 3.5GB haven't been complaining about stuttering. The only reason we even noticed this problem was because some people noticed that some programs were saying they were only using 3.5GB and no more when they knew they should be using more. It wasn't because they were experiencing stuttering.
Look at the dates in the threads I listed before you start spewing, will ya? I was contemplating a 970 until I started reading about all the stuttering and glitching issues they were plagued with. I had money in hand, and instead spent money on close performance in this card for half the price and a bunch of new games.
#67
newtekie1
Semi-Retired Folder
Eroticus said: NOT GOING TO BE FIXED =[ TO ALL 970 OWNERS UPDATE UR BOX WITH PEN OR SOMETHING....
I was going to use a sharpie, but then I realized none of the affected specs are actually listed on the box, because GPUs aren't marketed based on L2 cache size and ROPs...
Steevo said: Look at the dates in the threads I listed before you start spewing, will ya? I was contemplating a 970 until I started reading about all the stuttering and glitching issues they were plagued with. I had money in hand, and instead spent money on close performance in this card for half the price and a bunch of new games.
What the f*(k are you talking about? I'm not going to look through a bunch of threads to find your useless posts.
#68
TRWOV
This thread has been awarded 3 popcorn MJs :clap: Congrats!!!

#69
Steevo
newtekie1 said: I was going to use a sharpie, but then I realized none of the affected specs are actually listed on the box, because GPUs aren't marketed based on L2 cache size and ROPs...


What the f*(k are you talking about? I'm not going to look through a bunch of threads to find your useless posts.
Not mine, but threads at Nvidia about stuttering from right after release, 18 pages long, game threads about stuttering.
#70
lukesky
Xzibit
Nvidia surely is a master of the 'force.
#71
15th Warlock
www.anandtech.com/show/8935/geforce-gtx-970-correcting-the-specs-exploring-memory-allocation

Holy shit, cat's out of the bag, the hardware spec sheet given to review sites was wrong, the 970 does in fact feature fewer ROPs and less cache than the 980, besides the divided VRAM partition mentioned before.

The card uses the first 3.5GBs of VRAM at full bandwidth for the memory crossbar accessing 7 memory modules, but the remaining 512MBs have to be accessed in tandem at a much lower bandwidth due to the single channel nature of the separate memory crossbar, faster than regular PCIe bandwidth but many times slower than the high performance 3.5GBs of VRAM in the first partition. So the card technically speaking has 4GBs of VRAM but the uppermost segment of it is almost an order of magnitude slower than the first chunk of memory.

The card still is a solid performer, and probably the best bang for your buck for gaming at 1440p and below, but Nvidia made a big no-no here, and they must be in full damage control mode :shadedshu:
#72
Xzibit
15th Warlock said: www.anandtech.com/show/8935/geforce-gtx-970-correcting-the-specs-exploring-memory-allocation

Holy shit, cat's out of the bag, the hardware spec sheet given to review sites was wrong, the 970 does in fact feature fewer ROPs and less cache than the 980, besides the divided VRAM partition mentioned before.

The card uses the first 3.5GBs of VRAM at full bandwidth for the memory crossbar accessing 7 memory modules, but the remaining 512MBs have to be accessed in tandem at a much lower bandwidth due to the single channel nature of the separate memory crossbar, faster than regular PCIe bandwidth but many times slower than the high performance 3.5GBs of VRAM in the first partition. So the card technically speaking has 4GBs of VRAM but the uppermost segment of it is almost an order of magnitude slower than the first chunk of memory.

The card still is a solid performer, and probably the best bang for your buck for gaming at 1440p and below, but Nvidia made a big no-no here, and they must be in full damage control mode :shadedshu:
Here is that page



If you really want a chuckle look at this page.



Nvidia GeForce GTX 980/970 Reviewer's Guide: Equipped with 13 SMX units and 1664 CUDA Cores the GeForce GTX 970 also has the rendering horsepower to tackle NEXT GENERATION GAMING. And with its 256-bit memory interface, 4GB frame buffer, and 7Gbps memory the GTX 970 ships with THE SAME MEMORY SUBSYSTEM AS OUR FLAGSHIP GEFORCE GTX 980, allowing gamers to crank up the settings and resolutions in graphic-intensive games like Assassin's Creed: Unity and still enjoy fluid frame rates.
#73
xorbe
Am I reading into it wrongly, or is the 970 basically operating in 224-bit mode most of the time, and not 256-bit mode?
#74
Steevo
Xzibit said: Here is that page



If you really want a chuckle look at this page.

NOOOO NVIDIA IS OUR BULL GOD!!!!!!!


Seriously though, for most it's still a good deal.
#75
R-T-B
Fluffmeister said: The price will likely drop a bit anyway once AMD actually have a new product to sell, the current fire sale of products in a market already flooded with cheap ex-miners clearly doesn't make much difference.
Wait a second, ex-miners? That was so 2013. I haven't seen cards used in mining at any profitable margins since mid-2014, and even then nearly no one was doing it anymore.

There might still be a few on the market, but I doubt it.