
Article: Just How Important is GPU Memory Bandwidth?

Joined
Dec 29, 2014
Messages
861 (0.25/day)
Hey, you haven't lived (and nor will you) until you've played a FPS at 4K with a mainstream card.

NO AA?! I need AA... I'll take the fps hit...
 
Joined
Apr 19, 2012
Messages
12,062 (2.77/day)
Location
Gypsyland, UK
System Name HP Omen 17
Processor i7 7700HQ
Memory 16GB 2400Mhz DDR4
Video Card(s) GTX 1060
Storage Samsung SM961 256GB + HGST 1TB
Display(s) 1080p IPS G-SYNC 75Hz
Audio Device(s) Bang & Olufsen
Power Supply 230W
Mouse Roccat Kone XTD+
Software Win 10 Pro
EDIT: Adjusted first group of graphs. SPG2's overall data graph was mistakenly replaced with the PCIe Bus/Memory usage graph.

The big question I have, is can you say the same for the 2GB of vram? Would that scale with GPU load as well?

Judging by the few games I've tested, modern AAA titles would cripple the 2GB VRAM in a vacuum. As we know, VRAM usage tends to behave differently when the quantity is altered. A game will use 2.5GB of VRAM on a 4GB card, while the same game on the same settings will use 1.8GB on a 2GB card. Sometimes cards let games stretch their legs when there's excess VRAM available. That's not the case in every single game, but it often happens. I don't know whether the game holds itself back for the low-end cards or stretches on the high-end cards, or both.

If we took my results of VRAM usage (which ARE accurate), then the 2GB would certainly hold the card back on Very High (or High)/Ultra settings on 1080p. You'd have to start toning down the aliasing and textures.

And is there any way to tell how much vram is really needed (vs allocated) without testing identical cards with different amounts of vram?

Answered your own question :D You'd have to keep your eye on FPS drops, and system RAM usage too, to see exactly where it spills over and starts affecting performance.
 

HammerON

The Watchful Moderator
Staff member
Joined
Mar 2, 2009
Messages
8,397 (1.53/day)
Location
Up North
System Name Threadripper
Processor 3960X
Motherboard ASUS ROG Strix TRX40-XE
Cooling XSPC Raystorm Neo (sTR4) Water Block
Memory G. Skill Trident Z Neo 64 GB 3600
Video Card(s) PNY RTX 4090
Storage Samsung 960 Pro 512 GB + WD Black SN850 1TB
Display(s) Dell 32" Curved Gaming Monitor (S3220DGF)
Case Corsair 5000D Airflow
Audio Device(s) On-board
Power Supply EVGA SuperNOVA 1000 G5
Mouse Roccat Kone Pure
Keyboard Corsair K70
Software Win 10 Pro
Benchmark Scores Always changing~
Thanks @RCoon for taking the time to do this.
I find your results very interesting to say the least.
 
Joined
Aug 3, 2013
Messages
259 (0.07/day)
Location
Pakistan
System Name The N Machinima
Processor Core i5 2500 (2nd Gen)
Motherboard MSI P67A GD65 B3
Cooling CM V6 GT
Memory Corsair dominator 2x2GB 1600mhz
Video Card(s) Gigabyte GTX 760 windforce 3x
Storage Seagate 500GB/WD 500GB HDDs
Display(s) Samsung 22 Inch 1080p
Case CM 690 PLUS
Audio Device(s) Onboard
Power Supply Corsair GS 800
Software Windows 7 x64
Benchmark Scores http://i.imgur.com/0O79u7Z.jpg
Given how the article came together, @RCoon definitely put a grand effort into it. I do appreciate your work. That's really informative for us.

Memory bandwidth does play a role in overall gaming performance. From your testing and results, I'd say lower memory bandwidth does create a bottleneck even at 1080p, not a huge one, but to some extent, specifically on the 960, which sits at 112GB/s, already a little low. When you apply higher AA, it does affect overall performance. MCL correlates with GPU load, so even if the load is at 60% rather than 100%, it will still make a difference, not a huge one, but it will be there.

A 960 with 112GB/s and 2GB VRAM would perform about equal to or better than a 7950/R9 280, give or take. But both of those elements look like they would create some bottleneck for Ultra or Very High gaming at 1080p. The VRAM especially is low: games these days consume more than 2GB of VRAM, so 3GB at least is good to go.
 

Mussels

Freshwater Moderator
Staff member
Joined
Oct 6, 2004
Messages
58,413 (8.21/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700Mhz 0.750v (375W down to 250W))
Storage 2TB WD SN850 NVME + 1TB Sasmsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Phillips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Phillips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
This was a lot of work, I am sure. Thanks for bringing it up.

It's nice to see something, and I use this term loosely as you do, essentially 'concrete' on the issue. I, though, like newconroer, find this 'proves' what people know already (but could never put their finger on). I just wish we could have concrete numbers to base the data on. It's a logical leap, but lord knows, without actual/factual data to start with, whether it extrapolates out to fact.

People just need to know that, regardless of the bandwidth, what the FPS say is what you will get regardless. Another way to put it, I have the same 4 cars with different motors and they all run 12s 1/4 mile... one does it N/A, one boosted with a snail, the other a screw, and the other a rotary. It doesn't matter how it gets there, just that it does. :)


I think that's due to hardware design limitations.

The RAM bus width always has to be one of the preset sizes, such as 128-bit, 256, 384, 512, etc.

*Purely theoretical with made-up numbers*
What if they design a card that works awesome with a 256-bit bus and, say, 2GHz RAM, but the 2GHz RAM has supply issues, so they move to 384-bit with 1.5GHz RAM? Suddenly they have more RAM bandwidth than the card needs, but it's the only financially viable redesign option at that point. The RAM could OC really well, but provide no gains at all.
This kind of thing could also happen in reverse, when they halve the memory bandwidth on a mid-range or entry-level card, but due to budget they can't use fast enough RAM, and suddenly RAM OC gives large benefits to that model.
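
To put rough numbers on that made-up scenario, here is a minimal Python sketch. It assumes the quoted "GHz" figures are effective per-pin data rates, and the helper name `bandwidth_gb_s` is invented for illustration; peak bandwidth is taken as the bus width in bytes multiplied by that rate.

```python
def bandwidth_gb_s(bus_width_bits: int, effective_rate_ghz: float) -> float:
    """Peak memory bandwidth in GB/s (assumed model: bytes per transfer x transfers per ns)."""
    return bus_width_bits / 8 * effective_rate_ghz


# Hypothetical configurations from the post above (made-up numbers).
original = bandwidth_gb_s(256, 2.0)   # the design the card was built around
fallback = bandwidth_gb_s(384, 1.5)   # the redesign forced by supply issues

# How much more peak bandwidth the fallback design has than the original.
surplus_pct = (fallback / original - 1) * 100

print(f"256-bit @ 2.0 GHz: {original:.1f} GB/s")
print(f"384-bit @ 1.5 GHz: {fallback:.1f} GB/s")
print(f"Surplus bandwidth: {surplus_pct:.1f}%")
```

With these numbers the wider, slower redesign ends up with about 12.5% more peak bandwidth than the original design, which is the "more than the card needs" case.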
 
Joined
Aug 3, 2013
Messages
259 (0.07/day)
Location
Pakistan
This kinda thing could also happen in reverse, when they halve the memory bandwidth on a mid-range or entry-level card, but due to budget they can't use fast enough RAM, and suddenly RAM OC gives large benefits to that model.

Such as Kepler's 660 Ti, with its 192-bit bus and immense overclocking potential on the RAM side. Midrange cards do have great overclocking potential, which lets you get performance near or equal to the next card up in the series.
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
You are wholly correct. It's all done on a 970, and judging by the fact I discovered that memory controller load is directly correlated with GPU load, we can assume that the lower the maximum GPU load, the lower the memory bandwidth will be.

Since I have a 750, I figured I'd see if I could work this from the other end. The 960 is exactly 2x a 750 in everything but bandwidth. It has the same 128-bit bus, but 7 GHz vs 5 GHz vram, so the 960 has about 40% higher bandwidth.

I ran a couple of games and Heaven on high settings (1080p) and got the same max MCU load in each case... 78%. GPU load was at the max, as well as the allocation of 1GB of vram. Since I got the same number every time, I'm guessing that might really be as high as it goes.

My card is overclocked. Compared to reference it's 1310/1085 or 21% on the GPU and 5900/5010 or 18% on the vram. The memory bandwidth is 94.4 GB/s.
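
A quick way to sanity-check those figures, as a rough Python sketch. It assumes the 5900 and 5010 values are effective memory data rates in MHz on the 750's 128-bit bus; the function names are made up for illustration.

```python
def overclock_pct(overclocked_mhz: float, reference_mhz: float) -> float:
    """Percentage increase of an overclocked clock over the reference clock."""
    return (overclocked_mhz / reference_mhz - 1) * 100


def bandwidth_gb_s(bus_width_bits: int, effective_rate_mhz: float) -> float:
    """Peak bandwidth in GB/s from bus width and effective data rate (assumed model)."""
    return bus_width_bits / 8 * effective_rate_mhz / 1000


gpu_oc = overclock_pct(1310, 1085)      # core clock, overclocked vs reference
vram_oc = overclock_pct(5900, 5010)     # memory data rate, overclocked vs reference
stock_bw = bandwidth_gb_s(128, 5010)    # reference memory clock
oc_bw = bandwidth_gb_s(128, 5900)       # overclocked memory clock

print(f"GPU overclock:  {gpu_oc:.1f}%")
print(f"VRAM overclock: {vram_oc:.1f}%")
print(f"Bandwidth: {stock_bw:.2f} GB/s stock, {oc_bw:.1f} GB/s overclocked")
```

The percentages round to the 21% and 18% quoted, and the overclocked bandwidth comes out at 94.4 GB/s.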

If I understand your method correctly, you multiply the MCU load by 70% in an attempt to account for Maxwell compression. I assume this is because the MCU reading from the card is in error by this factor? I don't see where you mentioned that, but might have missed it. If I apply that factor I get a max MCU load of only 55%, or 51.5 GB/s.

So if the 960 processor is 2x as fast as the 750, then I'd expect it to need a bandwidth of 103 GB/s before it hits the processor limit. Does that make sense?
 
Joined
Apr 19, 2012
Messages
12,062 (2.77/day)
Location
Gypsyland, UK
The memory bandwidth is 94.4 GB/s.

If I understand your method correctly, you multiply the MCU load by 70% in an attempt to account for Maxwell compression. I assume this is because the MCU reading from the card is in error by this factor? I don't see where you mentioned that, but might have missed it. If I apply that factor I get a max MCU load of only 55%, or 51.5 GB/s.

I took the total memory bandwidth (94.4 GB/s in your case) and divided it by 100 (100%). I then multiplied it by the MCU load (78% in your case). I then divided that figure by 70 and multiplied it by 100 in order to get my "pre-compression figure".
(94.4 / 100) * 78 = X (X = bandwidth usage with Maxwell compression)
(X / 70) * 100 = Y (Y = bandwidth usage without Maxwell compression, assuming compression saves exactly 30%)
So by those accounts, your normal bandwidth usage would be 73.6 GB/s on the 750.
Without Maxwell compression, the figure would be 105.1 GB/s.
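
The same two steps written out as a small Python sketch, in case anyone wants to plug in their own card. The function name and the default 30% saving are assumptions for illustration; the 30% is the thread's working guess, not a measured value.

```python
def bandwidth_usage(total_bw_gb_s: float, mcu_load_pct: float,
                    compression_saving_pct: float = 30.0) -> tuple[float, float]:
    """Return (usage with compression, estimated usage without compression) in GB/s."""
    # Step 1: share of the total bandwidth actually in use, per the MCU load reading.
    with_compression = total_bw_gb_s / 100 * mcu_load_pct
    # Step 2: scale back up, assuming compression removed a fixed share of the traffic.
    without_compression = with_compression / (100 - compression_saving_pct) * 100
    return with_compression, without_compression


x, y = bandwidth_usage(94.4, 78)
print(f"With compression:    {x:.1f} GB/s")
print(f"Without compression: {y:.1f} GB/s")
```

Carrying the unrounded 73.632 through gives about 105.2 GB/s; rounding X to 73.6 first gives the 105.1 quoted above.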
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
Oh, I see... it's just .78x94.4 or 73.6 GB/s. That would mean the 960 would need 147.2 GB/s for double the speed. I don't think the vram will overclock that much (>30%).
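
Here is that reasoning as a rough Python sketch. It assumes bandwidth demand scales linearly with GPU speed (a big assumption) and that the 960 runs its 128-bit bus at an effective 7 GHz, as mentioned earlier in the thread; the helper name is made up.

```python
def bandwidth_gb_s(bus_width_bits: int, effective_rate_ghz: float) -> float:
    """Peak bandwidth in GB/s from bus width and effective data rate (assumed model)."""
    return bus_width_bits / 8 * effective_rate_ghz


usage_750 = 0.78 * 94.4                 # measured MCU load x the 750's bandwidth
needed_960 = 2 * usage_750              # assumes demand doubles with a 2x faster GPU
stock_960 = bandwidth_gb_s(128, 7.0)    # the 960's stock peak bandwidth

# Memory overclock the 960 would need to supply that much bandwidth.
required_oc_pct = (needed_960 / stock_960 - 1) * 100

print(f"750 usage:       {usage_750:.1f} GB/s")
print(f"960 would need:  {needed_960:.1f} GB/s")
print(f"960 stock peak:  {stock_960:.1f} GB/s")
print(f"Required memory overclock: {required_oc_pct:.1f}%")
```

That works out to a memory overclock of a bit over 31%, so the ">30%" estimate holds under the linear assumption.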
 
Joined
Apr 19, 2012
Messages
12,062 (2.77/day)
Location
Gypsyland, UK
Oh, I see... it's just .78x94.4 or 73.6 GB/s. That would mean the 960 would need 147.2 GB/s for double the speed. I don't think the vram will overclock that much (>30%).

Not necessarily, the correlation between GPU speed and memory bandwidth usage probably isn't a linear 1:1 ratio. If it were people probably would have discovered all that by now :D
 
Joined
Dec 31, 2009
Messages
19,366 (3.72/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
i think thats due to hardware design limitations.

The ram always has to be in preset amounts, such as 128 bit, 256, 384, 512 etc.

*Purely theoretical with made up numbers*
What if they design a card that works awesome with 256 bit and say 2GHz ram - but the 2GHz ram has supply issues, so they move to 384 bit 1.5GHz ram - suddenly they have more ram bandwidth than the card needs, but its the only financially viable redesign option at that point. The ram could OC really well, but provide no gains at all.
This kinda thing could also happen in reverse, when they halve the memory bandwidth on a mid-range or entrey level card, but due to budget they cant use fast enough ram, and suddenly ram OC gives large benefits to that model.
Ram speed and its 'bits' don't have anything to do with each other really.

I see the point of this post, however. It is what is going on right now, for all intents and purposes: AMD put in a massive 512-bit bus with slow RAM, while NVIDIA is using a narrower bus and faster RAM ICs.
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
AMD put in a massive 512-bit bus with slow RAM, while NVIDIA is using a narrower bus and faster RAM ICs.

My GTX 750 isn't bandwidth limited, but increasing the vram clock 20% resulted in a 7% speed increase. I'm thinking that faster vram is a better solution.
 
Joined
Dec 31, 2009
Messages
19,366 (3.72/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
My GTX 750 isn't bandwidth limited, but increasing the vram clock 20% resulted in a 7% speed increase. I'm thinking that faster vram is a better solution.
Missed this reply...

It depends on what you are playing and its resolution. 1080p + AA on 128 bit, it could be a factor. It also only has 1GB of vRAM. I wouldn't even call it a gaming card in the first place...
 
Joined
Jan 2, 2015
Messages
1,099 (0.33/day)
Processor FX6350@4.2ghz-i54670k@4ghz
Video Card(s) HD7850-R9290
The 750 is OK for gaming. It will hurt in some games, especially with the 1GB of vram at 1080p, but it's pretty good at lower resolutions.
 
Joined
Sep 15, 2008
Messages
1,050 (0.19/day)
Location
Pikeville NC
System Name In the works
Processor 4670k
Motherboard ASUS Z87
Cooling 5x120mm
Memory 8GB Sniper Gskill 1833
Video Card(s) MSI 970 Gaming
Storage 2x 240gb SSD's + 1TB seagate
Display(s) 27" Acer 1920x1080
Case Corsair Carbide 200R
Power Supply OCZ Power 600W Modular
Keyboard Corsair K70 with red switches
Software Win10
What program do you guys use to make all the graphs?
 
Joined
Feb 9, 2009
Messages
1,618 (0.29/day)
i like charts, great investigation

i still wonder how important the size of vram is... i put hardline beta on ULTRA (except AA & AO) on my 570m, the game quickly goes to 1.5gb vram usage, no problems!

my new 660, also no problems other than 4gb system ram is unplayable in the beta, plus the same 1.5gb vram usage due to a similar situation as the 970 hysteria

yet the requirements recommend 3gb & people say 'oh you have to turn down settings for 2gb cards' ... i am on an overclocked MOBILE FERMI with only 1.5gb getting 30fps & beyond at 1080p

(while yes i switched to FXAA & SSAO, it's not like the vram usage lowered, MSAA still had max gpu usage, simply lower fps from ~30 to ~20 as if it was entirely gpu processing power related, not an issue of stuttering from lack of vram or bandwidth... the 660 was similar, from ~60fps to ~50fps when MSAA is turned on, same 1.5gb vram usage)

what kind of crappy game engine needs to preallocate everything? it should be STREAMED so that you can have pseudo unlimited worlds & so that consoles can load the data as needed

now that i think about it some more, i have seen what happens when you're out of vram, but only on my 4870x2 with 1gb when GTA4 has its view distance too high or in crysis2 with the hq textures enabled, both cases resulted in stuttering & also the crysis textures lost their filtering so they were pixelated (actually i had the same pixelation when using texture packs in rfactor on a 128mb 9800pro)

so maybe i should rephrase... what kind of crappy modern AAA game engine with GBs of assets still ends up preallocating so much vram (over 2gb) that a very large amount of mainstream customers will fail?


What program do you guys use to make all the graphs?
office (excel)

any spreadsheet app should let you, i'm sure there are also standalone charting tools out there as the raw logged data is the same (time + some value, one on each line)
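
If you'd rather script it than use a spreadsheet, a minimal Python sketch for that kind of log could look like this. It assumes comma-separated `time,value` lines; the sample numbers are invented for illustration, not real measurements, and a real logger's export format may differ.

```python
import csv
import io
import statistics


def summarize_log(text: str) -> dict:
    """Parse 'time,value' lines and return sample count, min, max and mean of the values."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]  # skip blank lines
    values = [float(row[1]) for row in rows]
    return {
        "samples": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


# Made-up sample: seconds elapsed, VRAM usage in MB.
sample_log = """0.0,1210
1.0,1388
2.0,1502
3.0,1497
"""

summary = summarize_log(sample_log)
print(summary)
```

The same list of values can be handed to any charting tool afterwards; the summary is just a quick way to pull out the peak figure that usually gets quoted.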
 
Joined
Apr 19, 2012
Messages
12,062 (2.77/day)
Location
Gypsyland, UK
What program do you guys use to make all the graphs?

Excel, except the default graphs are horribly ugly. I probably spent the best part of an hour or two making them look not faceless and boring.

I also occasionally use JSFiddle and Google jscript to make pretty interactive charts.
 
Joined
Dec 31, 2009
Messages
19,366 (3.72/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
My point is that it is better to achieve a given bandwidth with faster ram rather than a wider bus.
How did you come to that conclusion? I mean I see the graphs between the two cards, and the performance is what it is... so, how did you come to that conclusion considering the results?

The 750 is OK for gaming. It will hurt in some games, especially with the 1GB of vram at 1080p, but it's pretty good at lower resolutions.
I can make that argument for any card if I play at 640x480 and no AA...

But we are talking about the majority here, who are at 1080p or even 1440x900. Most any MSAA is going to cripple a 128-bit card, and more specifically a 1GB card.
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
Because higher ram speed increases FPS even when bandwidth isn't an issue.

The GTX 750 has enough vram for its processing power. 1080p with AA. If you are experiencing low framerates, it won't be because your 1 GB of vram is maxed out.
 
Joined
Dec 31, 2009
Messages
19,366 (3.72/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
But when you run out of ram, it doesn't matter how fast it is. Which is part of my point. 1GB isn't enough in most titles at 1080p with 4xMSAA or greater.

I need AT LEAST 4xMSAA at 1080p to get rid of the jaggies...

This was a test at my home site done in 2012.. Tell me 1GB is enough again after seeing this: http://www.overclockers.com/forums/showthread.php/718118-How-much-GDDR-do-I-need-to-run-my-game

To me, the 750 is a budget card for putting an image on the screen. It could play most all games well at lower than 1080p with some AA. Otherwise, it sucks, and TPU's results also show that for any half-modern game (not getting over 30 FPS): https://www.techpowerup.com/reviews/ASUS/GTX_750_OC/7.html

If you are not into AAA titles or can handle less than optimal image quality, it's OK.
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
This was a test at my home site done in 2012.. Tell me 1GB is enough again after seeing this: http://www.overclockers.com/forums/showthread.php/718118-How-much-GDDR-do-I-need-to-run-my-game

"Unless otherwise noted all tests were run with the maximum settings allowed by the game or benchmark. Gpu memory usage was logged with MSI Afterburner. The maximum value from the log file excluding the last measurement of the run is what's posted. Testbed was the 3dvision gaming pc in my sig."


The GTX 750 isn't made to run recent games at max settings and still get good framerates. Is that surprising for a <$100 card? It will even have trouble getting good framerates in some titles with MSAA... also not surprising. But in most games you can get decent framerates with a little AA enabled and it will look good. Not stellar, but good. At no point will the 1GB of ram be your limiting factor.

Most benchmarks use standard settings (usually max) to test cards. This is appropriate for high-end cards. They keep the settings the same for all cards tested, for consistency and to compare performance across all cards, but it isn't realistic for the low-end cards. The FPS will be unrealistically low, and the vram usage unrealistically high.
 
Joined
Dec 31, 2009
Messages
19,366 (3.72/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
I don't see much of a point in playing a PC game if I can't use AA and it looks worse than a console. To me, it's not remotely an adequate gaming card because of those points.

At no point will the 1GB of ram be your limiting factor.
BOLOGNA.
You have to make it NOT be a factor with that card by sacrificing IQ in modern titles. Nobody, with a half decent budget, would get a 1GB card these days.
 
Joined
Dec 29, 2014
Messages
861 (0.25/day)
BOLOGNA. You have to make it NOT be a factor with that card by sacrificing IQ in modern titles.

No I don't. The IQ is already sacrificed by the card's low processing power. Adding vram wouldn't help in the slightest. You do realize that the shaders, ROPs, and TMUs are basically half that of a 960 and 1/4th that of a 980...? Amazingly the 960 has twice the vram and the 980 has 4x as much... hmmm... could there be a connection?
 