Texture Fill/Bandwidth guide?

newconroer · Nov 12, 2007

Where can I find a guide or write-up that talks in-depth about the relevance of Texture Fill Rates and memory bandwidth for GPU architecture?

a111087 · Nov 13, 2007

http://www.gpureview.com/texture-fill-rate-article-375.html

http://www.gpureview.com/memory-bandwidth-article-356.html

Hope that helps!

DarkMatter · Nov 13, 2007

Wow! That's a good question!
I've been looking for that for a long time and haven't found anything that fully explains the relevance of the different parts of a graphics architecture. Those links at least cover the basics, but I fear they only scratch the surface. I know everything that a person outside Nvidia or ATI needs to know about the technology behind them, but knowing to what extent each part is relevant to performance (I think this is what you are looking for) is a different story. Basically all of them are equally relevant, but one of them is always going to be the bottleneck, so balance is the answer. It's a really difficult answer, as you can tell from the fact that ATI and Nvidia chose such different paths and change them constantly. So for every generation and architecture, the part that breaks the balance first becomes the bottleneck, and thus the most relevant one, the one they must improve to get better performance.

Also, it is important to note that the relevance you are referring to is tied to the use you (or game developers) are going to make of the graphics card. For example, memory bandwidth becomes more relevant once we start to crank up the resolution, AA and AF. Despite this, I would say that today memory bandwidth is one of the less relevant things, because it has exceeded what is needed (always talking about performance parts, because DDR2 is insufficient more often than not). Proof of this is the path that Nvidia has taken with the 8800GT, and ATI with the HD3800 series. They have nearly half the bandwidth and less pixel fillrate, yet they perform as well or better.

Until the 8800GT came out and I saw some benchmarks, I used to consider texture fillrate the most relevant one, with shader power following close behind. Now I think shaders are the most relevant. That is pretty obvious to me if you do the math (SPs x shader clock) for the GF8800 series and compare the results to the actual performance of the cards in the benchmarks. They completely match. Still, since on the GF8 series shaders and TMUs are grouped in "quads", more SPs come with proportionally more TMUs, so it's hard to tell whether the bottleneck is in the SPs or in texturing. Since the 8800GT arrived I place my bet on SPs, because where G80 had 1 Texture Address Unit for every 2 Texture Mapping Units, G92 has double the Address units, and there isn't a significant jump in performance compared to G80. I expected this jump, since even if peak fill rate remains the same for every quad, the added addressing power should make the average a lot higher, and thus increase performance if this were the bottleneck.
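For anyone who wants to run the "SPs x shader clock" math themselves, here is a minimal sketch. The SP counts and stock shader clocks in the table are reference specs as I understand them, not numbers taken from this thread, so treat them as assumptions and swap in your own card's values. The function names are made up for the example.

```python
# Stock reference specs (ASSUMED, not from this thread):
# name -> (stream processors, shader clock in MHz)
CARDS = {
    "8800 GTS 640MB": (96, 1200),
    "8800 GT": (112, 1500),
    "8800 GTX": (128, 1350),
    "8800 Ultra": (128, 1500),
}


def shader_throughput(sps, shader_mhz):
    """SPs x shader clock, in millions of operations per second."""
    return sps * shader_mhz


def relative_to(baseline, cards=CARDS):
    """Throughput of every card as a ratio of the baseline card's throughput."""
    base = shader_throughput(*cards[baseline])
    return {name: shader_throughput(*spec) / base for name, spec in cards.items()}


if __name__ == "__main__":
    for name, ratio in relative_to("8800 GTX").items():
        mops = shader_throughput(*CARDS[name])
        print(f"{name}: {mops} Mops/s ({ratio:.2f}x the GTX)")
```

These are theoretical peaks only. Comparing the ratios against benchmark results is the part you still have to do by hand.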

Summarizing, the relevance of different parts as I see it is as follows:

SP power > Texture Fill Rate >>> Memory bandwidth > Pixel Fillrate* > Memory amount**

* I rank it among the least relevant because at around 12GP/s (2900XT, 8800 GTX/Ultra) you have enough power to reach as high as 1920x1200 with 16X AA/16X AF without bottlenecking, and I don't see the point of going much higher in the future. I do know that higher memory bandwidth is going to be needed in the future, just not as much as they first thought when they jumped to 512-bit or maybe even 384-bit. I'm 99.9% sure that they made the jump because of marketing though.

** And yeah, the least relevant is memory quantity, as long as you have a minimum.
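As a rough sanity check on the pixel fillrate footnote, here is a back-of-envelope sketch. Everything in it besides the 1920x1200 resolution, the 16X AA figure and the ~12GP/s peak is an assumption made for illustration: the 60 fps target, the overdraw factor of 3, and especially the pessimistic idea that every AA sample costs a full pixel write. Real hardware handles AA samples more cheaply than this, so the estimate should err on the high side.

```python
def required_fill_mpixels(width, height, fps, overdraw=3.0, aa_samples=1):
    """Estimated pixel writes needed per second, in MPixels/s.

    overdraw and the per-sample cost model are ASSUMPTIONS for illustration,
    not measured values.
    """
    return width * height * fps * overdraw * aa_samples / 1e6


def headroom(peak_mpixels, width, height, fps, overdraw=3.0, aa_samples=1):
    """How many times the theoretical peak exceeds the estimated demand."""
    demand = required_fill_mpixels(width, height, fps, overdraw, aa_samples)
    return peak_mpixels / demand


if __name__ == "__main__":
    demand = required_fill_mpixels(1920, 1200, 60, overdraw=3.0, aa_samples=16)
    spare = headroom(12000, 1920, 1200, 60, overdraw=3.0, aa_samples=16)
    print(f"Estimated demand: {demand:.0f} MPixels/s")
    print(f"Headroom at 12 GP/s peak: {spare:.2f}x")
```

Under those assumptions the peak still sits above the estimated demand, which is consistent with the footnote, but change the overdraw or the frame rate and the margin moves a lot.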

That is the World of Graphics Computing Power Relevancies from my point of view. Hope it helps you get an overall idea of how that works, although I wouldn't take it too seriously, because I have been (along with TPU, Tom's, Anand... wiki... :eek: yeah, whatever...) my own teacher on this...

See you! :toast:

EDIT: When I said "G92 has double the Address units" I meant that it has as many TAUs as TMUs, whereas G80 has half as many TAUs as TMUs.

newconroer · Nov 13, 2007

Yes, thanks very much to both of you.

I'm trying to determine whether, when a GTX is overclocked to Ultra specs so that the fill rates and bandwidth match, performance is really identical given the revision of the cores.

I figured knowing more about these two variables of GPU architecture might shed some light on the matter.

a111087 · Nov 14, 2007

On http://www.gpureview.com/ you can actually "OC" video cards and compare them,
but all the numbers are theoretical of course.

newconroer · Nov 14, 2007

Ya, still neat though.

Check this out:

According to their calculations, the fill rates and bandwidth would be matched, BUT the shader operations would be in favor of the Ultra, as it comes stock at 1500 MHz.

However, to match that shader clock on the GTX while the clocks are linked (obviously you can unlink them on your own card), it would require a core clock of 725 MHz.

Does the GTX even go that high?
EDIT: The site doesn't scale the shader clock with the core quite properly. It seems off by about 80-100 MHz, but close enough.

GTX:
Core Clock: 725 MHz
Shader Clock: 1500 MHz
Memory Clock: 1080 MHz (2160 DDR)
Memory Bandwidth: 103.68 GB/sec
Shader Operations: 192000 MOperations/sec
Pixel Fill Rate: 17400 MPixels/sec
Texture Fill Rate: 23200 MTexels/sec

Ultra *stock*
Core Clock: 612 MHz
Shader Clock: 1500 MHz
Memory Clock: 1080 MHz (2160 DDR)
Memory Bandwidth: 103.68 GB/sec
Shader Operations: 192000 MOperations/sec
Pixel Fill Rate: 14688 MPixels/sec
Texture Fill Rate: 19584 MTexels/sec
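In case it helps to see where figures like these come from, here is a small sketch that reproduces both lists from the clocks alone. The unit counts (24 ROPs, 32 texture units, 128 SPs, 384-bit bus) are my assumption about what the site uses for the GTX and Ultra. They are not stated in this thread, though they do reproduce the numbers above. The class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class GpuConfig:
    core_mhz: float
    shader_mhz: float
    mem_mhz_effective: float      # DDR-effective memory clock
    # Unit counts below are ASSUMED for the 8800 GTX/Ultra, not from the thread.
    rops: int = 24
    texture_units: int = 32
    stream_processors: int = 128
    bus_width_bits: int = 384


def theoretical_figures(cfg):
    """Peak theoretical numbers: each is simply (unit count) x (clock)."""
    return {
        # bytes per transfer x effective clock, converted from MB/s to GB/s
        "memory_bandwidth_gb_s": cfg.bus_width_bits / 8 * cfg.mem_mhz_effective / 1000,
        "shader_mops_s": cfg.stream_processors * cfg.shader_mhz,
        "pixel_fill_mpixels_s": cfg.rops * cfg.core_mhz,
        "texture_fill_mtexels_s": cfg.texture_units * cfg.core_mhz,
    }


if __name__ == "__main__":
    configs = {
        "GTX @ 725 core": GpuConfig(725, 1500, 2160),
        "Ultra stock": GpuConfig(612, 1500, 2160),
    }
    for name, cfg in configs.items():
        print(name)
        for key, value in theoretical_figures(cfg).items():
            print(f"  {key}: {value:g}")
```

Only the two core-clock-driven figures differ between the lists, which is why the GTX at 725 MHz ends up with higher theoretical fill rates than the stock Ultra while bandwidth and shader operations are the same.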

a111087 · Nov 14, 2007

If you think you can OC the GTX that far, then go for it.

But the Ultra can probably go even farther.

newconroer · Nov 14, 2007

Aye, the Ultra will go farther, but this helped me find the main difference between the two.

