
GPU-Z readings mismatch

jasonx

New Member
Joined
Nov 8, 2012
Messages
2 (0.00/day)
Likes
0
#1
GPU-Z shows two different pixel fill rates, depending on the version:

0.5.7 to 0.6.6 - 10.1 GPixels/s
0.5.6 - 23 GPixels/s

Which one is correct?

 

T4C Fantasy

CPU & GPU DB Maintainer
Joined
May 7, 2012
Messages
1,233 (0.60/day)
Likes
608
Location
Rhode Island
System Name Phantom 820 v3.1
Processor Intel Core i7-6700k @ 4.4GHz
Motherboard ASRock Z170 Formula OC (Bios: 7.40)
Cooling Corsair H115i
Memory Corsair Dominator Platinum 16GB DDR4 3000MHz
Video Card(s) ZOTAC GTX 1070 AMP! / EVGA GTX 1080 Ti SC2
Storage 512GB Crucial MX300 / 256GB OCZ Vertex 4 / 1TB Hitachi HDD
Display(s) 25" ASUS VX248 / 24'' LG DM2350D / 24" LG 24UD58-B 4K
Case NZXT Phantom 820 Ultra+ Tower
Audio Device(s) Logitech G933 Headset
Power Supply SeaSonic Platinum 1050W Snow
Mouse Logitech G900
Keyboard Logitech G910 Orion Spark
Software Windows 10 Pro Build 1703 64-bit
Benchmark Scores Folding PPD: 45,000~ / WCG PPD: 50,000~ (OLD) with HD 7970
#4

jasonx

New Member
Joined
Nov 8, 2012
Messages
2 (0.00/day)
Likes
0
#5
Thanks for the answer and the trolling at the same time. I did search; maybe I missed the correct search query or looked in the wrong subforum. Anyway, thanks.
 
Joined
Oct 12, 2008
Messages
5,656 (1.69/day)
Likes
2,606
Location
στο άλφα έως ωμέγα
System Name Ha/AhHa/Dell
Processor QX9650 SLAWN C1/i7-980x/i7-6700K
Motherboard GA-X48_DS4 (F3B bios)/Gigabyte x58A-UDR3 v 2.0(modded FH bios)/Dell Foxconn 0XJ8C4 Z170
Cooling CNPS9900 LED/H60/ 3 pipe-center fan-air
Memory 8 Gig of G.Skill F2-8800CL5D/24 Gb Corsair Vengence/ 24Gb Samsung DDR4 2133
Video Card(s) Galaxy NVIDIA GeForce GTX 960/PowerColor R9 280/ASUS R9 380X Strix G1
Storage All have SSDs with HDDs for extra storage and backup/Dell-M.2 Samsung 850 EVO PCIe
Display(s) Asus 266H/Viewsonic 1080p/HP ZR24W
Case CM-690/CM-690 II adv/Dell 8900 series
Audio Device(s) All use on board (Realtek) w/2.1 speakers
Power Supply PC P&C 750/PC P&C Silencer 950/CM 700 Extreme
Mouse Logitech
Keyboard Logitech
Software Windows 10 Pro - 64 bit/Windows 10 Pro - 64bit/Windows 10 Pro - 64bit
#6
Welcome to TPU, jasonx!

W1zzard gets to troll or do anything else W1zzard wants to do... it is his site.

However, there are a lot of other trolls here, and you can hit the little triangle with the exclamation mark in it to report them or any violation... the mods here will respond and are very fair.

Again, welcome, and feel free to contribute. :)
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
17,071 (3.44/day)
Likes
17,986
Processor Core i7-4790K
Memory 16 GB
Video Card(s) GTX 1080
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 7
#7
Thanks for the answer and the trolling at the same time. I did search; maybe I missed the correct search query or looked in the wrong subforum. Anyway, thanks.
I must have answered that question like 20 times and keep wondering why people can't find it. No offense, welcome to the forums.
 
Joined
Mar 6, 2008
Messages
2,700 (0.76/day)
Likes
1,364
Location
Minnesota
System Name I Dub Thee Infinity
Processor Intel Core I7-3930K
Motherboard EVGA X79 Classified
Cooling Corsair H80
Memory 16GB GSkill Trident X
Video Card(s) EVGA GTX 980 Ti SC+
Storage SanDisk Ultra Plus 256GB, OCZ V2 180GB, 2x Toshiba X300 5TB RAID 0
Display(s) Acer XB270HU
Case Cooler Master HAF X
Audio Device(s) Creative X-Fi Titanium + Sennheiser HD 598 + Klipsch ProMedia 2.1
Power Supply EVGA 850W G2
Mouse Razer Naga 2014
Keyboard Gigabyte Osmium Cherry MX Brown
Software Windows 10 Pro x64
#8
Perhaps add something about it in the fillrate tooltip only for Fermi? Though I'm sure it would go unnoticed by most.
 

Flickspeed

New Member
Joined
Jan 2, 2013
Messages
4 (0.00/day)
Likes
0
#9
Pixel Fillrate Calculation

Hey, are you sure you are using the correct way to calculate pixel fillrate in the current versions? I read the other threads and see some inconsistencies.

It seems the pixel fillrate is still not calculated properly for Fermi cards. Are you taking the following information into account?

Each Streaming Multiprocessor (SM) in a GF100 GPU contains 32 SPs and 4 SFUs.
Each Streaming Multiprocessor (SM) in a GF104/106/108 GPU contains 48 SPs and 8 SFUs.
Each Streaming Multiprocessor (SM) in a GF110 GPU contains 32 SPs and 4 SFUs.
Each Streaming Multiprocessor (SM) in a GF114/116/118/119 GPU contains 48 SPs and 8 SFUs.

Each SP can perform up to two single-precision operations (FMA) per clock. Each SFU can perform up to four SF operations per clock. The approximate ratio of FMA operations to SF operations is 4:1 for GF100 and 3:1 for GF104/106/108.

The theoretical shader performance in single-precision floating point operations (FMA) [FLOPSsp, GFLOPS] of a graphics card with shader count [n] and shader frequency [f, GHz] is estimated as:
FLOPSsp ≈ f × n × 2

Alternative formula, where [m] is the SM count:
GF100: FLOPSsp ≈ f × m × (32 SPs × 2 (FMA))
GF104/106/108: FLOPSsp ≈ f × m × (48 SPs × 2 (FMA))

Total processing power:
GF100: FLOPSsp ≈ f × m × (32 SPs × 2 (FMA) + 4 × 4 SFUs), or FLOPSsp ≈ f × n × 2.5
GF104/106/108: FLOPSsp ≈ f × m × (48 SPs × 2 (FMA) + 4 × 8 SFUs), or FLOPSsp ≈ f × n × 8 / 3 [16]

where SP = Shader Processor (Unified Shader, CUDA Core), SFU = Special Function Unit, SM = Streaming Multiprocessor, FMA = Fused MUL+ADD.

Based on this information, the current calculation method is wrong!
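
For anyone who wants to check the arithmetic, the shader throughput formulas quoted above can be sketched in a few lines of Python. This is only an illustration: the function and table names are made up for this example, the per-SM unit counts are the ones listed in this post, and the 1.0 GHz clock and the SM counts are placeholders rather than the specs of any real card.

```python
# Per-SM unit counts as listed in the post (GF110 matches GF100,
# GF114/116/118/119 match GF104/106/108).
SM_LAYOUT = {
    "GF100": {"sp": 32, "sfu": 4},
    "GF104": {"sp": 48, "sfu": 8},
}


def shader_count(family, sm_count):
    """n = SPs per SM x number of SMs."""
    return SM_LAYOUT[family]["sp"] * sm_count


def fma_gflops(family, sm_count, shader_ghz):
    """FMA-only estimate: f x m x (SPs per SM x 2)."""
    return shader_ghz * sm_count * SM_LAYOUT[family]["sp"] * 2


def total_gflops(family, sm_count, shader_ghz):
    """FMA plus SFU estimate: f x m x (SPs x 2 + SFUs x 4)."""
    unit = SM_LAYOUT[family]
    return shader_ghz * sm_count * (unit["sp"] * 2 + unit["sfu"] * 4)


if __name__ == "__main__":
    f = 1.0  # placeholder shader clock in GHz, NOT a real card's value
    for family, m in (("GF100", 16), ("GF104", 7)):  # illustrative SM counts
        n = shader_count(family, m)
        print(family, "SPs:", n,
              "FMA GFLOPS:", fma_gflops(family, m, f),
              "total GFLOPS:", total_gflops(family, m, f))
```

The per-SM form and the per-shader shortcuts (n × 2.5 and n × 8 / 3) give the same totals, so the two ways of writing the formula agree. Note that all of this is shader throughput, which is a separate quantity from pixel fillrate.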
 

T4C Fantasy

CPU & GPU DB Maintainer
Joined
May 7, 2012
Messages
1,233 (0.60/day)
Likes
608
Location
Rhode Island
#10
Hey are you sure you are using the correct way to calculate pixel fillrate in the current versions? I read the other threads and see some inconsistencies.

Based on this information the current calculation method is wrong! Please recheck. For example the GTX 460 has 7 SM's for a total of 7*48 = 336 SP's!!!
Check out the GPU database; it uses the latest known calculation for Fermi:
http://www.techpowerup.com/gpudb/265/NVIDIA_GeForce_GTX_460.html
 
Joined
Mar 6, 2008
Messages
2,700 (0.76/day)
Likes
1,364
Location
Minnesota
#12
Hey are you sure you are using the correct way to calculate pixel fillrate in the current versions? I read the other threads and see some inconsistencies.

Based on this information the current calculation method is wrong! Please recheck. For example the GTX 460 has 7 SM's for a total of 7*48 = 336 SP's!!!
What the hell are you going on about? This thread is about pixel fill rate not shader count. You are obviously trying to figure floating point performance. That is entirely out of the scope of this thread.
 

Flickspeed

New Member
Joined
Jan 2, 2013
Messages
4 (0.00/day)
Likes
0
#13
What the hell are you going on about? This thread is about pixel fill rate not shader count. You are obviously trying to figure floating point performance. That is entirely out of the scope of this thread.
Maban, you can disregard anything after the last sentence in red.

I am just making corrections to the following post, which W1zzard used as the basis for the calculations. It has a fundamental error.

The pixel fillrate in GPU-Z is displayed incorrectly for Nvidia Fermi-based graphics cards. The pixel fillrate seems to be calculated by multiplying the number of ROPs by the GPU clock. But on Fermi GPUs the pixel fillrate is generally limited not by the number of ROPs but by the number of streaming multiprocessors. Each streaming multiprocessor is capable of processing two pixels per clock. So if there are 16 SMs and 48 ROPs, as in the GeForce GTX 580, the SMs limit the pixel fillrate. This is the case for all Fermi-based graphics cards I know.
Having more ROPs than pixels that can be processed per clock helps to sustain a high pixel fillrate when using multiple samples per pixel (i.e. multisample antialiasing), but the peak pixel fillrate is limited by the streaming multiprocessors.
Check out these benchmarks by hardware.fr (scroll down to the 'Fillrate' section): http://www.hardware.fr/articles/806-4/nvidia-geforce-gtx-580-sli.html
The measured peak pixel fillrate of the GeForce GTX 580 is 23.3 GPixel/s. Simply multiplying the 48 ROPs by the 772 MHz GPU clock would give a peak pixel fillrate of 37.1 GPixel/s. But as the pixel fillrate is limited by the streaming multiprocessors, the peak fillrate is only 16 × 2 × 772 MPixel/s = 24.7 GPixel/s. This number corresponds well to the measurement taken by hardware.fr.
If you look at non-Fermi graphics cards, you will see that the measured peak pixel fillrate corresponds well to the product of the number of ROPs and the GPU clock.

Many reviews cite the wrong peak pixel fillrate for Fermi cards, and Nvidia doesn't publish pixel fillrate numbers on the product pages. But knowing the Fermi architectural properties, you can easily calculate the right peak pixel fillrate. I hope that GPU-Z will be fixed to show the right peak pixel fillrate on Nvidia Fermi graphics cards.
It is not the SM that is limiting anything :):) That's the fundamental error; I marked it in red ;) So at the end of the day, it is still ROPs times MHz. Prove me wrong and give me the source saying an SM can only process 2 pixels per clock.
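
For reference, the two competing formulas in this exchange can be put side by side. Below is a minimal Python sketch using only the GTX 580 figures quoted above (48 ROPs, 16 SMs, 772 MHz GPU clock, 23.3 GPixel/s measured by hardware.fr). The function names are made up for illustration, and the figure of 2 pixels per SM per clock is the claim under dispute here, treated as an input rather than a confirmed spec.

```python
def rop_limited_gpixels(rops, core_mhz):
    """Old formula: ROPs x GPU clock, converted from MPixel/s to GPixel/s."""
    return rops * core_mhz / 1000.0


def sm_limited_gpixels(sm_count, core_mhz, pixels_per_sm_clock=2):
    """Disputed formula: SMs x pixels per SM per clock x GPU clock.

    pixels_per_sm_clock=2 is the value claimed in the quoted post,
    not a number taken from an Nvidia document.
    """
    return sm_count * pixels_per_sm_clock * core_mhz / 1000.0


if __name__ == "__main__":
    # GeForce GTX 580 figures as quoted in this thread.
    rops, sm_count, core_mhz = 48, 16, 772
    measured = 23.3  # GPixel/s, hardware.fr result cited in the quoted post

    rop_rate = rop_limited_gpixels(rops, core_mhz)
    sm_rate = sm_limited_gpixels(sm_count, core_mhz)

    print(f"ROPs x clock:     {rop_rate:.1f} GPixel/s")
    print(f"SMs x 2 x clock:  {sm_rate:.1f} GPixel/s")
    print(f"Measured:         {measured:.1f} GPixel/s")
    # If both limits applied, the lower of the two would be the peak.
    print(f"Lower of the two: {min(rop_rate, sm_rate):.1f} GPixel/s")
```

With these inputs the ROP-based figure comes out at about 37.1 GPixel/s and the SM-based figure at about 24.7 GPixel/s, matching the numbers in the quoted post. The sketch only reproduces the arithmetic; it does not settle which limit the hardware actually has.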
 
Joined
Mar 6, 2008
Messages
2,700 (0.76/day)
Likes
1,364
Location
Minnesota
#14
Maban, you can disregard anything after the last sentence in red.

I am just making corrections to the following post, which W1zzard used as the basis for the calculations. It has a fundamental error.

It is not the SM that is limiting anything :):) That's the fundamental error; I marked it in red ;) So at the end of the day, it is still ROPs times MHz. Prove me wrong and give me the source saying an SM can only process 2 pixels per clock.
Read the white papers.
 

Flickspeed

New Member
Joined
Jan 2, 2013
Messages
4 (0.00/day)
Likes
0
#15
Read the white papers.
Show me where it says in the white papers that an SM is limited to two pixels per clock. Links, please.

I don't work for Nvidia and I am not an Nvidia fanboy; I just don't like theoretical pixel fillrate values being misreported without any valid proof. The first step would be to prove that an SM can only do 2 pixels per clock. I couldn't find this anywhere on the web except in a post here :)

Maban, instead of posting useless comments, you can start here: http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIAFermiComputeArchitectureWhitepaper.pdf :) Good luck and have fun.

Also, if there is no proof, I would like to ask W1zzard to make GPU-Z calculate theoretical pixel fillrates based on the old formula.
 