CynicalCyanide
Hi.
First time poster, long time reader here. I've loved and used TechPowerUp's reviews of GPUs, CPUs, etc. for many years now, and the data from this website has been invaluable for constructing my own guide to picking PC hardware for full builds (I won't link it because of the Forum Guidelines, but it currently sits on my own website and on another forum, with almost 200,000 views and 2,000 replies on the latter alone). You'll just have to believe me when I say that I spend probably 10 hours a week poring over PC hardware product reviews.
I mention all of this because I've noticed a couple of things that other review sites do a bit better than TPU. That's a shame, because TPU has by far the best and most consistent format, which makes it easy to get straight into the data, with particular kudos for the summary pages.
However, I think that displaying only average FPS leaves out a really basic feature (minimum FPS), in a world where most of the larger tech sites have gone a step further still by providing detailed FCAT data and/or measures of the 0.1% and 0.01% lowest framerates. Furthermore, while I understand that a 'flat', minimalist graphic aesthetic is all the rage these days (though I personally dislike it), did we really need to remove the AA settings from the text at the top of the graphs?
While I presume that the AA settings remain the same as in previous reviews with the old graph style, a new and/or naive reader might see the "All games are set to their highest quality setting unless indicated otherwise." line on the 'Test System' page and assume that AA is maxed out for every graph, or that no AA is being used at all (which would also be pretty silly; who on earth with a high-end GPU plays with AA turned off at 1080p, or even 2560x1440?)
A smaller, perhaps pedantic bit of feedback: the resolution charts descend from left to right, top to bottom, but on the Summary page the order is reversed. It would probably be better to keep this consistent.
I'm also not sure whether there's a particular objection to hosting a 'synthetic' benchmark such as 3DMark, but I think having one might be more valuable than at least one of the large selection of games in the test suite (some of which, I'm sure, will turn out to be flavour-of-the-month games and quickly die off in popularity). One that comes to mind is Futuremark's upcoming DX12 "Time Spy" benchmark, which would kill two birds with one stone by also adding a DX12 test. Personally, I find synthetic benchmarks easier to compare and contrast, especially against my own hardware here at home. If you were to choose a synthetic benchmark and test a mode that is free for users, that would also give readers an opportunity to make sure their (and your!) machines are running as expected, and to conveniently measure their own hardware against whatever is being reviewed.
My last idea is perhaps the most work intensive, but perhaps also the most interesting. The basic premise is a 'bias' measurement for games. In short, the results of all hardware of a single architecture are averaged and compared, on a game-by-game basis, to the overall 'summary' performance of that architecture. The explanation of how it would be done is a little confusing, so bear with me.
To explain by example: say the average performance of all GPUs in a given game (Game 'X') is 50 FPS, and the average performance across all games is 100 FPS. The baseline for Game X would then be 50/100 = 50%. Then, if all GCN (1.0) based cards average 60 FPS in Game X but 90 FPS across all games, they get a figure of 60/90 = 66.67% for Game X, which is then compared to the baseline (normalised to 100%). Thus, with the 50% baseline normalised to 100%, you would get 133.33% in Game X for GCN 1.0, which would indicate that the game is strongly favourably biased towards that particular architecture (i.e. 1.33x faster than otherwise expected). Mind you, each resolution would need its own set of calculations. As long as you're storing your results in some sane format (Excel tables would do fine), it should be pretty easy to set up a formula and let it do its thing with minimal fuss.
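To make the formula concrete, here's a minimal Python sketch of the calculation (the function and variable names are my own invention; it just assumes you already have the relevant per-game and per-suite FPS averages to hand for one resolution):

```python
# Minimal sketch of the bias calculation described above. All names are
# hypothetical, and it assumes the averages are already computed.

def game_bias(game_fps_all, suite_fps_all, game_fps_arch, suite_fps_arch):
    """Bias of one game towards one architecture, as a ratio (1.0 = neutral).

    game_fps_all   -- average FPS of ALL cards in the game
    suite_fps_all  -- average FPS of ALL cards across the whole suite
    game_fps_arch  -- average FPS of that architecture's cards in the game
    suite_fps_arch -- average FPS of that architecture's cards across the suite
    """
    baseline = game_fps_all / suite_fps_all    # e.g. 50 / 100 = 0.50
    arch = game_fps_arch / suite_fps_arch      # e.g. 60 / 90  = 0.6667
    return arch / baseline                     # e.g. 1.3333 -> 133.33%

# The worked example from the paragraph above:
print(f"{game_bias(50, 100, 60, 90):.2%}")  # prints 133.33% (favours GCN 1.0)
```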
Assuming I did my rough maths right, I tried this technique out on the reference 980 Ti results from the recent Gigabyte 980 Ti Waterforce review, and picked the 4K Civilization: Beyond Earth ('Civ BE') benchmark as 'Game X'.
The Civ BE result for the 980 Ti was 59.7 FPS.
The average Civ BE result was 61.99 FPS.
The 980 Ti's average across all games (summary) was 79%.
The average summary result (for the same set of cards tested in Civ BE) was 77.42%.
Therefore, taking the 980 Ti's Civ BE result and dividing it by the average Civ BE result:
59.7 / 61.99 = 96.31% of the average.
And taking the 980 Ti's summary result and dividing it by the average summary result:
79% / 77.42% = 102.04% of the average.
i.e. we would expect the 980 Ti to perform ~2% better than average in Civ BE, yet it actually performed WORSE than the average Civ BE result. A better way to show this is to first obtain the baseline relative performance (average Civ BE FPS vs. average summary result), then compare each card's own relative performance against it. I've thrown the Fury X results in here for comparison:
Baseline: 61.99 FPS / 77.42% = 80.07%
980 Ti: 59.7 FPS / 79% = 75.57%
Fury X: 71.4 FPS / 83% = 86.02%
Then, normalising the baseline (80.07%) to 100%, we get:
Baseline: 100%.
980 Ti: 94.38%, or ~6% slower than expected (judging by the summary results).
Fury X: 107.44%, or ~7% faster than expected.
In conclusion, for this particular example, Civ BE was a bit kind to the Fury X and a bit cruel to the 980 Ti.
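For anyone who wants to check my maths, here's a quick Python snippet reproducing the arithmetic above. The FPS figures and summary percentages are the ones quoted from the review; the structure and names are just illustrative:

```python
# Reproduces the worked example above using the quoted review numbers.
game_avg_fps = 61.99       # average Civ BE result across the tested cards
summary_avg = 77.42        # average summary result for the same cards (%)
cards = {"980 Ti": (59.7, 79.0), "Fury X": (71.4, 83.0)}

baseline = game_avg_fps / summary_avg               # 0.8007 -> "80.07%"
print(f"Baseline: {baseline:.2%}")

for name, (game_fps, summary_pct) in cards.items():
    relative = (game_fps / summary_pct) / baseline  # baseline normalised to 100%
    print(f"{name}: {relative:.2%}")
# Output:
# Baseline: 80.07%
# 980 Ti: 94.38%
# Fury X: 107.44%
```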
Although I didn't have time, you could combine and average all the cards of the same architecture first, rather than averaging across the entire list of cards (see the sketch below). This prevents the number of cards from influencing the result (i.e. if a game's benchmark list contains many cards that are favourably biased, that will unfairly inflate the game's average score relative to the summary score, and therefore exaggerate the apparent bias). The number of cards from each architecture in the Civ BE example was fairly even, so I didn't bother. You could probably keep it simple by broadly grouping everything into 'Maxwell', 'GCN', 'Kepler', etc. if need be. Of course, there's nothing stopping you from applying the same method to CPUs or anything else.
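As a rough illustration of that per-architecture averaging (the data layout and all the numbers here are purely made up), something like this would do:

```python
# Hypothetical sketch: average each architecture's cards first, so the
# number of cards per architecture can't skew the game average.
from statistics import mean

# (architecture, game FPS, summary %) per card -- placeholder numbers
results = [
    ("Maxwell", 59.7, 79.0),
    ("Maxwell", 48.1, 64.0),
    ("GCN",     71.4, 83.0),
    ("GCN",     55.3, 66.0),
]

def arch_means(rows):
    """Collapse per-card rows into one (game FPS, summary %) pair per arch."""
    archs = {}
    for arch, fps, pct in rows:
        archs.setdefault(arch, []).append((fps, pct))
    return {a: (mean(f for f, _ in v), mean(p for _, p in v))
            for a, v in archs.items()}

per_arch = arch_means(results)
# Build the baseline from the per-arch means, not the raw per-card list:
baseline = (mean(f for f, _ in per_arch.values())
            / mean(p for _, p in per_arch.values()))
for arch, (fps, pct) in per_arch.items():
    print(f"{arch}: {(fps / pct) / baseline:.2%}")
```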
This information could be handy for identifying which games are outliers in terms of performance bias, as well as for tracking improvements (or 'fixes') from patches and driver updates over time. It's up to you how to display it, whether on a separate page, in-line on each bar of the graph, or however else.
While I know these ideas would probably be a lot of work, I think they would provide the best, most useful, and most thorough data, and the bias analysis is interesting enough that it might even catch on.
Let me know what y'all think!