
Is CPU game benchmarking methodology (currently) flawed?

I'd say it's not completely flawed, just partly. In the end, watching the future play out for yourself instead of simulating it with low resolutions is what made the difference. Real results are worth more than a low-resolution simulation. The FX 8350 certainly has more raw power than the 2500K, but only if it's used well, and exactly that took time. It's more on par with the 2600K than the 2500K.

BTW, about the off-topic blather:
He said he talks with a more exaggerated accent in his videos, not that he "fakes" anything - it's also of no importance or consequence to the topic; haters gonna hate anyway.
It's been how many years now?
How long should I wait for a CPU to be used effectively? 12, 20 years? :rolleyes:

The Battlefield video he uses is from an AMD shill, so clearly that's not working in his favor.

Especially since he said they were friends, or that he knew him well, something like that, if I remember correctly (on mobile, can't check right now).
 
It's been how many years now?
How long should I wait for a CPU to be used effectively? 12, 20 years? :rolleyes:

The Battlefield video he uses is from an AMD shill, so clearly that's not working in his favor.

Especially since he said they were friends, or that he knew him well, something like that, if I remember correctly (on mobile, can't check right now).
Depends on who's using it, and on what; mine's been flat out for the 4 years I've owned it. Again, yawn.
Hyperbole, eh? 12-20 years, ffs, really? Are you 5?
 
There seems to be some confusion here: I'm asking about the methodology!

The way the current methodology works is that you test CPUs with a very fast GPU @ low resolutions / details (to eliminate the GPU as a variable) in a variety of gaming scenarios: this will tell you whether CPU X is better than CPU Y in gaming (and whichever other CPUs are included in the review), and no faster card you test with in the future will change this outcome.

HOWEVER, Adored has found this is not the case: the example he showed has an FX 8350 going from over 10% slower to 10% faster, which contradicts the methodology's premise.

There's a BIG catch, however, which is what I was trying to get tested: there were changes in the hardware used as well as in drivers and even the games themselves (I didn't mention the games bit in the OP: that's IMO a very BIG variable).

What I was trying to get answered is whether his findings still hold once as many variables as possible are eliminated.

This has serious implications because, if a proper review shows Adored is right, then the methodology is flawed and needs to be scrapped.
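
To make the premise concrete, here's a minimal sketch (a toy frame-time model with made-up numbers, not anybody's actual review data) of why dropping the resolution is supposed to isolate the CPU:

Code:
# Toy model: a frame is ready when both the CPU and the GPU are done,
# so the effective FPS is set by whichever of the two is slower.
# All numbers are invented for illustration.

def fps(cpu_ms, gpu_ms_per_mpix, megapixels):
    gpu_ms = gpu_ms_per_mpix * megapixels   # GPU cost grows with resolution
    return 1000.0 / max(cpu_ms, gpu_ms)     # the slower component sets the frame time

cpus = {"CPU X": 6.0, "CPU Y": 8.0}         # hypothetical CPU frame times in ms
gpu_cost = 4.0                              # hypothetical GPU ms per megapixel

for name, cpu_ms in cpus.items():
    low = fps(cpu_ms, gpu_cost, 0.9)        # ~720p (0.9 MPix): CPU-bound
    high = fps(cpu_ms, gpu_cost, 8.3)       # ~4K (8.3 MPix): GPU-bound
    print(f"{name}: {low:.0f} fps at 720p, {high:.0f} fps at 4K")

At 720p the two CPUs separate clearly; at 4K both hit the same GPU wall, which is exactly the reasoning behind testing CPUs at low resolution.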
 
You know, all these gaming benchmarks are ultimately pointless.

What you really need is a benchmark that displays raw CPU power in single- and multi-threaded workloads - which is Cinebench.

That's really all you need. I can make decisions based on that single bench. No need to do all these gaming benchmarks that differ from one another like the Sun and the Moon.
 
There seems to be some confusion here: I'm asking about the methodology!

The way the current methodology works is that you test CPUs with a very fast GPU @ low resolutions / details (to eliminate the GPU as a variable) in a variety of gaming scenarios: this will tell you whether CPU X is better than CPU Y in gaming (and whichever other CPUs are included in the review), and no faster card you test with in the future will change this outcome.

HOWEVER, Adored has found this is not the case: the example he showed has an FX 8350 going from over 10% slower to 10% faster, which contradicts the methodology's premise.

There's a BIG catch, however, which is what I was trying to get tested: there were changes in the hardware used as well as in drivers and even the games themselves (I didn't mention the games bit in the OP: that's IMO a very BIG variable).

What I was trying to get answered is whether his findings still hold once as many variables as possible are eliminated.

This has serious implications because, if a proper review shows Adored is right, then the methodology is flawed and needs to be scrapped.

The assumptions are that 1) the game will scale to high res in much the same way it performs at low res, so testing at low res will CPU-bottleneck and show the differences more clearly; and 2) a future GPU will scale the way a current GPU performs at low res... i.e. the 2080 that you will pop in your rig will behave the same way at 1440p that your current 1080 behaves at 1080p.

If that assumption breaks (i.e. the performance scaling doesn't hold, one CPU actually ends up falling off more slowly than the other, and they change places), then the current methodology is flawed, and we should test at all resolutions (some sites do this).

However, the widely accepted belief is that low res just highlights the bottlenecks of the CPU, which will still be somewhat present in the high-res tests, and also with future GPUs.

For a while after Sandy Bridge first came out, the CPU became almost irrelevant for games... you could get the cheapest quad-core i5 and the cheapest DDR3 kit and it wouldn't matter - everything was GPU-bound above 720p. Then CPUs scaled at +5-10% per generation while GPUs scaled at +60-80%. I just read a 1080 Ti review where they were seeing bottlenecking on an overclocked Skylake at 1440p... so the low-res tests are not all that trivial.
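
And a minimal sketch (again, invented frame times, nothing measured) of the case Adored is pointing at: if a game later starts spreading its work over more cores, a CPU's frame time changes, and the ranking established at low res can flip - something no snapshot taken at launch could have predicted.

Code:
# Toy ranking-flip example; all frame times are hypothetical.
def cpu_fps(cpu_ms, gpu_ms=2.0):
    # fast GPU at low res, so the CPU is effectively always the limit
    return 1000.0 / max(cpu_ms, gpu_ms)

# At launch the engine is mostly single-threaded: the higher-IPC quad core wins.
launch = {"i5 2500K": 7.0, "FX 8350": 8.0}      # ms per frame (made up)
# Years later the engine uses 8 threads: the FX catches up and passes it.
later = {"i5 2500K": 6.5, "FX 8350": 5.8}       # ms per frame (made up)

for label, times in (("at launch", launch), ("years later", later)):
    fx = cpu_fps(times["FX 8350"])
    i5 = cpu_fps(times["i5 2500K"])
    print(f"{label}: FX 8350 is {100 * (fx / i5 - 1):+.0f}% vs the 2500K")

The low-res test was correct for its point in time; what it cannot see is a future change in how the game itself uses the CPU.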
 
Depends on who's using it, and on what; mine's been flat out for the 4 years I've owned it. Again, yawn.
Hyperbole, eh? 12-20 years, ffs, really? Are you 5?
Old enough to call out BS :rolleyes:
 
Old enough to call out BS :rolleyes:
I haven't waffled any, so whose - the OP's?
Either way, our opinions differ; I'm fine with that.
 
Bottom line is that Ryzen clearly has a lot more power left in it while Kaby Lake does not; that is something important that should be included as a side note alongside whatever testing methodology you use.

Remember the Q6600 and the other first quad-core CPUs? They were completely pointless for gaming at the time they were released - that's what a lot of people said, strongly advising buying a dual core instead. Today a Q6600 runs GTA 5 decently with a good GPU, while on most dual cores from the same era, even much higher-clocked ones, it is literally unplayable. No testing methodology told us that.

So, to this:

How long should I wait for a CPU to be used effectively? 12, 20 years? :rolleyes:

Yes, it seems like it really took about 10 years to become relevant, but it did eventually happen. Future-proofing is a real thing with CPUs and people care about it, or they should, since not everyone can afford to simply buy whatever performs best each generation.

So, shockingly, I would take a 1700 over a 7700 any day, knowing that today I get 10 fewer frames in a game that already runs at 100 fps, but that a couple of years in the future performance on the 7700 might be seriously crippled. Again, no current testing methodology says this.

So don't just take every chart you see for granted if you really care about this stuff; make some predictions yourself. Whatever insanely precise way of testing CPU performance you come up with is only going to measure the type of software you run today, so yes, it is a flawed way of doing things.
 
Bottom line is that Ryzen clearly has a lot more power left in it while Kaby Lake does not; that is something important that should be included as a side note alongside whatever testing methodology you use.


Meh. AMD said that Bulldozer would do 5 GHz, and that only took like 4 years, so it took a long time for AMD to capitalize on that architecture, and I do not expect AMD to be able to capitalize on Ryzen any faster. So, that "potential" is meaningless. IF it was an actual priority for AMD, they'd have dealt with it prior to the launch, but they failed to do so. You can't tell me they had chips, tested them, knew they worked, saw the performance, and then unintentionally ignored the issues... they ignored the issues on purpose. I fully expect them to keep ignoring them, too.
 
Meh. AMD said that Bulldozer would do 5 GHz, and that only took like 4 years, so it took a long time for AMD to capitalize on that architecture, and I do not expect AMD to be able to capitalize on Ryzen any faster. So, that "potential" is meaningless. IF it was an actual priority for AMD, they'd have dealt with it prior to the launch, but they failed to do so. You can't tell me they had chips, tested them, knew they worked, saw the performance, and then unintentionally ignored the issues... they ignored the issues on purpose. I fully expect them to keep ignoring them, too.

I was referring simply to what the chip can do, not its issues. And I'm afraid it's not in their power to capitalize on what Ryzen can do from that point of view (at least not entirely); you didn't seriously expect them to run up to every software developer and make them rewrite their stuff just for this one product, did you? So sure, the potential extra power is meaningless today, but you can never know when the industry will take a big turn and it will prove useful; that's why I said this should be mentioned as a side note, not an actual guideline.
 
I was referring simply to what the chip can do, not its issues. And I'm afraid it's not in their power to capitalize on what Ryzen can do from that point of view (at least not entirely); you didn't seriously expect them to run up to every software developer and make them rewrite their stuff just for this one product, did you? So sure, the potential extra power is meaningless today, but you can never know when the industry will take a big turn and it will prove useful; that's why I said this should be mentioned as a side note, not an actual guideline.
Not every dev, but Microsoft, for Windows 10? Yeah, I expect them to deal with such issues prior to a launch.

I don't care about "maybes", I care about what you get for sure.
 
Not every dev, but Microsoft, for Windows 10? Yeah, I expect them to deal with such issues prior to a launch.

Well, after looking more into this, it seems there isn't much MS can do about it; every weak spot Zen has is due to its nature, not because of what the OS is doing. All that is left is for developers to take these into account. Even if they don't, there is still room for more performance to be had.

I don't care about "maybes", I care about what you get for sure.

That's a perfectly fine way of looking at this matter, but I for one would much rather pick a product that's good enough today and has some "maybes" than one that has none, as I would rather not be forced to upgrade sooner.
 
Well, after looking more into this, it seems there isn't much MS can do about it; every weak spot Zen has is due to its nature, not because of what the OS is doing. All that is left is for developers to take these into account. Even if they don't, there is still room for more performance to be had.

Says who? Not AMD... they say everything is working fine. Like I said, they are purposefully ignoring any present "issues".



That's a perfectly fine way of looking at this matter, but I for one would much rather pick a product that's good enough today and has some "maybes" than one that has none, as I would rather not be forced to upgrade sooner.

I can understand that opinion, and you holding it, but having been let down by AMD's "promises" time and again over the years has me quite hesitant to believe there will be any improvement.


You see, I already have Ryzen and boards and memory... actual AM4-rated memory... I have a system sitting here next to me running benchmarks as we speak for a board review.


And I see nothing wrong with Ryzen as is. So there is nothing left to improve upon. Ryzen is quite good. IS it a bit disappointing? Not to me. It is EXACTLY what I expected. Did the hype train kill the public's perception of Ryzen? Yeah, it did, and in a big way, but not my perception.
 
Says who? Not AMD... they say everything is working fine. Like I said, they are purposefully ignoring any present "issues".





I can understand that opinion, and you holding it, but having been let down by AMD's "promises" time and again over the years has me quite hesitant to believe there will be any improvement.


You see, I already have Ryzen and boards and memory... actual AM4-rated memory... I have a system sitting here next to me running benchmarks as we speak for a board review.


And I see nothing wrong with Ryzen as is. So there is nothing left to improve upon. Ryzen is quite good. IS it a bit disappointing? Not to me. It is EXACTLY what I expected. Did the hype train kill the public's perception of Ryzen? Yeah, it did, and in a big way, but not my perception.

To be fair, all of these discussions could have been avoided if AMD had never said the word "gaming" and never shown any "live gaming benchmarks/comparisons". That would not have let the hype grow as big as it did. So I condemn them too for some of these promises; they've always been dumb in this regard and never learned. I cannot recall a truly crap product from them, but all the BS they said often made their products look like crap to the public. I have learned to ignore this BS as well and take it for what it is.
 
There seems to be some confusion here: I'm asking about the methodology!

The way the current methodology works is that you test CPUs with a very fast GPU @ low resolutions / details (to eliminate the GPU as a variable) in a variety of gaming scenarios: this will tell you whether CPU X is better than CPU Y in gaming (and whichever other CPUs are included in the review), and no faster card you test with in the future will change this outcome.

HOWEVER, Adored has found this is not the case: the example he showed has an FX 8350 going from over 10% slower to 10% faster, which contradicts the methodology's premise.

There's a BIG catch, however, which is what I was trying to get tested: there were changes in the hardware used as well as in drivers and even the games themselves (I didn't mention the games bit in the OP: that's IMO a very BIG variable).

What I was trying to get answered is whether his findings still hold once as many variables as possible are eliminated.

This has serious implications because, if a proper review shows Adored is right, then the methodology is flawed and needs to be scrapped.
TBH, I don't need more data to be quite sure that the FX 8350 scaled better over time. We had examples like this already when the CPU was quite new: when Crysis 3 utilized all its cores in one level (the jungle level), it matched the speed of the i7 2600K or 2700K. Now that more games take advantage of more cores, it's obvious to me that "10% faster" instead of "10% slower" should hold true.

Ryzen is no disappointment to anyone who had realistic expectations. As for its gaming performance - I never expected it to be faster than the 7700K; I simply expected good to very good performance all around (games, apps, server), and I'm not disappointed at all. While a lot of people were raging, I was thinking "it's performing quite well; seeing where AMD achieved this kind of performance from, you all should be happy rather than behave like this".

Also, over time Ryzen will easily be better than the 7700K; that's just a quad core with high clocks and it has no chance once more than 4 cores are properly utilized - even the 1700 should easily be faster then. The games where it performs a bit disappointingly are maybe, just maybe, utilizing the CPU the wrong way; that is why. We should know more about that once the 4-core Ryzens arrive, since games will then be prevented from crosstalking between quad-core modules (CCXs), which hurts latency and bandwidth, because there simply isn't a second CCX that could penalize this. As I see it, some games work properly and some don't. A big disappointment for game lovers? I don't see why. When exactly did a new architecture happen to be fully utilized from the get-go? Not even the Core architecture (Core i7 etc.) was fully utilized from the get-go, and that despite being related to the Core 2 architecture (Core 2 Quad etc.). Give it some freaking time. If the FX 8350 can scale to be rather "okay", Ryzen will scale to be "great", I'm sure.

AMD designed Ryzen first of all to be a great server/workstation CPU, and they fully delivered and are even winning there against Intel - the big money is there, not in the gaming department. If AMD makes some big money that way, you can expect them to design better CPUs for gaming as well in the future - I believe this is also why Lisa Su always said "the best is yet to come". This is the first of a great new line of CPUs that will deliver for everyone - in time.
 
Let me try and be a bit more clear:

This topic is NOT about Ryzen: it's about gaming CPU benchmarking and its current methodology!!!!

It just so happens that I used a video talking about the methodology AND about Ryzen.

Adored isn't the only one who can grab something out of someone else's review / video to try and make a point: in this case, about the validity (or lack thereof) of the current methodology for gaming CPU benchmarking.
 
IMO, dropping the resolution in CPU benchmarks is just as bad as using graphs that don't start at zero.
Also, why are you surprised? Because Intel told us that going back in time to 2009* is best? And never mind our ignoring almost a decade of technological progress? :)

*that's when the first quad core came out. Yeap.
Try 2006. ;)

https://www.techpowerup.com/reviews/Intel/QX6700/
(480p gaming benchmarks included :D)
 
Low res, low settings, fast GPU testing BY ITSELF is NOT a good way to show CPU differences.
Settings are the real culprit here; some of them make a game much less CPU-intensive, right? In fact, some of the most popular performance tweaks in games are view distance, detail distance and/or some clutter-density setting... because those take a substantial amount of pressure off both the GPU and the CPU.
In my opinion, low res (+ frame scaling at 0.25 :laugh:), ultra settings (with AA/post-processing/geometry on low), fast GPU testing is a pretty good way to show CPU differences in games... but you've got to find a way to replicate significant stress reliably (in real gameplay).
An exact methodology would require measuring the CPU and GPU hit of each setting, and choosing settings so that the GPU has the least possible amount of work and the CPU the maximum possible amount of work within a frame.
The question is what we gain when all we do is analyze relative performance in this case... one goes through all that trouble just to show a very similar column graph, only slightly stretched :kookoo:
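
A rough sketch of what that "exact methodology" could look like, assuming you could measure a per-setting CPU and GPU frame-time cost (the setting names and numbers below are invented):

Code:
# Hypothetical per-setting costs: added (cpu_ms, gpu_ms) per frame at each level.
settings = {
    "view_distance":   {"low": (1.0, 0.5), "ultra": (4.0, 1.5)},
    "clutter_density": {"low": (0.5, 0.2), "ultra": (3.0, 0.8)},
    "anti_aliasing":   {"off": (0.0, 0.0), "8x":    (0.1, 6.0)},
    "post_processing": {"off": (0.0, 0.0), "ultra": (0.2, 4.5)},
}

def cpu_heavy_preset(settings):
    # For each setting, pick the level that maximizes CPU cost minus GPU cost,
    # i.e. the most CPU work per frame for the least GPU work.
    return {name: max(levels, key=lambda lvl: levels[lvl][0] - levels[lvl][1])
            for name, levels in settings.items()}

print(cpu_heavy_preset(settings))
# -> view distance / clutter on ultra (CPU-heavy), AA and post-processing off (GPU-heavy)

Whether the extra precision buys you anything is the fair question raised above - the relative bars may end up looking much the same, just stretched.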
 
You can't tell me they had chips, tested them, knew they worked, saw the performance, and then unintentionally ignored the issues... they ignored the issues on purpose. I fully expect them to keep ignoring them, too.
This is the sort of thing that fills me with such confidence with AMD products.

/s

I'll stick to Intel and avoid the headaches, thanks.
 
Settings are the real culprit here; some of them make a game much less CPU-intensive, right? In fact, some of the most popular performance tweaks in games are view distance, detail distance and/or some clutter-density setting... because those take a substantial amount of pressure off both the GPU and the CPU.
In my opinion, low res (+ frame scaling at 0.25 :laugh:), ultra settings (with AA/post-processing/geometry on low), fast GPU testing is a pretty good way to show CPU differences in games... but you've got to find a way to replicate significant stress reliably (in real gameplay).
An exact methodology would require measuring the CPU and GPU hit of each setting, and choosing settings so that the GPU has the least possible amount of work and the CPU the maximum possible amount of work within a frame.
The question is what we gain when all we do is analyze relative performance in this case... one goes through all that trouble just to show a very similar column graph, only slightly stretched :kookoo:
I still don't buy it... I don't play at low res. All this does, settings and all, is EXAGGERATE any effect... period. I want to see it with the settings/res we play at... not settings/res that exacerbate the variable we are testing.
 
is EXAGGERATE any effect... period.
Well, the point is to exaggerate the effect when there is not much spread between the graph bar values (to show any FPS difference between Ryzen SKUs, for example)... the argument "why do it when the effect is already measurable at the settings/res we play at" is a valid one when it applies, and I pointed that out in my post.
 
I am saying the exaggeration is the problem, as it doesn't extrapolate to higher settings and resolutions. It is trying to describe an issue which isn't much of one when run at 1080p with Ultra settings or higher (where people actually use these cards and those settings), so to MAKE it (more of an) issue, it's run at very low res with a high-end card and low settings. Makes no sense whatsoever to me. None.

So, if I am understanding you right, yes, it is a valid point, WHEN IT APPLIES, which is for those people who run a 1070+ at lower than 1080p on low settings. Please tell me there are zero people in this world who do that...

In my opinion, low res (+ frame scaling at 0.25 :laugh:), ultra settings (with AA/post-processing/geometry on low), fast GPU testing is a pretty good way to show CPU differences in games
In the end, I totally disagree with this statement. (and of course, that is OK. :))
 
There seems to be some confusion here: I'm asking about the methodology!

The way the current methodology works is that you test CPUs with a very fast GPU @ low resolutions / details (to eliminate the GPU as a variable) in a variety of gaming scenarios: this will tell you whether CPU X is better than CPU Y in gaming (and whichever other CPUs are included in the review), and no faster card you test with in the future will change this outcome.

HOWEVER, Adored has found this is not the case: the example he showed has an FX 8350 going from over 10% slower to 10% faster, which contradicts the methodology's premise.

There's a BIG catch, however, which is what I was trying to get tested: there were changes in the hardware used as well as in drivers and even the games themselves (I didn't mention the games bit in the OP: that's IMO a very BIG variable).

What I was trying to get answered is whether his findings still hold once as many variables as possible are eliminated.

This has serious implications because, if a proper review shows Adored is right, then the methodology is flawed and needs to be scrapped.


The methodology isn't flawed. The only thing flawed is the people READING benchmark results.

A benchmark is the following:
"The measure of performance on a specific piece of hardware at a specific point in time, in a specific setup"

Change parameters in these specifics, and you get different results.

There are several key differences in specifics when you pair an old CPU with a different GPU, or when you pair an old CPU with the same benchmark suite at a different point in time:

- GPU Drivers
- OS version & changes
- GPU architecture (for example, the difference in DX11 CPU overhead between AMD and Nvidia GPUs, even if they are at a similar performance level, would produce surprising results even on similarly performing CPUs; like so: http://www.tomshardware.com/reviews/crossfire-sli-scaling-bottleneck,3471.html)
- API changes

So, bottom line: always use a benchmark that is:
- Relevant (to your use case and your system - the Ryzen release showed how people can forget this, judging a workstation-class CPU by mainstream/lower-core-count CPU standards)
- Recent (same architecture, same OS, same driver branch)

It ain't rocket science, but just quickly glancing over a few graphs DOES NOT give you any good information. Benchmark results need some attention to really grasp what they are trying to tell you.
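
As a small sketch of that "specific setup at a specific point in time" idea (the field names and example values here are just an assumption, not any site's actual format), you could treat the context as data and refuse to compare runs whose context doesn't match:

Code:
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchContext:
    cpu: str
    gpu: str
    gpu_driver: str
    os_build: str
    api: str            # e.g. "DX11", "DX12", "Vulkan"
    game_version: str

def comparable(a: BenchContext, b: BenchContext) -> bool:
    # Two runs are only directly comparable if everything except the CPU matches.
    return (a.gpu, a.gpu_driver, a.os_build, a.api, a.game_version) == \
           (b.gpu, b.gpu_driver, b.os_build, b.api, b.game_version)

old_run = BenchContext("FX 8350", "GTX 680", "310.90", "Windows 7 SP1", "DX11", "1.0")
new_run = BenchContext("FX 8350", "GTX 1080", "378.78", "Windows 10", "DX11", "1.4")

print(comparable(old_run, new_run))   # False: different GPU, driver, OS and game patch

Put that way, the FX 8350 "going from 10% slower to 10% faster" compares two results whose contexts differ in almost every field, which is the whole point about relevance and recency.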
 
So, if I am understanding you right, yes, it is a valid point, WHEN IT APPLIES, which is for those people who run a 1070+ at lower than 1080p on low settings. Please tell me there are zero people in this world who do that...
You, me and everybody else are aware of how we, the people, play our games... we put everything on ultra and the resolution to native; sometimes we curse and swear because of the FPS dips, then adjust the settings only as much as needed :)
In other words, you are not understanding me right, so let me put it the other way... using high resolution when benching CPUs in games is OK (it applies) when the game is CPU-hungry enough that you get a meaningful value range for your graph (for example, if you don't like differences that are fractions of a frame)... if you want a wider value range for your graph in those couple of games (that's when it doesn't apply), you lessen the burden on the GPU and get a less compressed graph. That's fine, because you are trying to analyze relative CPU performance executing that particular game code. It's kind of a game-by-game thing IMHO, done to get a more readable graph.
Now, the argument that we should always and exclusively test CPUs in games at UHD resolution is reasonable to an extent only if lowering the resolution somehow lowers the amount of work the CPU has to do inside a frame. Does it? A valid question, because games usually adjust the level-of-detail system at higher resolutions, but LOD is all about geometry, not draw calls. Tough one.
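
Here's a tiny sketch of that "compressed graph" point (the frame rates are made up): with the same two CPU-limited frame rates, the gap the bars can show shrinks as the GPU limit comes down.

Code:
# Hypothetical CPU-limited frame rates for two chips in one game.
cpu_a, cpu_b = 160.0, 140.0   # fps when the GPU is never the limit

for gpu_limit in (300.0, 150.0, 90.0):     # fps ceiling imposed by the GPU load
    a = min(cpu_a, gpu_limit)
    b = min(cpu_b, gpu_limit)
    print(f"GPU limit {gpu_limit:.0f} fps -> bars show {a:.0f} vs {b:.0f} "
          f"({100 * (a / b - 1):+.1f}% spread)")

With a high ceiling the full gap is visible; as the GPU limit drops the spread compresses, and once both bars hit the ceiling the graph says nothing about the CPUs at all.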
 
but LOD is all about geometry, not draw calls.

But those are actually linked: higher levels of geometry = a higher number of draw calls. Depending on how the batches are handled by the engine it may not impact performance that much, but it still has an effect. Most games today use a fixed level of geometry, though. That's why in some games low vs. high doesn't look that far apart and CPU usage remains almost the same. Things like tessellation can still have this effect if not implemented efficiently; it's the reason why AMD cards, with their weak DX11 drivers in the past that were hammered by this, got outperformed by Nvidia in this category.
 