
Intel's New Tool Measures Video Game Image Quality in Real Time

AleksandarK

News Editor
Intel has introduced the Computer Graphics Visual Quality Metric (CGVQM) tool, the first open-source metric specifically designed to evaluate real-time game imagery with nearly the same accuracy as humans. At its core is the CG-VQD dataset, which contains 80 three-second clips from 15 open-source 3D scenes, ranging from Amazon's Bistro demo to custom environments like House and Bridge, each processed by one of six modern rendering methods, such as neural supersampling, path tracing, or Gaussian splatting. Intel researchers began with a pre-trained 3D ResNet‑18 network and fine-tuned its channel weights to align its outputs with volunteer ratings. The result, CGVQM‑5, outperforms all existing full‑reference metrics and comes very close to matching human agreement.

Behind the scenes, CGVQM splits each video into smaller patches, extracts key visual features using a 3D ResNet backbone, and then adjusts a small set of channel-wise weights so that its predicted scores closely match the quality ratings given by human testers. CGVQM‑5 taps all five ResNet blocks for maximum accuracy. To make the tool practical for fast build pipelines, the team also created CGVQM‑2, a lighter variant that uses only the first two ResNet blocks. By dropping the deeper features, it runs substantially faster while still beating every rival metric. Both variants produce error maps that clearly highlight artifacts such as ghosting or flicker, letting artists spot and fix issues without running full user tests. Game developers can clone the GitHub repository and add Vulkan hooks or Unreal Engine plugins to integrate CGVQM directly into their workflows, enabling them to run evaluations on the fly.
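
For readers curious what that recipe looks like in code, the sketch below illustrates the general approach described above: compare a reference clip and a rendered clip in the feature space of a pre-trained 3D ResNet‑18 and weight each channel's contribution with learned scalars. It is an illustrative approximation, not Intel's actual implementation; the channel_weights values, the quality_error helper, and the final pooling step are placeholders.

# Illustrative sketch only (not Intel's code): channel-weighted feature
# distance between a reference and a distorted clip using a 3D ResNet-18.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

backbone = r3d_18(weights=R3D_18_Weights.KINETICS400_V1).eval()
# Feature stages: the stem plus the four residual blocks ("all five ResNet blocks").
stages = nn.ModuleList([backbone.stem, backbone.layer1, backbone.layer2,
                        backbone.layer3, backbone.layer4])

# Hypothetical per-channel weights, one vector per stage; in CGVQM these would
# be calibrated so the pooled distance tracks human quality ratings.
channel_weights = [torch.ones(c) for c in (64, 64, 128, 256, 512)]

def quality_error(reference, distorted):
    """reference/distorted: (N, 3, T, H, W) video tensors in [0, 1]."""
    err_maps, x_r, x_d = [], reference, distorted
    with torch.no_grad():
        for stage, w in zip(stages, channel_weights):
            x_r, x_d = stage(x_r), stage(x_d)
            diff = (x_r - x_d) ** 2                 # per-channel squared error
            w = w.view(1, -1, 1, 1, 1)              # broadcast over N, T, H, W
            err_maps.append((w * diff).sum(dim=1))  # spatio-temporal error map
    # Pool the finest-resolution map into a single scalar per clip.
    return err_maps[0].mean(dim=(1, 2, 3)), err_maps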



View at TechPowerUp Main Site | Source
 
Neat! Puts a tangible number on the combination of framerate + image quality.
I envision new "CGVQM score" and "CGVQM score per store price" benchmark charts.
Intel must think they're gonna score well.
 
If this can do real-time quality analysis, is quick enough, and has low enough overhead, I'd be interested in seeing it used for an "Adaptive Quality" setting in games: either target a level of quality at the expense of lower framerates, or target a set framerate and have it reduce the quality of the things that are least perceptible first to maintain that framerate.
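
Something like this hypothetical control loop is what I have in mind; the setting names, thresholds, and the per-frame CGVQM error field are all made up for the sake of illustration and have nothing to do with Intel's tool:

# Hypothetical "Adaptive Quality" controller: drop the least perceptible
# settings first when over the frame-time budget, restore them when the
# perceptual error gets too high. All names and numbers are invented.
from dataclasses import dataclass

# Settings ordered from least to most perceptible impact (assumed ordering).
SETTINGS = ["volumetric_fog", "shadow_resolution", "reflection_quality", "render_scale"]

@dataclass
class Frame:
    time_ms: float      # measured frame time
    cgvqm_error: float  # hypothetical per-frame error from a CGVQM-like metric

def adapt(frame: Frame, levels: dict, target_ms: float = 16.7, max_error: float = 20.0):
    """Lower one low-perceptibility setting when over budget, raise one back
    when the quality error drifts too high."""
    if frame.time_ms > target_ms:
        for s in SETTINGS:                      # reduce least perceptible first
            if levels[s] > 0:
                levels[s] -= 1
                break
    elif frame.cgvqm_error > max_error:
        for s in reversed(SETTINGS):            # restore most perceptible first
            if levels[s] < 3:
                levels[s] += 1
                break
    return levels

levels = {s: 3 for s in SETTINGS}
levels = adapt(Frame(time_ms=21.4, cgvqm_error=8.0), levels)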
 
That's pretty cool. I wonder how well it would work with other art styles, like toon shading, though.

It's measuring quality by comparing two videos, not by arbitrarily grading them. Both the reference and the video being compared to it would share the same art style.


Reading the source, it unfortunately appears this metric only outputs error maps, giving each error a score of 0–100 (lower being worse), but does not calculate a final score. It's far past due that reviews start including objective visual quality measurements, as we increasingly have technologies that compromise visual quality and game engines that silently degrade texture quality or resort to texture pop-in when hitting VRAM limitations. There is a need to ensure visual quality is consistent across all cards and to give the end user a hard number for the trade-offs they are making with given settings.

Neat! Puts a tangible number on the combination of framerate + image quality.
I envision new "CGVQM score" and "CGVQM score per store price" benchmark charts.
Intel must think they're gonna score well.

As far as I can see from the GitHub, it doesn't assign a final score; it only scores individual errors. Someone could in theory use this data to make a score, but the lack of an official one is not going to help adoption.

I'm not really sure about their approach of calibrating this metric on the responses of human participants either, without knowing how exactly they collected those responses and how many people we are talking about.

Ultimately this has to be thoroughly assessed before being put to use.
 
Makes sense.
Comparative scores can be just as helpful. E.g. one could take the best CGVQM score as 100%, and see how the rest stack up.
(There are rating frameworks for relative scores, e.g. en.wikipedia.org/wiki/Elo_rating_system , but I wouldn't overcomplicate it.)
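
A quick illustration of what I mean, with made-up card names and scores, normalizing against the best result:

# Relative scoring: treat the best (hypothetical) CGVQM result as 100%.
scores = {"Card A": 92.4, "Card B": 88.1, "Card C": 79.6}   # invented numbers
best = max(scores.values())
relative = {card: round(100 * s / best, 1) for card, s in scores.items()}
print(relative)   # {'Card A': 100.0, 'Card B': 95.3, 'Card C': 86.1}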
 
Intel researchers began with a pre-trained 3D ResNet‑18 network and fine-tuned its channel weights
Huh, that's a really small model. I personally have been using ResNet for most of my academic work, but have been planning to move to other transformer-based backbones due to their extra "granularity".
 
It's measuring quality by comparing two videos, not by arbitrarily grading them. Both the reference and the video being compared to it would share the same art style.


Reading the source, it unfortunately appears this metric only outputs error maps, giving each error a score of 0–100 (lower being worse), but does not calculate a final score. It's far past due that reviews start including objective visual quality measurements, as we increasingly have technologies that compromise visual quality and game engines that silently degrade texture quality or resort to texture pop-in when hitting VRAM limitations. There is a need to ensure visual quality is consistent across all cards and to give the end user a hard number for the trade-offs they are making with given settings.



As far as I can see from the GitHub, it doesn't assign a final score; it only scores individual errors. Someone could in theory use this data to make a score, but the lack of an official one is not going to help adoption.

I'm not really sure about their approach of calibrating this metric on the responses of human participants either, without knowing how exactly they collected those responses and how many people we are talking about.

Ultimately this has to be thoroughly assessed before being put to use.
It's trivial to derive a mean and standard deviation from that.
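
Rough sketch of what I mean, using a small random array as a stand-in for the tool's actual error-map output (I'm assuming a frames × height × width layout on the 0–100 scale mentioned above):

# Pool a per-pixel error map into summary statistics.
import numpy as np

error_map = np.random.rand(90, 270, 480) * 100   # stand-in: (frames, height, width)
mean_error = error_map.mean()                    # overall mean
std_error = error_map.std()                      # overall standard deviation
per_frame = error_map.mean(axis=(1, 2))          # one value per frame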
 
It's trivial to derive a mean and standard deviation from that.

Creating a mean only from the errors spotted and graded by the Intel tool is not a proper way to build a frame-quality metric.

If you look at other video quality analysis tools, they analyze the quality of every frame, not just spotted errors. You cannot come to any conclusion about overall frame quality from just the number of errors and their severity, because that is heavily influenced by video length: one video might have 1 error across 10,000 frames, while a short video has 1 error across 1,000 frames. Even if you created a score that accounts for total frames versus the number of errors, it still lacks per-frame quality. A particular game can have lower-quality individual frames but fewer errors. That is why any frame-quality metric without per-frame values is not suited to assessing overall quality, and why I said it was possible to create a score from total frames versus errors but cautioned that it would be insufficient for any real analysis.
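
To put numbers on it, using the invented clip lengths and error counts from my example:

# Same raw error count, very different error rates depending on clip length.
clips = {
    "long_clip":  {"frames": 10_000, "errors": 1},
    "short_clip": {"frames": 1_000,  "errors": 1},
}
for name, c in clips.items():
    rate = c["errors"] / c["frames"]
    print(f"{name}: {rate:.4%} of frames flagged")   # 0.0100% vs 0.1000%
# A ten-fold difference in error rate from identical error counts, and neither
# number says anything about the quality of the unflagged frames.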

Thus this tool is clearly targeted at developers. Its purpose is to find errors and assign them a severity score, not to judge the quality of every frame. While I would love a tool designed to assess frame quality from video games, this is absolutely not that.
 
Interesting concept, and it could be rather revealing in a world of frame gen, RT, and dynamic upscaling. Still, at the end of the day, what should matter is a bit subjective--does the user notice the loss of quality, and do they care? That answer is going to vary wildly. Personally, I only cared when it pulled me out of the game, though I do appreciate the occasional funny rendering errors that can occur--like floating architecture because the columns didn't get filled or something like that.
 
Unless this is open source, expect copycats, and probably better implementations.
 
Unless this is open source, expect copycats, and probably better implementations.
I suggest you re-read the OP; it has a really interesting hyperlink in there, no need to even open the actual sources.
 
I suggest you re-read the OP; it has a really interesting hyperlink in there, no need to even open the actual sources.
Yeah, I didn't click the link to look.
 
the first open-source metric specifically designed to evaluate real-time game imagery with nearly the same accuracy as humans

Okay but why does it not compare to NATIVE resolution?
Imo this tool is pointless, as its basis is a biased, subjective human grouping, whose eyes can vary to a large degree: near-sighted, short-sighted, and somewhat tunnel-visioned during gaming.
 