Thursday, March 31st 2022

GPU Hardware Encoders Benchmarked on AMD RDNA2 and NVIDIA Turing Architectures

Encoding video is one of the significant tasks that modern hardware performs. Today, we have data on AMD and NVIDIA solutions that shows how good GPU hardware encoders are. Thanks to the tech site Chips and Cheese, we have information about AMD's Video Core Next (VCN) encoder found in RDNA2 GPUs and NVIDIA's NVENC (short for NVIDIA Encoder). The site benchmarked AMD's Radeon RX 6900 XT and NVIDIA's GeForce RTX 2060. The AMD card features VCN 3.0, while the NVIDIA Turing card features a 6th generation NVENC design. Team red is represented by its latest design, while a 7th generation of NVENC exists. C&C tested these cards because they are what the reviewer had on hand.

The quality metric used was Video Multimethod Assessment Fusion (VMAF), developed by Netflix. In addition to hardware encoding, the site also tested software encoding with libx264, a software library for encoding video streams into the H.264/MPEG-4 AVC compression format. The libx264 encodes ran on an AMD Ryzen 9 3950X. Benchmark runs included streaming, recording, and transcoding in Overwatch and Elder Scrolls Online.
Below, you can find benchmarks of streaming, recording, and transcoding quality, as well as transcoding speed.

Streaming:
Recording:
Transcoding:
Transcoding Speed:

For details on VCN and NVENC output visuals, please check out the Chips and Cheese website to see the comparison in greater detail.
Source: Chips and Cheese

20 Comments on GPU Hardware Encoders Benchmarked on AMD RDNA2 and NVIDIA Turing Architectures

#1
Lightofhonor
So basically if you want high-bitrate it doesn't really matter, lower bit-rates choose nVidia.
Posted on Reply
#2
RH92
Lightofhonor wrote: So basically if you want high-bitrate it doesn't really matter, lower bit-rates choose nVidia.
Keep in mind this is a 2060 giving a 6900 XT a run for its money... as stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.
Posted on Reply
#3
looniam
RH92 wrote: Keep in mind this is a 2060 giving a 6900 XT a run for its money... as stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.
ampere's improvements are w/ DEcoding (NVDEC): first GPU vendor to support AV1 decode
whereas, for NVENC (page 41 of the whitepaper):
GA10x GPUs include the seventh generation NVENC encoder unit that was introduced with the Turing architecture. With common Twitch and YouTube streaming settings, NVENC-based hardware encoding in GA10x GPUs exceeds the encoding quality of software-based x264 encoders using the Fast preset and is on par with x264 Medium, a preset that typically requires a dual PC setup. This dramatically lowers CPU utilization. 4K encoding is too heavy a workload for a typical CPU setup, but the GA10x NVENC encoder makes high resolution encoding seamless up to 4K on H.264, and even 8K on HEVC.
i had to look and it seems the reviewer knew what they were doing. :)
Posted on Reply
#4
HisDivineOrder
RH92 wrote: Keep in mind this is a 2060 giving a 6900 XT a run for its money... as stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.
This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations and you'd think after the 20 series when Nvidia touted the nvenc improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.
Posted on Reply
#5
noel_fs
HisDivineOrder wrote: This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations and you'd think after the 20 series when Nvidia touted the nvenc improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.
couldn't be more right
Posted on Reply
#6
mechtech
What would have been nice would be testing low end cards hw decoding for power use and lack of stuttering watching Netflix, YouTube, etc at 1080, 1440, 2160, with a crap ton of different codecs. good for htpc and non-gamers with weaker systems.
Posted on Reply
#7
TheoneandonlyMrK
I have no idea what you're looking at. They're all within 10% and AMD is on every graph. Did I miss something? Because I have also never had a video say no either.
Posted on Reply
#8
dragontamer5788


Wait, Wut?

6900 XT doesn't seem to care about Mbps? It just always performs at 45 fps? How strange.
TheoneandonlyMrK wrote: I have no idea what you're looking at. They're all within 10% and AMD is on every graph. Did I miss something? Because I have also never had a video say no either.
VMAF is the "quality" of the transcoding. How many compression artifacts there are, how much like the "original" it all looks like. As you may (or may not) know, modern "compression" is largely about deleting data that humans "probably" won't see or notice if the data was missing. So a modern encoder is about finding all of these bits of the picture that can be deleted.

VMAF is an automatic tool that tries to simulate human vision. At 100, it says that humans won't be able to tell the difference. At 0, everyone will immediately notice.
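
For anyone who wants to score their own encodes, here is a minimal sketch of how a VMAF run is commonly launched through ffmpeg's libvmaf filter. It assumes an ffmpeg build with libvmaf enabled and two clips of the same resolution and frame rate; the file names and the `vmaf_command` helper are made up for illustration, and this is not necessarily how Chips and Cheese ran their tests.

```python
import shlex


def vmaf_command(distorted, reference, log_path="vmaf.xml"):
    """Build an ffmpeg command that scores `distorted` against `reference`.

    Hypothetical helper: it only builds the argument list, it does not run ffmpeg.
    """
    # libvmaf treats its first input as the encoded (distorted) clip
    # and its second input as the untouched reference.
    vmaf_filter = f"libvmaf=log_path={log_path}:log_fmt=xml"
    return [
        "ffmpeg",
        "-i", distorted,
        "-i", reference,
        "-lavfi", vmaf_filter,
        "-f", "null", "-",  # discard the video output, keep only the score log
    ]


cmd = vmaf_command("encoded.mp4", "original.mp4")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.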

--------

The other graphs at play here are speed.

Software / CPU has the highest VMAF, especially at "very slow" settings.

NVidia is the 2nd best.

AMD is 3rd / the worst at VMAF.

-------

Software / CPU is the slowest.

AMD is fast at high Mbps. NVidia is fast at low Mbps.
Posted on Reply
#9
TheoneandonlyMrK
dragontamer5788 wrote:

Wait, Wut?

6900 XT doesn't seem to care about Mbps? It just always performs at 45 fps? How strange.



VMAF is the "quality" of the transcoding. How many compression artifacts there are, how much like the "original" it all looks like. As you may (or may not) know, modern "compression" is largely about deleting data that humans "probably" won't see or notice if the data was missing. So a modern encoder is about finding all of these bits of the picture that can be deleted.

VMAF is an automatic tool that tries to simulate human vision. At 100, it says that humans won't be able to tell the difference. At 0, everyone will immediately notice.

--------

The other graphs at play here are speed.

Software / CPU has the highest VMAF, especially at "very slow" settings.

NVidia is the 2nd best.

AMD is 3rd / the worst at VMAF.

-------

Software / CPU is the slowest.

AMD is fast at high Mbps. NVidia is fast at low Mbps.
But not in the range of human perception.
I think I need to look on a bigger screen. Thanks for the info, I'm not that clued up.
Posted on Reply
#10
RedBear
I applaud the efforts to compare the hardware encoders, it's a point that gets very often ignored in regular reviews (as it was quite apparent when the RX 6500 XT came out and many reviewers just glimpsed the lack of hardware encoding and decoding capabilities), but comparing a low-middle range GPU of the previous generation against the latest and best of the competitor makes the whole attempt kind of specious.
Posted on Reply
#11
dragontamer5788
TheoneandonlyMrK wrote: But not in the range of human perception.
I think I need to look on a bigger screen. Thanks for the info, I'm not that clued up.
chipsandcheese.com/2022/03/30/gpu-hardware-video-encoders-how-good-are-they/

The original chips-and-cheese website has the actual pictures from the tests if you wanna see the difference.
RedBear wrote: I applaud the efforts to compare the hardware encoders, it's a point that gets very often ignored in regular reviews (as it was quite apparent when the RX 6500 XT came out and many reviewers just glimpsed the lack of hardware encoding and decoding capabilities), but comparing a low-middle range GPU of the previous generation against the latest and best of the competitor makes the whole attempt kind of specious.
Smaller, niche websites don't have as much hardware to do a "proper apples to apples test".

I'm glad that they have the time to do a test at all. But yeah, AMD 6900 xt vs NVidia 1080 or 2060 is a bit of a funny comparison.
Posted on Reply
#12
Xajel
HisDivineOrder wrote: This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations and you'd think after the 20 series when Nvidia touted the nvenc improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.
I don't want AMD to copy NVENC. I want them to compete against both NV and Apple Silicon and surpass both. Both companies do a much better job in this regard, not just for consumer applications but also for professional applications (video editing); just see what Apple Silicon can do when the software is optimized for it.

The other main thing AMD needs to focus much more on is the content creation side of things, not just video editing/processing but also accelerated 3D rendering. NV is doing so well here that it's a no-brainer for any content creator to use NV solutions. AMD tried to dive into this market but it's still far from what NV has achieved. And now Apple is entering the game very strongly with their silicon.

Just see how Blender and V-Ray render using NV OptiX. It can accelerate using all the resources possible, especially the RT cores to accelerate RT rendering, and they have been doing it almost since RTX launched. AMD just added support for only RDNA2 in Blender 3.0, still without RT acceleration, and nothing in other major 3D renderers like V-Ray.
Posted on Reply
#13
Mysteoa
This is something I have been aware of for a long time. I don't know why AMD doesn't care much for the encoding/decoding on their GPUs. I expected them to have a bigger push for streamers to adopt their GPUs this gen, but it looks like they don't care about that market. To me, it looks like they don't put in as many resources as they do for their CPUs. It's somewhat understandable when you compare how much margin a CPU has compared to a GPU and how much software it needs. It would have also boosted their CPU sales if their GPUs were also doing better.
Posted on Reply
#14
looniam
Mysteoa wrote: This is something I have been aware of for a long time. I don't know why AMD doesn't care much for the encoding/decoding on their GPUs. I expected them to have a bigger push for streamers to adopt their GPUs this gen, but it looks like they don't care about that market. To me, it looks like they don't put in as many resources as they do for their CPUs. It's somewhat understandable when you compare how much margin a CPU has compared to a GPU and how much software it needs. It would have also boosted their CPU sales if their GPUs were also doing better.
they have come a long way in the last decade:
www.tomshardware.com/reviews/video-transcoding-amd-app-nvidia-cuda-intel-quicksync,2839.html
pretty sad when sandy bridge's igpu (first gen QS) beat them.

however, for most uses, like some gamerz on twitch, anything is fine. professionals will want software encoding as usual until a miracle happens!
Posted on Reply
#15
elghinnarisa
The scale is modeled on human testing, where viewers are put in front of a TV at specific distances and watch a video. They then rate the quality as "Bad", "Poor", "Fair", "Good", or "Excellent",
where excellent is mapped to 100 and bad is mapped to 0.
At around ~85 there is a perceivable loss in quality, but it's generally not annoying.
At around ~75 it starts to become annoying.
And 60 or below is pretty terrible, but could still be acceptable in terms of filesize-to-quality ratio.
Generally one would aim somewhere between 92-95 as a good target to produce videos with a decent amount of compression that are generally smaller than the original. Though no one would use x264 for that today; it's old and useless. VP9 or x265 at a minimum, both of which are less sensitive to thread count.
You can still quite easily notice a difference even at higher score values, like >98.x. Whether that noticeable difference matters is rather subjective.
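
The rough bands above can be sketched as a small lookup. The cut-offs are only the approximate figures quoted in this post (60, 75, 85 and the 92-95 target), not official VMAF thresholds, and the function name `describe_vmaf` is made up for illustration.

```python
def describe_vmaf(score):
    """Map a VMAF score (0-100) to the approximate bands quoted above."""
    if not 0 <= score <= 100:
        raise ValueError("VMAF scores range from 0 to 100")
    if score <= 60:
        return "pretty terrible"
    if score <= 75:
        return "annoying loss in quality"
    if score <= 85:
        return "perceivable loss, generally not annoying"
    if score < 92:
        return "below the suggested 92-95 target"
    return "within or above the suggested 92-95 target"


for value in (43.5, 77.2, 94.9):
    print(value, "->", describe_vmaf(value))
```

Running a mean and a minimum score through the same function is a quick way to see when an encode looks fine on average but still dips into a noticeable band.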
They also do not report all the scores and metrics VMAF has. The "score" is just the mean value. Min, max, mean, and harmonic mean are the values you generally get out. I find the others more important than the mean in most cases, since the difference between them can be rather large. See this example of the output you get:


<metric name="motion2" min="0.000000" max="17.513954" mean="9.934909" harmonic_mean="5.748033" />
<metric name="motion" min="0.000000" max="20.251728" mean="10.346691" harmonic_mean="6.487637" />
<metric name="adm2" min="0.440108" max="0.938360" mean="0.731659" harmonic_mean="0.711730" />
<metric name="adm_scale0" min="0.718875" max="0.910788" mean="0.815849" harmonic_mean="0.814638" />
<metric name="adm_scale1" min="0.438301" max="0.895599" mean="0.670334" harmonic_mean="0.654797" />
<metric name="adm_scale2" min="0.327092" max="0.939129" mean="0.684322" harmonic_mean="0.650081" />
<metric name="adm_scale3" min="0.387338" max="0.970459" mean="0.775667" harmonic_mean="0.750280" />
<metric name="vif_scale0" min="0.014387" max="0.408880" mean="0.180496" harmonic_mean="0.164509" />
<metric name="vif_scale1" min="0.041550" max="0.835502" mean="0.434883" harmonic_mean="0.372053" />
<metric name="vif_scale2" min="0.067117" max="0.913851" mean="0.534008" harmonic_mean="0.461327" />
<metric name="vif_scale3" min="0.113093" max="0.950210" mean="0.615081" harmonic_mean="0.546858" />
<metric name="vmaf" min="5.100550" max="87.312631" mean="43.527449" harmonic_mean="26.256242" />
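
A minimal sketch of reading one of these lines with Python's standard library, to show why the harmonic mean is worth looking at. The `vmaf` line is copied from the output above; the per-frame scores at the end are made-up numbers, used only to show how a single bad frame drags the harmonic mean down much further than the plain mean.

```python
import statistics
import xml.etree.ElementTree as ET

# The "vmaf" line from the output above.
line = ('<metric name="vmaf" min="5.100550" max="87.312631" '
        'mean="43.527449" harmonic_mean="26.256242" />')


def parse_metric(xml_line):
    """Return (name, stats) for a single <metric .../> element."""
    elem = ET.fromstring(xml_line)
    keys = ("min", "max", "mean", "harmonic_mean")
    return elem.get("name"), {key: float(elem.get(key)) for key in keys}


name, stats = parse_metric(line)
gap = stats["mean"] - stats["harmonic_mean"]
print(f"{name}: mean={stats['mean']:.2f} "
      f"harmonic_mean={stats['harmonic_mean']:.2f} gap={gap:.2f}")

# Made-up per-frame scores: three good frames and one bad one.
frames = [95.0, 95.0, 95.0, 20.0]
print("mean:", statistics.mean(frames))
print("harmonic mean:", statistics.harmonic_mean(frames))
```

A large gap between the two values, as in the sample line, hints that some frames scored far below the average.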
And of course, one doesn't really use bitrate arguments these days unless you have a very specific purpose; constant rate factor is where it's at for producing far superior results.
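
To make the bitrate vs. constant rate factor distinction concrete, here is a sketch of the two ways an x264 encode is usually requested from ffmpeg. The helper `x264_args`, the file names, and the specific CRF and bitrate values are illustrative assumptions, not settings taken from the article. Note that live streaming generally still needs a bitrate target, since platforms and upload links cap bandwidth.

```python
def x264_args(source, output, preset="medium", crf=None, bitrate_kbps=None):
    """Build ffmpeg arguments for libx264 using either CRF or a target bitrate."""
    if (crf is None) == (bitrate_kbps is None):
        raise ValueError("choose exactly one of crf or bitrate_kbps")
    args = ["ffmpeg", "-i", source, "-c:v", "libx264", "-preset", preset]
    if crf is not None:
        # Quality-targeted: the encoder spends as many bits as each scene needs.
        args += ["-crf", str(crf)]
    else:
        # Size-targeted: the encoder aims for a fixed average bitrate.
        args += ["-b:v", f"{bitrate_kbps}k"]
    return args + [output]


archive = x264_args("capture.mkv", "archive.mp4", preset="veryslow", crf=20)
stream = x264_args("capture.mkv", "stream.mp4", preset="veryfast", bitrate_kbps=6000)
print(" ".join(archive))
print(" ".join(stream))
```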

I did test NVENC encoding vs CPU encoding in PowerDirector at the start, but I had such a small sample size that I decided to wait until later. If anything, though, GPU encodes consistently averaged higher scores in the 5 test videos I used at the time; further testing is needed.
Work on said scripts is halted while I learn Python, so I can better produce a cross-platform tool for measurements and crunching and all the other fancy stuff I wanted to do with it.

Other notes I found so far:
Most video files that many modern games have these days are way too large for their own good.
For those who like anime: not a single fansub group is able to produce a reasonable quality-to-filesize ratio (their videos are ~80% larger than they should be). Exceptions are some of the few x265 re-encoders out there.
VP9 is super nice to use compared to x265, though x265 is nice and speedy and x264 is just laughable today in how horrible it is.
If you use VP9 then enabling tiling and throwing extra threads at it for higher encoding speeds does not impact quality at all

Here are some of the later results I had, and that is indeed a file ending up as 15% of the original size with a metric of 94 for both mean and harmonic mean. Though as one can see, the minimum values are starting to drop, and that can be noticeable.
So many things today are just made with static bitrate values with no concern for filesize and because of that, a lot of videos are rather bloated today. Which was why I started to play around with this to start with a few months back. Because I was curious how bloated (or non-bloated) things were.

Original size: 1447982328 bytes

size in bytes  min        max  mean       harmonic_mean  ratio                 percent of original size
322606578      81.591395  100  95.671001  95.613573      2.96556262408264E-07  22.28%
306821699      80.697832  100  95.578825  95.518704      3.11512599374531E-07  21.19%
288320663      80.11331   100  95.464203  95.40166       3.31104271219021E-07  19.91%
269377784      79.701384  100  95.353233  95.287672      3.53975860904699E-07  18.60%
258002177      78.624644  100  95.238739  95.170078      3.69139284433247E-07  17.82%
244100458      78.393402  100  95.130818  95.059777      3.89719948825332E-07  16.86%
222567224      77.236843  100  94.895375  94.817218      4.2636724893509E-07   15.37%
Posted on Reply
#16
Sipu
So integrated GPU is worse? Or better? Is a smart TV worse than... like a Shield or some other Android box? This raises more questions than it answers.
Posted on Reply
#17
tony359
Wait - I’ve always thought that software encoding via CPU was the best solution (but not the fastest) at least with my 1050Ti card.
The review here seems to suggest otherwise?
Posted on Reply
#18
Aquinus
Resident Wat-man
These ASICs are common between different GPUs of the same generation aren't they? So regardless if you're using a 2060 or a 2080 Ti, wouldn't the encoding/decoding block still perform the same since it's the same circuit in both GPUs? I would expect something similar with RDNA2 chips with the same VCN block.
Posted on Reply
#19
dragontamer5788
tony359 wrote: Wait - I’ve always thought that software encoding via CPU was the best solution (but not the fastest) at least with my 1050Ti card.
The review here seems to suggest otherwise?
CPU still has highest VMAF in the tests as described in the article.

But to get there, you need to run the CPU encoder under 'very slow' settings. For streaming, that's too slow to be useful. So they used 'very fast' settings (worse quality) for the streaming test.
Posted on Reply
#20
msimax
dragontamer5788 wrote: CPU still has highest VMAF in the tests as described in the article.

But to get there, you need to run the CPU encoder under 'very slow' settings. For streaming, that's too slow to be useful. So they used 'very fast' settings (worse quality) for the streaming test
i run 1080p slow and even medium x264 for live streaming and it's very useful
Posted on Reply