
GPU Hardware Encoders Benchmarked on AMD RDNA2 and NVIDIA Turing Architectures

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,232 (0.91/day)
Video encoding is one of the significant tasks that modern hardware performs. Thanks to Chips and Cheese (C&C), we now have data showing how good the GPU hardware encoders from AMD and NVIDIA are: AMD's Video Core Next (VCN) encoder found in RDNA2 GPUs and NVIDIA's NVENC (short for NVIDIA Encoder). The site benchmarked AMD's Radeon RX 6900 XT and NVIDIA's GeForce RTX 2060. The AMD card features VCN 3.0, while the NVIDIA Turing card features a 6th generation NVENC design. Team red is therefore represented by its latest encoder, while a 7th generation of NVENC already exists. C&C tested these two cards because they are what the reviewer had on hand.

Encoding quality was measured with Video Multimethod Assessment Fusion (VMAF), a metric developed by Netflix. In addition to the hardware encoders, the site also tested software encoding with libx264, a software library for encoding video streams into the H.264/MPEG-4 AVC compression format. The libx264 software encoding ran on an AMD Ryzen 9 3950X. Benchmark runs included streaming, recording, and transcoding in Overwatch and Elder Scrolls Online.

Below, you can find benchmarks of streaming, recording, transcoding, and transcoding speed.

[Charts: Streaming, Recording, Transcoding, Transcoding Speed]


To compare the VCN and NVENC output visuals in greater detail, please check out the Chips and Cheese website.

 
Joined
Dec 19, 2008
Messages
284 (0.05/day)
Location
WA, USA
System Name Desktop
Processor AMD Ryzen 5950X
Motherboard ASUS Strix B450-I
Cooling be quiet! Dark Rock TF 2
Memory 32GB DDR4 3600
Video Card(s) AMD RX 6800
Storage 480GB MyDigitalSSD NVME
Display(s) AOC CU34G2X
Power Supply 850w
Mouse Razer Basilisk V3
Keyboard Steelseries Apex 5
So basically, if you want high bitrate it doesn't really matter; at lower bitrates, choose NVIDIA.
 
Joined
Oct 4, 2017
Messages
695 (0.29/day)
Location
France
Processor RYZEN 7 5800X3D
Motherboard Aorus B-550I Pro AX
Cooling HEATKILLER IV PRO , EKWB Vector FTW3 3080/3090 , Barrow res + Xylem DDC 4.2, SE 240 + Dabel 20b 240
Memory Viper Steel 4000 PVS416G400C6K
Video Card(s) EVGA 3080Ti FTW3
Storage XPG SX8200 Pro 512 GB NVMe + Samsung 980 1TB
Display(s) Dell S2721DGF
Case NR 200
Power Supply CORSAIR SF750
Mouse Logitech G PRO
Keyboard Meletrix Zoom 75 GT Silver
Software Windows 11 22H2
So basically, if you want high bitrate it doesn't really matter; at lower bitrates, choose NVIDIA.

Keep in mind this is a 2060 giving a 6900 XT a run for its money... As stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.
 
Joined
Jan 27, 2015
Messages
1,065 (0.32/day)
System Name loon v4.0
Processor i7-11700K
Motherboard asus Z590TUF+wifi
Cooling Custom Loop
Memory ballistix 3600 cl16
Video Card(s) eVga 3060 xc
Storage WD sn570 1tb(nvme) SanDisk ultra 2tb(sata)
Display(s) cheap 1080&4K 60hz
Case Roswell Stryker
Power Supply eVGA supernova 750 G6
Mouse eats cheese
Keyboard warrior!
Benchmark Scores https://www.3dmark.com/spy/21765182 https://www.3dmark.com/pr/1114767
Keep in mind this is a 2060 giving a 6900 XT a run for its money... As stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.
Ampere's improvements were with DEcoding (NVDEC): first GPU vendor to support AV1 decode.
Whereas for NVENC (page 41 of the whitepaper):
GA10x GPUs include the seventh generation NVENC encoder unit that was introduced with the Turing architecture. With common Twitch and YouTube streaming settings, NVENC-based hardware encoding in GA10x GPUs exceeds the encoding quality of software-based x264 encoders using the Fast preset and is on par with x264 Medium, a preset that typically requires a dual PC setup. This dramatically lowers CPU utilization. 4K encoding is too heavy a workload for a typical CPU setup, but the GA10x NVENC encoder makes high resolution encoding seamless up to 4K on H.264, and even 8K on HEVC.

i had to look and it seems the reviewer knew what they were doing. :)
 
Joined
Aug 23, 2013
Messages
547 (0.14/day)
Keep in mind this is a 2060 giving a 6900 XT a run for its money... As stated in the article, the 2060 is on the 6th gen of NVENC, while Ampere brings further improvements. I don't think there is any choice when it comes to streaming; Nvidia is the only choice.

This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations, and you'd think that after the 20 series, when Nvidia touted the NVENC improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.
 
Joined
May 31, 2017
Messages
417 (0.17/day)
Processor Ryzen 5700X
Motherboard Gigabyte B550 Arous Elite V2
Cooling Thermalright PA120
Memory Kingston FURY Renegade 3600Mhz @ 3733 tight timings
Video Card(s) Sapphire Pulse RX 6800
Storage 36TB
Display(s) Samsung QN90A
Case be quiet! Dark Base Pro 900
Audio Device(s) Khadas Tone Pro 2, HD660s, KSC75, JBL 305 MK1
Power Supply Coolermaster V850 Gold V2
Mouse Roccat Burst Pro
Keyboard Dogshit with Otemu Brown
Software W10 LTSC 2021
This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations, and you'd think that after the 20 series, when Nvidia touted the NVENC improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.
Couldn't be more right.
 
Joined
Dec 26, 2006
Messages
3,530 (0.56/day)
Location
Northern Ontario Canada
Processor Ryzen 5700x
Motherboard Gigabyte X570S Aero G R1.1 BiosF5g
Cooling Noctua NH-C12P SE14 w/ NF-A15 HS-PWM Fan 1500rpm
Memory Micron DDR4-3200 2x32GB D.S. D.R. (CT2K32G4DFD832A)
Video Card(s) AMD RX 6800 - Asus Tuf
Storage Kingston KC3000 1TB & 2TB & 4TB Corsair LPX
Display(s) LG 27UL550-W (27" 4k)
Case Be Quiet Pure Base 600 (no window)
Audio Device(s) Realtek ALC1220-VB
Power Supply SuperFlower Leadex V Gold Pro 850W ATX Ver2.52
Mouse Mionix Naos Pro
Keyboard Corsair Strafe with browns
Software W10 22H2 Pro x64
What would have been nice is testing low-end cards' hardware decoding for power use and lack of stuttering when watching Netflix, YouTube, etc. at 1080p, 1440p, and 2160p, with a crap ton of different codecs. Good for HTPCs and non-gamers with weaker systems.
 
Joined
Mar 10, 2010
Messages
11,878 (2.30/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
I have no idea what you're looking at. They're all within 10% and AMD is on every graph. Did I miss something? Because I've also never had a video say no either.
 
Joined
Apr 24, 2020
Messages
2,563 (1.75/day)
1648760420724.png


Wait, Wut?

The 6900 XT doesn't seem to care about Mbps? It just always performs at 45 fps? How strange.

I have no idea what you're looking at. They're all within 10% and AMD is on every graph. Did I miss something? Because I've also never had a video say no either.

VMAF is the "quality" of the transcoding. How many compression artifacts there are, how much like the "original" it all looks like. As you may (or may not) know, modern "compression" is largely about deleting data that humans "probably" won't see or notice if the data was missing. So a modern encoder is about finding all of these bits of the picture that can be deleted.

VMAF is an automatic tool that tries to simulate human vision. At 100, it says that humans won't be able to tell the difference. At 0, everyone will immediately notice.
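
For anyone who wants to score their own encodes, here is a minimal sketch of how a VMAF run is usually launched through ffmpeg's libvmaf filter. It only builds and prints the command. The file names are placeholders, and it assumes an ffmpeg build with libvmaf enabled and two clips with the same resolution and frame rate; none of this is taken from the Chips and Cheese test setup.

```python
import shlex


def vmaf_command(distorted, reference, log_path="vmaf.json"):
    """Build an ffmpeg command that scores `distorted` against `reference`."""
    # libvmaf expects the distorted (encoded) clip as the first input
    # and the untouched reference clip as the second input.
    vmaf_filter = f"libvmaf=log_fmt=json:log_path={log_path}"
    return [
        "ffmpeg",
        "-i", distorted,
        "-i", reference,
        "-lavfi", vmaf_filter,
        "-f", "null", "-",  # discard the video output, keep only the scores
    ]


# Placeholder file names, purely for illustration.
cmd = vmaf_command("encoded.mp4", "original.mp4")
print(shlex.join(cmd))
```

Pass the resulting list to `subprocess.run(cmd, check=True)` to actually run it; the per-frame and pooled scores end up in the log file.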

--------

The other graphs at play here are speed.

Software / CPU has the highest VMAF, especially at "very slow" settings.

NVidia is the 2nd best.

AMD is 3rd / the worst at VMAF.

-------

Software / CPU is the slowest.

AMD is fast at high Mbps. NVidia is fast at low Mbps.
 
Joined
Mar 10, 2010
Messages
11,878 (2.30/day)

Wait, Wut?

The 6900 XT doesn't seem to care about Mbps? It just always performs at 45 fps? How strange.



VMAF is the "quality" of the transcoding. How many compression artifacts there are, how much like the "original" it all looks like. As you may (or may not) know, modern "compression" is largely about deleting data that humans "probably" won't see or notice if the data was missing. So a modern encoder is about finding all of these bits of the picture that can be deleted.

VMAF is an automatic tool that tries to simulate human vision. At 100, it says that humans won't be able to tell the difference. At 0, everyone will immediately notice.

--------

The other graphs at play here are speed.

Software / CPU has the highest VMAF, especially at "very slow" settings.

NVidia is the 2nd best.

AMD is 3rd / the worst at VMAF.

-------

Software / CPU is the slowest.

AMD is fast at high Mbps. NVidia is fast at low Mbps.
But not in the range of human perception.
I think I need to look on a bigger screen. Thanks for the info, I'm not that clued up.
 
Joined
Apr 11, 2021
Messages
214 (0.19/day)
I applaud the efforts to compare the hardware encoders; it's a point that very often gets ignored in regular reviews (as was quite apparent when the RX 6500 XT came out and many reviewers only glanced at the lack of hardware encoding and decoding capabilities), but comparing a low-to-mid-range GPU of the previous generation against the latest and best of the competitor makes the whole attempt kind of specious.
 
Joined
Apr 24, 2020
Messages
2,563 (1.75/day)
But not in the range of human perception.
I think I need to look on a bigger screen. Thanks for the info, I'm not that clued up.


The original chips-and-cheese website has the actual pictures from the tests if you wanna see the difference.

I applaud the efforts to compare the hardware encoders; it's a point that very often gets ignored in regular reviews (as was quite apparent when the RX 6500 XT came out and many reviewers only glanced at the lack of hardware encoding and decoding capabilities), but comparing a low-to-mid-range GPU of the previous generation against the latest and best of the competitor makes the whole attempt kind of specious.

Smaller, niche websites don't have as much hardware to do a "proper apples to apples test".

I'm glad that they have the time to do a test at all. But yeah, AMD 6900 xt vs NVidia 1080 or 2060 is a bit of a funny comparison.
 
Joined
Apr 8, 2008
Messages
328 (0.06/day)
This is why I don't consider AMD's GPUs, and it's not even a new problem. It's been a problem for several generations, and you'd think that after the 20 series, when Nvidia touted the NVENC improvements (along with DLSS and RT) as reasons to buy their cards, AMD would be, I don't know, trying to be a card streamers actually use.

Instead, all AMD is focused on is DLSS and making a new spin on TAA that they can brand as theirs. Hey, AMD. Wake up and make a copy of nvenc with your own branding. Call it freenc. Whatever. Help me not be hostage to nvidia.

I don't want AMD to copy NVENC; I want them to compete against both NV and Apple Silicon and surpass both. Both companies do a much better job in this regard, not just for consumer applications but also for professional applications (video editing). Just see what Apple Silicon can do when the software is optimized for it.

The other main thing AMD needs to focus much more on is the content creation side of things: not just video editing/processing but also accelerated 3D rendering. NV is doing so well here that it's a no-brainer for any content creator to use NV solutions. AMD tried to dive into this market, but it's still far from what NV has achieved. And now Apple is entering the game very strongly with their Silicon.

Just see how Blender and V-Ray render using NV OptiX. It can accelerate using all the resources possible, especially the RT cores to accelerate RT rendering, and they have been doing it almost since RTX launched. AMD just added support for RDNA2 only in Blender 3.0, still without RT acceleration, and there is nothing in other major 3D renderers like V-Ray.
 
Joined
Aug 23, 2013
Messages
454 (0.12/day)
This is something I've been aware of for a long time. I don't know why AMD doesn't care much about encoding/decoding on their GPUs. I expected them to make a bigger push for streamers to adopt their GPUs this gen, but it looks like they don't care about that market. To me, it looks like they don't put in as many resources as they do for their CPUs. It's somewhat understandable when you compare how much margin a CPU has compared to a GPU and how much software a GPU needs. It would also have boosted their CPU sales if their GPUs were doing better.
 
Joined
Jan 27, 2015
Messages
1,065 (0.32/day)
This is something I've been aware of for a long time. I don't know why AMD doesn't care much about encoding/decoding on their GPUs. I expected them to make a bigger push for streamers to adopt their GPUs this gen, but it looks like they don't care about that market. To me, it looks like they don't put in as many resources as they do for their CPUs. It's somewhat understandable when you compare how much margin a CPU has compared to a GPU and how much software a GPU needs. It would also have boosted their CPU sales if their GPUs were doing better.
They have come a long way in the last decade:
it was pretty sad when Sandy Bridge's iGPU (first-gen QS) beat them.

However, for most uses, like just some gamerz on Twitch, anything is fine. Professionals will want software encoding as usual until a miracle happens!
 
Joined
Nov 8, 2020
Messages
474 (0.37/day)
System Name Dusty
Processor 5900x
Motherboard MSI B550 Tomahawk
Cooling Noctua NH-D15
Memory Corsair Vengence LPX 32GB
Video Card(s) MSI RTX 3070 Gaming X
Storage yes
Case Fractal Design Define R6
Power Supply EVGA SuperNOVA 750w
VR HMD Oculus CV1
The scale is a model based on human testing, where viewers are put in front of a TV at specific distances and watch a video. They then rate the quality as "Bad", "Poor", "Fair", "Good", or "Excellent",
where Excellent is mapped to 100 and Bad is mapped to 0.
At around ~85 there is a perceivable loss in quality, but it's generally not annoying.
At around ~75 it starts to become annoying.
And 60 or below is pretty terrible, but could still be acceptable in terms of filesize-to-quality ratio.
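
As a rough sketch, the bands above can be written as a small lookup. The cut-offs are only the approximate values from this post, not official VMAF thresholds, and the function name and labels are made up for illustration.

```python
def describe_vmaf(score):
    """Map a VMAF score (0-100) to the rough quality bands from this post."""
    if not 0 <= score <= 100:
        raise ValueError("VMAF scores range from 0 to 100")
    if score <= 60:
        return "pretty terrible"
    if score <= 75:
        return "annoying loss"
    if score <= 85:
        return "perceivable loss, generally not annoying"
    return "little perceivable loss"


for example in (43.5, 72.0, 84.0, 94.0):
    print(example, "->", describe_vmaf(example))
```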
Generally one would aim somewhere between 92-95 as a good target to produce videos that are well compressed and generally smaller than the original. Though, no one would use x264 for that today; it's old and useless. VP9 or x265 at a minimum, both of which are less sensitive to thread count.
You can still quite easily notice differences even at higher scores, like >98.x. Whether that noticeable difference matters, however, is rather subjective.
They also do not report all the scores and metrics VMAF produces. The "score" is just the mean value. Min, max, mean, and harmonic mean are the values you generally get out. I find the others more important than the mean in most cases, since the differences between them can be rather large. See this example of the output you get:

XML:
    <metric name="motion2" min="0.000000" max="17.513954" mean="9.934909" harmonic_mean="5.748033" />
    <metric name="motion" min="0.000000" max="20.251728" mean="10.346691" harmonic_mean="6.487637" />
    <metric name="adm2" min="0.440108" max="0.938360" mean="0.731659" harmonic_mean="0.711730" />
    <metric name="adm_scale0" min="0.718875" max="0.910788" mean="0.815849" harmonic_mean="0.814638" />
    <metric name="adm_scale1" min="0.438301" max="0.895599" mean="0.670334" harmonic_mean="0.654797" />
    <metric name="adm_scale2" min="0.327092" max="0.939129" mean="0.684322" harmonic_mean="0.650081" />
    <metric name="adm_scale3" min="0.387338" max="0.970459" mean="0.775667" harmonic_mean="0.750280" />
    <metric name="vif_scale0" min="0.014387" max="0.408880" mean="0.180496" harmonic_mean="0.164509" />
    <metric name="vif_scale1" min="0.041550" max="0.835502" mean="0.434883" harmonic_mean="0.372053" />
    <metric name="vif_scale2" min="0.067117" max="0.913851" mean="0.534008" harmonic_mean="0.461327" />
    <metric name="vif_scale3" min="0.113093" max="0.950210" mean="0.615081" harmonic_mean="0.546858" />
    <metric name="vmaf" min="5.100550" max="87.312631" mean="43.527449" harmonic_mean="26.256242" />
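
A minimal sketch of reading those pooled values with Python's standard library, assuming the log contains `<metric .../>` lines like the ones above (only two of them are copied here). The second half uses made-up per-frame scores to show why the harmonic mean drops so much faster than the mean when a few frames score badly.

```python
import statistics
import xml.etree.ElementTree as ET

# Two lines copied from the output above.
SAMPLE = """
<metric name="adm2" min="0.440108" max="0.938360" mean="0.731659" harmonic_mean="0.711730" />
<metric name="vmaf" min="5.100550" max="87.312631" mean="43.527449" harmonic_mean="26.256242" />
"""


def parse_pooled_metrics(fragment):
    """Return {metric name: {statistic: value}} from <metric .../> lines."""
    # The lines have no common parent, so wrap them in a dummy root element.
    root = ET.fromstring(f"<metrics>{fragment}</metrics>")
    stats = {}
    for node in root.iter("metric"):
        stats[node.get("name")] = {
            key: float(value)
            for key, value in node.attrib.items()
            if key != "name"
        }
    return stats


vmaf = parse_pooled_metrics(SAMPLE)["vmaf"]
gap = vmaf["mean"] - vmaf["harmonic_mean"]
print(f"mean={vmaf['mean']:.2f} harmonic={vmaf['harmonic_mean']:.2f} "
      f"min={vmaf['min']:.2f} gap={gap:.2f}")

# Hypothetical per-frame scores: one bad frame barely moves the mean
# but pulls the harmonic mean down sharply.
per_frame = [95.0, 95.0, 95.0, 20.0]
print(statistics.mean(per_frame), statistics.harmonic_mean(per_frame))
```

A large gap between mean and harmonic mean, or a very low min, is a hint that some stretch of the video looks much worse than the headline score suggests.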
And of course, one doesn't really use bitrate arguments these days unless you have a very specific purpose; constant rate factor (CRF) is where it's at for producing far superior results.
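
To make the difference concrete, here is a small sketch that builds ffmpeg arguments for libx264 in either mode: `-crf` targets a quality level and lets the size float, while `-b:v` targets an average bitrate and lets the quality float. The file names, the CRF value of 20, and the 6M bitrate are arbitrary example values, not settings from the article.

```python
def x264_args(source, output, crf=None, bitrate=None, preset="medium"):
    """Build ffmpeg arguments for libx264 in CRF or target-bitrate mode."""
    if (crf is None) == (bitrate is None):
        raise ValueError("pass exactly one of crf or bitrate")
    if crf is not None:
        rate_control = ["-crf", str(crf)]   # quality-targeted
    else:
        rate_control = ["-b:v", bitrate]    # average-bitrate-targeted
    return ["ffmpeg", "-i", source, "-c:v", "libx264",
            "-preset", preset, *rate_control, output]


# Example values only: an offline encode by quality, a stream by bitrate.
quality_targeted = x264_args("in.mkv", "out_crf.mkv", crf=20, preset="slow")
bitrate_targeted = x264_args("in.mkv", "out_abr.mkv", bitrate="6M",
                             preset="veryfast")
print(" ".join(quality_targeted))
print(" ".join(bitrate_targeted))
```

Streaming is the "very specific purpose" case: the platform caps the bitrate, so a bitrate target is still what gets used there.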

I did test NVENC encoding vs CPU encoding in PowerDirector at the start, but I had such a small sample size that I decided to wait until later. Though if anything, GPU encodes consistently averaged higher scores in the 5 test videos I used at the time; further testing is needed.
The work on said scripts is halted, though, while I learn Python so I can better produce a cross-platform tool for measurements, crunching, and all the other fancy stuff I wanted to do with it.

Other notes I found so far:
Most video files that modern games ship with these days are way too large for their own good.
For those who like anime: not a single fansub group is able to produce a reasonable quality-to-filesize ratio (their videos are ~80% larger than they should be). The exceptions are some of the few x265 re-encoders out there.
VP9 is super nice to use compared to x265, though x265 is nice and speedy, and x264 is just laughable today in how horrible it is.
If you use VP9, then enabling tiling and throwing extra threads at it for higher encoding speeds does not impact quality at all.
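
For the VP9 tiling note, a sketch of what that looks like with ffmpeg's libvpx-vp9 encoder. The CRF, tile, and thread values are example numbers, not the settings used for the results in this post. Note that `-tile-columns` is given as a log2 value, so 2 means four tile columns.

```python
def vp9_args(source, output, crf=31, tile_columns=2, threads=8):
    """Build ffmpeg arguments for a constant-quality VP9 encode with tiling."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libvpx-vp9",
        "-crf", str(crf), "-b:v", "0",       # constant-quality mode
        "-tile-columns", str(tile_columns),  # log2 of the tile column count
        "-threads", str(threads),
        output,
    ]


args = vp9_args("in.mkv", "out.webm")
print(" ".join(args))
print("tile columns:", 2 ** 2)
```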

Here are some of the later results I had, and that is indeed a file ending up at 15% of the original size with a score of 94 for both mean and harmonic mean. Though as one can see, the minimum values are starting to drop, and that can be noticeable.
So many things today are just made with static bitrate values with no concern for filesize, and because of that a lot of videos are rather bloated. That was why I started to play around with this a few months back: I was curious how bloated (or non-bloated) things were.

Original size: 1447982328 bytes

size in bytes   min         max   mean        harmonic_mean   ratio                  percent of original size
322606578       81.591395   100   95.671001   95.613573       2.96556262408264E-07   22.28%
306821699       80.697832   100   95.578825   95.518704       3.11512599374531E-07   21.19%
288320663       80.11331    100   95.464203   95.40166        3.31104271219021E-07   19.91%
269377784       79.701384   100   95.353233   95.287672       3.53975860904699E-07   18.60%
258002177       78.624644   100   95.238739   95.170078       3.69139284433247E-07   17.82%
244100458       78.393402   100   95.130818   95.059777       3.89719948825332E-07   16.86%
222567224       77.236843   100   94.895375   94.817218       4.2636724893509E-07    15.37%
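
The two derived columns can be reproduced from the raw numbers. The percentage is the encoded size divided by the original size. The ratio column is not explained in the post, but the numbers match mean VMAF divided by size in bytes, so that is what this sketch assumes.

```python
ORIGINAL_SIZE = 1447982328  # bytes, from the results above

# (size in bytes, mean VMAF) for the largest and smallest encode listed.
encodes = [(322606578, 95.671001), (222567224, 94.895375)]


def size_stats(size, mean_vmaf, original=ORIGINAL_SIZE):
    """Return (percent of original size, mean VMAF per byte)."""
    percent = 100 * size / original
    ratio = mean_vmaf / size  # assumed definition of the "ratio" column
    return percent, ratio


for size, mean_vmaf in encodes:
    percent, ratio = size_stats(size, mean_vmaf)
    print(f"{size:>11}  {percent:5.2f}%  {ratio:.6e}")
```

A higher ratio means more quality per byte, which is why it keeps climbing as the file shrinks while the mean only drops by about one point.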
 

Sipu

New Member
Joined
Apr 1, 2022
Messages
4 (0.01/day)
So is an integrated GPU worse? Or better? Is a smart TV worse than... like a Shield or some other Android box? This raises more questions than it answers.
 
Joined
Mar 28, 2019
Messages
81 (0.04/day)
Wait - I’ve always thought that software encoding via CPU was the best solution (but not the fastest) at least with my 1050Ti card.
The review here seems to suggest otherwise?
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
These ASICs are common between different GPUs of the same generation aren't they? So regardless if you're using a 2060 or a 2080 Ti, wouldn't the encoding/decoding block still perform the same since it's the same circuit in both GPUs? I would expect something similar with RDNA2 chips with the same VCN block.
 
Joined
Apr 24, 2020
Messages
2,563 (1.75/day)
Wait - I’ve always thought that software encoding via CPU was the best solution (but not the fastest) at least with my 1050Ti card.
The review here seems to suggest otherwise?

CPU still has highest VMAF in the tests as described in the article.

But to get there, you need to run the CPU encoder under 'very slow' settings. For streaming, that's too slow to be useful. So they used 'very fast' settings (worse quality) for the streaming test
 
Joined
Sep 18, 2005
Messages
305 (0.04/day)
CPU still has highest VMAF in the tests as described in the article.

But to get there, you need to run the CPU encoder under 'very slow' settings. For streaming, that's too slow to be useful. So they used 'very fast' settings (worse quality) for the streaming test
I run 1080p slow and even medium x264 for live streaming, and it's very useful.
 