Welcome to this thread regarding the evaluation of HEVC decoders (HW, Hybrid and SW)
The latest update, with new samples for evaluating pure fixed-function HW HEVC decoders, is here, testing Kabylake HD 610, Skylake HD 530, Nvidia Turing GTX 1660, Nvidia Pascal GTX 1050 Ti/1060, Nvidia Maxwell 3rd gen GTX 960 and AMD Polaris RX 470 - 01/03/2017
Added some info on VP9 acceleration support of browsers using YouTube on various hardware (including Polaris cards) here - 22/07/2017
The first official confirmation of Skylake's (HD 530) 8K (!) HEVC 8bit decoding capability in pure HW decoding mode is here, by Yups - 05/09/2015
Here is the comparison between Skylake HD 530 and Nvidia GTX 960 - 24/08/2015
Probably the world's first Skylake 4K HEVC 8bit (HD 530) decoding results are here, here and here, by Yups (looks like the world's fastest HW decoder for 8bit HEVC) - 22/08/2015
For updated Optimus results, take a look here - 27/05/2015
For those interested in hybrid-only results on Intel's iGPU (DXVA, OpenCL), look at the results of the new PowerDVD 15 x86 (DXVA, OpenCL) and the latest LAV 0.65/MPC 1.4.4.286 DXVA x64/x86 decoders here - 22/04/2015
New updated CPU results for the latest Lentoid v2.1.0.0, LAV v0.67.3 and MS H.265 decoder (build 10586) are here - 29/11/2015
For those interested in CPU-only results, look at the results of the new Lentoid x64/x86 v2.0.3.2 and PowerDVD 15 on various CPU architectures and systems, including Win 10 Build 10049 using the latest LAV 0.65/MPC 1.4.4.286 decoders. Also added a 10bit HEVC video here - 22/04/2015
Added new results for the first public release of UHDcode, the HEVC decoder of MultiCoreWare, here - 01/04/2015
For HW decoder results of the Nvidia GTX 960, see the post from Nevcairiel here - 23/01/2015 (these are the world's first benchmark results of the fixed-function HEVC decoder inside the GTX 960)
For slightly older tests (14/01/2015), including the PowerDVD v14 decoder, Microsoft's MFT Windows 10 decoder and a 10bit clip, on various CPU architectures (from Core 2 Duo up to Core i7 Haswell), take a look here
The tool I always use is DXVA Checker, which you can download from here
The first post (23/09/2014) starts here:
I used the latest DXVA Checker x86/x64 version, v3.1.2, with DXVA processing and a scaling resolution of 1280x720 for all clips.
1080p clips
1.ProRes-1080p@30fps-2Mbps
https://www.sendspace.com/file/mo8exg
LAV x64 i7-4790 158/230/251
LAV x86 i7-4790 121/151/188
LAV x64 i7-4790 (DXVA) 83/134/157
LAV x86 i7-4790 (DXVA) 84/133/158
Strongene x86 i7-4790 (OpenCL) 83/112/146
Strongene x86 i7-4790 97/109/111
LAV x64 i5-4200M 76/98/114
LAV x64 i5-4200M (DXVA) 60/84/96
LAV x64 Core 2 Quad 45/52/54
2.Elephants Dream-1080p@24fps-1.7Mbps
http://www.libde265.org/hevc-bitstre...1080-cfg02.mkv
LAV x64 i7-4790 93/195/242
LAV x64 i7-4790 (DXVA) 78/138/204
LAV x86 i7-4790 (DXVA) 66/135/200
LAV x86 i7-4790 62/133/259
Strongene x86 i7-4790 66/107/116
LAV x64 i5-4200M 40/97/130
LAV x64 i5-4200M (DXVA) 44/84/112
LAV x64 Core 2 Quad 35/52/55
Strongene x86 i7-4790 (OpenCL) Crashed
3.Big Buck Bunny-1080p@60fps-2Mbps
http://www.libde265.org/hevc-bitstre...1080-cfg06.mkv
LAV x64 i7-4790 144/232/267
LAV x64 i7-4790 (DXVA) 123/165/230
LAV x86 i7-4790 (DXVA) 113/161/201
LAV x86 i7-4790 67/146/259
LAV x64 i5-4200M 66/109/131
LAV x64 i5-4200M (DXVA) 68/97/114
LAV x64 Core 2 Quad 45/53/55
Strongene x86 i7-4790 (OpenCL) Crashed
Strongene x86 i7-4790 Crashed just before the end of the file
UHD - 3840x2160p clips
1.Beauty-2160p@30fps-12.3Mbps
http://ultravideo.cs.tut.fi/video/Be...t_HEVC_MP4.mp4
LAV x64 i7-4790 51/63/65
Strongene x86 i7-4790 (OpenCL) 26/47/47
LAV x86 i7-4790 25/40/43
Strongene x86 i7-4790 20/34/35
LAV x64 i7-4790 (DXVA) 29/33/35
LAV x86 i7-4790 (DXVA) 30/32/33
LAV x64 i5-4200M 15/25/27
LAV x64 i5-4200M (DXVA) 19/21/22
LAV x64 Core 2 Quad 6/13/13
2.Fitness-2160p@30fps-8Mbps
http://cloud.ultrahdtv.net/fitness-trailer-8000.mkv
LAV x64 i7-4790 57/71/82
LAV x64 i7-4790 (DXVA) 41/55/77
LAV x86 i7-4790 (DXVA) 40/52/76
LAV x86 i7-4790 35/48/71
Strongene x86 i7-4790 28/40/50
LAV x64 i5-4200M (DXVA) 25/30/39
LAV x64 i5-4200M 22/29/38
LAV x64 Core 2 Quad 9/13/13
Strongene x86 i7-4790 (OpenCL) Crashed
3.Ducks-2160p@50fps-4Mbps
https://www.sendspace.com/file/cyiv49
LAV x64 i7-4790 62/72/73
Strongene x86 i7-4790 (OpenCL) 29/46/50
LAV x86 i7-4790 (DXVA) 43/45/47
LAV x64 i7-4790 (DXVA) 42/45/47
Strongene x86 i7-4790 27/41/42
LAV x86 i7-4790 29/40/44
LAV x64 i5-4200M 25/31/33
LAV x64 i5-4200M (DXVA) 27/29/30
LAV x64 Core 2 Quad 10/13/13
Comments:
I used the latest nightly of LAV Filters 0.62.46 (24-09-2014) with LAV Video threads set to Auto, and the latest Strongene OpenCL and CPU HEVC decoders.
The Core i7 system tested was:
Win 8.1 Pro x64 - Core i7-4790 - HD 4600 - Drivers v.3907
The Core i5 system tested was:
Win 8.1 x64 - Core i5-4200M@2.5GHz (battery mode) - HD 4600 - Drivers v.3907 - Nvidia 740M - Drivers 344.11 - Optimus technology
The Core 2 Quad system tested was:
Win 8.1 Pro x64 - Core 2 Quad Q9550@2.26GHz (266MHz x 8.5) - Nvidia GT610 - Drivers 344.11
For laptop and Nvidia hybrid decoder results, see my post here
For tests using more threads than Auto, a null renderer, overclocked CPUs and a different OS (Win 7), check this post
CPU decoders for Core i7
With the LAV CPU decoders, GPU usage stayed as low as possible, ~25%-35% at a 600MHz clock, but with Strongene's CPU decoder it was higher, ~80%@600MHz
1080p
For the LAV x86/x64 CPU decoders, CPU utilization was 55% - 60% using all 8 threads at a clock of 3.8GHz; for the Strongene CPU decoder it was only 16% (!!!) at a clock of ~4.0GHz, with only a few threads enabled.
2160p
For the LAV x64 CPU decoder, CPU utilization was 77% - 91% using all 8 threads at a clock of 3.8GHz; for the LAV x86 CPU decoder it was 92% - 94% using all 8 threads at a clock of 3.8GHz; for the Strongene CPU decoder it was only 25% - 29% (!!!) at a clock of ~4.0GHz, with only a few threads enabled.
CPU decoders for Core i5/ Core 2 Quad
I used only the LAV x64 decoder, and the results are completely weird.
Core i5-4200M@2.5GHz is a dual-core processor with four threads (2C/4T) and achieved a CPU utilization of 76% - 86% for 1080p and 76% - 89% for 2160p
Core 2 Quad Q9550@2.26GHz is a quad-core processor with four threads (4C/4T) and achieved a CPU utilization of 50% - 58% for 1080p and 55% - 60% for 2160p
For all clips the performance of the dual-core Core i5@2.5GHz is almost double that of the quad-core Core 2 Quad@2.26GHz, with more than 50% CPU utilization (!) for the dual-core (2C/4T) chip (!!)
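
To make the "almost double" comparison concrete, here is a minimal sketch that divides the i5-4200M figures by the Core 2 Quad figures for the LAV x64 (CPU) entries. The numbers are copied from the result lists above; the dictionary layout and the `speedup` helper are only illustrative names, and the three figures are kept in the order they were posted without assuming what each one stands for.

```python
# fps figures copied from the LAV x64 (CPU, non-DXVA) lines of the result lists.
# Each tuple keeps the three figures in the order they were posted.
LAV_X64_CPU = {
    "ProRes 1080p": {"i5-4200M": (76, 98, 114), "Core 2 Quad": (45, 52, 54)},
    "Elephants Dream 1080p": {"i5-4200M": (40, 97, 130), "Core 2 Quad": (35, 52, 55)},
    "Big Buck Bunny 1080p": {"i5-4200M": (66, 109, 131), "Core 2 Quad": (45, 53, 55)},
    "Beauty 2160p": {"i5-4200M": (15, 25, 27), "Core 2 Quad": (6, 13, 13)},
    "Fitness 2160p": {"i5-4200M": (22, 29, 38), "Core 2 Quad": (9, 13, 13)},
    "Ducks 2160p": {"i5-4200M": (25, 31, 33), "Core 2 Quad": (10, 13, 13)},
}


def speedup(clip):
    """Return the i5-4200M / Core 2 Quad ratio for each of the three figures."""
    i5 = LAV_X64_CPU[clip]["i5-4200M"]
    c2q = LAV_X64_CPU[clip]["Core 2 Quad"]
    return tuple(round(a / b, 2) for a, b in zip(i5, c2q))


if __name__ == "__main__":
    for clip in LAV_X64_CPU:
        print(clip, speedup(clip))
```

On the middle figure, the ratio stays above 1.8 for every clip in the lists.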
GPU decoders for Core i7
1080p
The DXVA decoders were using the GPU at ~100% with a max clock of 1200MHz, and the CPU at ~14% with a clock of 2.6GHz - 4.0GHz
Strongene's OpenCL decoder used the GPU much like Strongene's CPU decoder, only at a slightly higher clock (600MHz - 750MHz), with a CPU usage of ~11% at 4.0GHz
2160p
The DXVA decoders were using the GPU at ~90% with a max clock of 1200MHz, and the CPU at ~14% with a clock of 2.6GHz - 4.0GHz.
The memory usage went to the limit of ~960MB-990MB, the maximum that the iGPU can use.
Strongene's OpenCL decoder used the GPU a lot more than at 1080p, at 85%@850MHz - 900MHz, with a CPU usage of 36% - 50% at 3.8GHz-4.0GHz, which is double that of Strongene's CPU decoder!
GPU decoders for Core-i5
See here
----------------------------------------------------------------------------------------------------------------------------------
LAV DXVA x86 and x64 have almost the same performance, and in 2160p they are slower than the OpenCL decoder, mainly because of the OpenCL decoder's CPU usage rather than its GPU usage.
It's interesting that LAV DXVA x86/x64 is faster than LAV CPU x86 in 1080p clips and low-bitrate 2160p clips.
But in the 2160p clip at just 12Mbps, LAV x86 CPU is faster than DXVA.
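
The DXVA versus LAV x86 CPU comparison can be checked directly against the posted numbers. The sketch below assumes the middle of the three figures is the one to compare (the lists appear to be sorted by it, but that is an assumption); the dictionary and the `clips_where_cpu_x86_wins` helper are illustrative names, not part of any tool.

```python
# Middle fps figure on the i7-4790, copied from the result lists.
I7_MIDDLE_FPS = {
    "ProRes 1080p": {"LAV x86": 151, "LAV x86 (DXVA)": 133, "LAV x64 (DXVA)": 134},
    "Elephants Dream 1080p": {"LAV x86": 133, "LAV x86 (DXVA)": 135, "LAV x64 (DXVA)": 138},
    "Big Buck Bunny 1080p": {"LAV x86": 146, "LAV x86 (DXVA)": 161, "LAV x64 (DXVA)": 165},
    "Beauty 2160p (12.3Mbps)": {"LAV x86": 40, "LAV x86 (DXVA)": 32, "LAV x64 (DXVA)": 33},
    "Fitness 2160p (8Mbps)": {"LAV x86": 48, "LAV x86 (DXVA)": 52, "LAV x64 (DXVA)": 55},
    "Ducks 2160p (4Mbps)": {"LAV x86": 40, "LAV x86 (DXVA)": 45, "LAV x64 (DXVA)": 45},
}


def clips_where_cpu_x86_wins():
    """List the clips where LAV x86 (CPU) beats both DXVA builds."""
    winners = []
    for clip, fps in I7_MIDDLE_FPS.items():
        best_dxva = max(fps["LAV x86 (DXVA)"], fps["LAV x64 (DXVA)"])
        if fps["LAV x86"] > best_dxva:
            winners.append(clip)
    return winners


if __name__ == "__main__":
    print(clips_where_cpu_x86_wins())
```

With the figures as posted, this returns the 12.3Mbps Beauty clip and also the ProRes 1080p clip, so on the middle figure the ProRes clip is an exception among the 1080p samples.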
During benchmarking the LAV CPU decoders were drawing more power (watts) than the DXVA or OpenCL decoders, but I think the opposite occurs during playback (I haven't tested this yet).
I used Intel's latest GPA tool to monitor any QuickSync (fixed-function HW) activity, but it showed zero during DXVA decoding.
So for the HEVC DXVA2 decoder everything is decoded in the EUs, with a little CPU.
----------------------------------------------------------------------------------------------------------------------------------
Conclusion
LAV DXVA uses the maximum iGPU memory of 1GB for 4K decoding, and even at a low bitrate of 12Mbps it is slower than LAV x86 CPU.
It's more useful for lower resolutions.
Strongene's OpenCL decoder is definitely a lot more useful for 4K than for 1080p, but it has incompatibilities, and in order to be fast at 4K it leans heavily on the CPU, not so much on the GPU.
Strongene's CPU decoder is almost always the slowest of all.
LAV x64 CPU is always faster than anything else at all bitrates and resolutions.
CPU utilization of all CPU decoders rises with increased resolution and low bitrate.
The best example is the Ducks 2160p@4Mbps clip, with LAV x64 CPU utilization of 91% and LAV x86 at 94%.
When the bitrate increases a little, the CPU utilization and decoding performance drop further.
For all decoders I used a favorable scenario of a 1280x720 display resolution.
When the native resolution is used for display at 1080p and 2160p, the results are lower by a good percentage.
For 4K Blu-ray with a bitrate of 100Mbps and 10-bit depth, expect CPU decoders and hybrid decoders to be useless (?)* even with Haswell-E or Xeon processors.
We are definitely going to need pure fixed-function HW decoders for 4K Blu-ray.
On the other hand, 4K Blu-ray will appear in the winter holidays of 2015, so until then CPU and hybrid decoders are just fine for the low-bitrate, low-fps clips for which HEVC is the best codec to use.
I have already encoded 4K H.264 and 4K H.265 at up to 600Mbps (!), and I can say for sure that the Haswell QuickSync decoder reaches about 190fps on a 4K H.264 100Mbps clip at 1280x720 display resolution.
Intel and Nvidia decided to offer a hybrid solution now, which is a useful choice for low-fps, low-bitrate clips even at 4K resolution; by the time 4K Blu-ray arrives, fixed-function decoders will be ready.
Hopefully AMD will follow.
* According to these results by cyberbeing, an overclocked CPU (even without HT) and a really fast Nvidia card can handle even harder-to-decode 4K HEVC clips at greater speed than in my results.
But I think 4K Blu-ray will still be out of their reach.
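
As a rough way to read the UHD lists against the "low fps" remark in the conclusion, the sketch below checks which i7-4790 decoders reach each clip's own frame rate. It is only an illustration: the figures are copied from the result lists (measured at the 1280x720 scaling resolution), the `realtime_capable` helper is a hypothetical name, and by default it uses the first (lowest) of the three posted figures as the conservative one, which is an assumption.

```python
# (clip frame rate, {decoder: three posted fps figures}) for the UHD clips, i7-4790.
# Decoders that crashed on a clip are left out.
UHD_I7 = {
    "Beauty 2160p": (30, {
        "LAV x64": (51, 63, 65),
        "Strongene x86 (OpenCL)": (26, 47, 47),
        "LAV x86": (25, 40, 43),
        "Strongene x86": (20, 34, 35),
        "LAV x64 (DXVA)": (29, 33, 35),
        "LAV x86 (DXVA)": (30, 32, 33),
    }),
    "Fitness 2160p": (30, {
        "LAV x64": (57, 71, 82),
        "LAV x64 (DXVA)": (41, 55, 77),
        "LAV x86 (DXVA)": (40, 52, 76),
        "LAV x86": (35, 48, 71),
        "Strongene x86": (28, 40, 50),
    }),
    "Ducks 2160p": (50, {
        "LAV x64": (62, 72, 73),
        "Strongene x86 (OpenCL)": (29, 46, 50),
        "LAV x86 (DXVA)": (43, 45, 47),
        "LAV x64 (DXVA)": (42, 45, 47),
        "Strongene x86": (27, 41, 42),
        "LAV x86": (29, 40, 44),
    }),
}


def realtime_capable(clip, figure=0):
    """Return decoders whose chosen fps figure reaches the clip's frame rate."""
    clip_fps, decoders = UHD_I7[clip]
    return [name for name, fps in decoders.items() if fps[figure] >= clip_fps]


if __name__ == "__main__":
    for clip in UHD_I7:
        print(clip, realtime_capable(clip))
```

For the 50fps Ducks clip, only LAV x64 reaches the clip's frame rate, whether the first or the middle figure is used.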