Old 11th November 2018, 19:18   #6481  |  Link
RainyDog
Registered User
 
Join Date: May 2009
Posts: 184
For 2-pass encodes, I normally use a custom faster 1st pass command line which is the same as my slow 2nd pass just with RDO level and subme turned down to level 2, --me dia, --early-skip and --fast-intra.

But I've been testing using identical command lines for both passes and using --multi-pass-opt-analysis instead which speeds up the 2nd pass considerably to the point where a complete 2-pass encode is almost the same speed as my usual approach.

Which should technically yield the higher quality final result? Is there any potential harm to using --multi-pass-opt-analysis?
Old 13th November 2018, 17:52   #6482  |  Link
Majorlag
Registered User
 
Join Date: Jul 2016
Posts: 19
Quote:
Originally Posted by RainyDog View Post
Which should technically yield the higher quality final result? Is there any potential harm to using --multi-pass-opt-analysis?
Don't forget to add --multi-pass-opt-rps and --multi-pass-opt-distortion to your command line as well.
As I understand it, if you're NOT turning down the RDO level (--rd), --me and the other settings, the result should be better, since the first pass then spends more time on those decisions. The --multi-pass-opt-* options are great at reusing the values obtained in the first pass to speed up the second pass.
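
The two approaches discussed above can be written out side by side. This is only a sketch that assembles the argument lists: the preset, bitrate, stats file name and file paths are placeholders I made up, and the helper names are hypothetical. The x265 options themselves are the ones named in the posts, plus --pass, --stats, --bitrate, --input and --output.

```python
import os
import shlex

# Settings shared by both passes. All values here are placeholders.
COMMON = ["--preset", "slow", "--bitrate", "3000", "--stats", "x265_2pass.log"]

# Analysis-sharing options named in the posts above.
MULTI_PASS_OPT = [
    "--multi-pass-opt-analysis",
    "--multi-pass-opt-distortion",
    "--multi-pass-opt-rps",
]

# Reduced-effort overrides for a faster first pass, as described in the first post.
FAST_FIRST_PASS = ["--rd", "2", "--subme", "2", "--me", "dia",
                   "--early-skip", "--fast-intra"]


def build_pass(pass_number, source, output, extra):
    """Return the argument list for one pass (nothing is executed here)."""
    return ["x265", "--input", source, "--pass", str(pass_number),
            *COMMON, *extra, "--output", output]


def identical_passes(source, output):
    """Same settings in both passes, with the multi-pass options enabled."""
    return [build_pass(1, source, os.devnull, MULTI_PASS_OPT),
            build_pass(2, source, output, MULTI_PASS_OPT)]


def fast_first_pass(source, output):
    """Cheaper first pass, full-effort second pass."""
    return [build_pass(1, source, os.devnull, FAST_FIRST_PASS),
            build_pass(2, source, output, [])]


if __name__ == "__main__":
    for cmd in identical_passes("input.y4m", "output.hevc"):
        print(shlex.join(cmd))
```

Timing both variants on the same clip, and comparing the final files at the same bitrate, would answer the speed and quality question for a given source.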
Old 15th November 2018, 04:09   #6483  |  Link
atrin
Registered User
 
Join Date: Oct 2018
Posts: 1
How do the lambda and lambda2 tables work in HEVC?

Hi,

I have some sample tables for lambda2, and I generated a lambda table based on them. My sample is at https://mailman.videolan.org/piperma...ch/010936.html
The second table does not seem to match its formula (lambda2 = 0.038 * pow(0.234, QP)).

Is there any document or other information that explains the lambda and lambda2 tables and how they relate?

Many thanks
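
One way to check whether a table follows a formula is to evaluate the formula at every QP and list the entries that disagree. The sketch below does that in Python. Note that the formula exactly as written in the post shrinks toward zero as QP grows, whereas a Lagrangian multiplier is expected to grow with QP. The exponential variant and the lambda = sqrt(lambda2) relation in the code are my assumptions for comparison, not something the post confirms, and all function names are hypothetical.

```python
import math


def lambda2_as_posted(qp):
    """Formula exactly as written in the post: 0.038 * pow(0.234, QP)."""
    return 0.038 * math.pow(0.234, qp)


def lambda2_exponential(qp):
    """ASSUMPTION: exponential reading of the same constants, 0.038 * exp(0.234 * QP)."""
    return 0.038 * math.exp(0.234 * qp)


def lambda_from_lambda2(lambda2):
    """ASSUMPTION: lambda taken as the square root of lambda2."""
    return math.sqrt(lambda2)


def mismatches(table, formula, rel_tol=0.01):
    """Return the QPs whose table entry differs from formula(qp) by more than rel_tol."""
    return [qp for qp, value in enumerate(table)
            if not math.isclose(value, formula(qp), rel_tol=rel_tol)]


if __name__ == "__main__":
    for qp in (0, 10, 20, 30):
        l2 = lambda2_exponential(qp)
        print(qp, lambda2_as_posted(qp), l2, lambda_from_lambda2(l2))
```

Calling `mismatches(table, lambda2_exponential)` on the sample table would show at a glance whether either reading of the formula fits it.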
Old 18th November 2018, 13:17   #6484  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 697
Problems with the VMAF metric.

After many hours, I managed to set up the VMAF parameters. The new addition even computes something.

Read input model (libsvm) at ./vmaf_rb_v0.6.2/vmaf_rb_v0.6.2.pkl.model ...

Initialize storage arrays...
Extract atom features...
frame: 0, adm: 0.986, adm_num: 792.386, adm_den: 803.249, adm_num_scale0: 102.293, adm_den_scale0: 105.547, adm_num_scale1: 148.311, adm_den_scale1: 151.745, adm_num_scale2: 230.474, adm_den_scale2: 232.816, adm_num_scale3: 311.307, adm_den_scale3: 313.141, motion: 0.000, motion2: 0.000, vif_num_scale0: 3201540.000, vif_den_scale0: 4265149.500, vif_num_scale1: 915769.438, vif_den_scale1: 973015.313, vif_num_scale2: 237178.438, vif_den_scale2: 244942.781, vif_num_scale3: 62793.887, vif_den_scale3: 64002.453, vif: 0.796,
Generate final features (including derived atom features)...
Normalize features, SVM regression, denormalize score, clip...
frame: 0, adm2: 0.986477, adm_scale0: 0.969174, adm_scale1: 0.977376, adm_scale2: 0.989942, adm_scale3: 0.994142, motion: 0.000000, vif_scale0: 0.750628, vif_scale1: 0.941167, vif_scale2: 0.968301, vif_scale3: 0.981117, vif: 0.796321, motion2: 0.000000,
Exec FPS: 2.742952
VMAF score (mean) = 100.000000
x265 [info]: frame I: 1, Avg QP:23.64 kb/s: 5696.20
x265 [info]: frame P: 2, Avg QP:29.09 kb/s: 874.10
x265 [info]: frame B: 7, Avg QP:35.33 kb/s: 204.03
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: Weighted B-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 33.3% 0.0% 0.0% 33.3% 33.3% 0.0% 0.0% 0.0% 0.0%


But how do I use it? So many ads on the forum.

static const x265_vmaf_commondata vcd_yuv420p[] = { { (char *)"yuv420p", (char *)"./vmaf_rb_v0.6.2/vmaf_rb_v0.6.2.pkl", (char *)"vmaf_yuv%04d.json", (char *)"json", 0, 1, 1, 0, 0, 0, 0, (char *)"mean", 0, 3, 1 } };

The first two items are obvious: they are the pixel format (chroma subsampling) and the VMAF model version. Because I chose version 0.6.2, the last item, 'enable_conf_interval', must be set.
The next two fields from the left control logging of the VMAF metrics to JSON or XML files. This is where the problems start. First, I don't know why the program doesn't save all parameters in one file. Second, I can't make the program write one JSON/XML file per processed frame, numbered in sequence (vmaf_yuv%04d).
{
  "version":"1.3.7",
  "params":{
    "model":"",
    "scaledWidth":1920,
    "scaledHeight":1080,
    "subsample":3
  },
  "metrics":[
    "adm2",
    "bagging",
    "ci95_high",
    "ci95_low",
    "motion2",
    "stddev",
    "vif_scale0",
    "vif_scale1",
    "vif_scale2",
    "vif_scale3",
    "vmaf"
  ],
  "frames":[
    {
      "frameNum":0,
      "metrics":{
        "adm2":0.98648,
        "bagging":99.62585,
        "ci95_high":100.0,
        "ci95_low":98.12069,
        "motion2":0.0,
        "stddev":0.7394500000000001,
        "vif_scale0":0.75063,
        "vif_scale1":0.94117,
        "vif_scale2":0.9683,
        "vif_scale3":0.98112,
        "vmaf":100.0
      }
    }
  ]
}
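
A log in this shape is easy to read back with a few lines of Python. The sketch below uses a trimmed copy of the JSON above (only some of the metrics) and pulls out per-frame values and their mean. The helper names are hypothetical. The last helper only shows what a printf-style pattern such as vmaf_yuv%04d.json would expand to; whether the encoder expands it per frame is not something I can confirm.

```python
import json

# Trimmed copy of the log shown above.
SAMPLE_LOG = """
{
  "version": "1.3.7",
  "frames": [
    {
      "frameNum": 0,
      "metrics": {
        "adm2": 0.98648,
        "ci95_high": 100.0,
        "ci95_low": 98.12069,
        "vmaf": 100.0
      }
    }
  ]
}
"""


def per_frame(report, metric):
    """Return the list of values of one metric, one entry per frame."""
    return [frame["metrics"][metric] for frame in report["frames"]]


def mean_metric(report, metric):
    """Arithmetic mean of one metric over all frames in the log."""
    values = per_frame(report, metric)
    return sum(values) / len(values)


def numbered_log_name(pattern, index):
    """Expand a printf-style pattern, e.g. 'vmaf_yuv%04d.json' with index 7."""
    return pattern % index


report = json.loads(SAMPLE_LOG)
print(mean_metric(report, "vmaf"))
print(numbered_log_name("vmaf_yuv%04d.json", 7))
```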

Next, what are the items 'disable_clip' and 'enable_transform' for?
The next four items (phone_model, psnr, ssim, ms_ssim) should be turned off.
Then comes the choice of data processing method.
Then the number of cores. In my case, zero.
There is a problem with the n_subsample parameter. For BPG, it would sometimes be three for YUV and sometimes one for the alpha channel. The instructions say five.
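
Because the initializer is purely positional, it helps to line the values up against the descriptions given above. The sketch below does only that. The labels are the names and descriptions used in this post, not the actual struct member names, and the sixth position is not named in the post, so it is marked as unknown.

```python
# Values copied from the vcd_yuv420p initializer quoted earlier in this post.
VALUES = [
    "yuv420p", "./vmaf_rb_v0.6.2/vmaf_rb_v0.6.2.pkl", "vmaf_yuv%04d.json", "json",
    0, 1, 1, 0, 0, 0, 0, "mean", 0, 3, 1,
]

# Labels follow the order of the walk-through in the post; they are
# descriptions, NOT verified member names of x265_vmaf_commondata.
LABELS = [
    "pixel format", "model path", "log file", "log format",
    "disable_clip", "(not named in the post)", "enable_transform",
    "phone_model", "psnr", "ssim", "ms_ssim",
    "data processing method", "number of cores", "n_subsample",
    "enable_conf_interval",
]


def describe():
    """Pair each positional value with its label, in declaration order."""
    return list(zip(LABELS, VALUES))


if __name__ == "__main__":
    for position, (label, value) in enumerate(describe(), start=1):
        print(f"{position:2d}  {label:<26} {value!r}")
```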

OK, I created x265 files with and without VMAF:
- The x265 VMAF build doesn't work with FFmpeg:
av_interleaved_write_frame(): Broken pipe
No more output streams to write to, finishing.
Error writing trailer of pipe:: Broken pipe

Code:
ffmpeg.exe -loglevel verbose -i Untitled.mp4 -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=bt709:in_range=limited:out_color_matrix=bt709:out_range=limited,format=yuv420p -strict -1 - | 
x265_081012bit_hdr_vmaf.exe --y4m --input-csp i420 --input-depth 8 --output-depth 8 --preset veryslow --crf 28 --fps 25.000 --keyint 50 --info --no-open-gop 
--colormatrix bt709 --colorprim bt709 --transfer bt709 --limit-ref 0 --range limited --recon 111.yuv --output 111.h265 -
- What is the 'recon' option for, as far as VMAF is concerned?
In the description:
-r/--recon <filename> Reconstructed raw image YUV or Y4M output file name

- Strangely, the x265 VMAF build itself works, but it isn't clear whether the encoder needs an output file or not.
Assuming it does: that file doesn't differ in content from the recon file. In addition, these files don't differ from x265 files made without VMAF. I have no idea what the recon file is or what it is for.

Last edited by Jamaika; 19th November 2018 at 12:04.
Old 19th November 2018, 09:54   #6485  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,752
The "recon" feature writes a YUV or Y4M raw video file that contains the reconstructed video which has been decoded right after encoding it, so you can compare the compression results with the original source (assuming it was a YUV or Y4M file too) without calling an additional decoder. It is available independently of VMAF functions linked into x265 – which may still be possible only under Linux, I believe; are you sure your Windows build contains any VMAF comparison code? The build script source\CMakeLists.txt contains the check clearly in an "if(UNIX)" block.
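
As an illustration of such a comparison, here is a minimal Python sketch that computes the luma PSNR per frame between a source file and a recon file. It assumes both are headerless 8-bit 4:2:0 .yuv files (not Y4M) with the same even dimensions; the function names and file paths are hypothetical, and pure Python like this is slow on 1080p material.

```python
import math


def frame_size_420p8(width, height):
    """Bytes per 8-bit 4:2:0 frame: one full luma plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def luma_psnr(src_frame, rec_frame, width, height):
    """PSNR in dB over the luma plane, which is the first width*height bytes of a frame."""
    n = width * height
    sse = sum((a - b) ** 2 for a, b in zip(src_frame[:n], rec_frame[:n]))
    if sse == 0:
        return math.inf  # identical luma planes
    return 10.0 * math.log10(255.0 ** 2 / (sse / n))


def compare_files(src_path, rec_path, width, height):
    """Return one luma PSNR value per frame present in both files."""
    size = frame_size_420p8(width, height)
    results = []
    with open(src_path, "rb") as src, open(rec_path, "rb") as rec:
        while True:
            a, b = src.read(size), rec.read(size)
            if len(a) < size or len(b) < size:
                break
            results.append(luma_psnr(a, b, width, height))
    return results
```

For the command line quoted earlier, the recon side would be 111.yuv at 1920x1080; the source would first have to be dumped to a raw .yuv file as well, since it only exists in the pipe there.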
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 19th November 2018, 11:55   #6486  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 697
Quote:
Originally Posted by LigH View Post
... which may still be possible only under Linux, I believe; are you sure your Windows build contains any VMAF comparison code? The build script source\CMakeLists.txt contains the check clearly in a "if(UNIX)" block.
I created a Windows build of 2.9+8. I changed almost nothing. The codec only lacks 'threads' for VMAF, as I wrote earlier.
https://www.sendspace.com/file/r90y0d
It can probably also be built with MSVC.

Last edited by Jamaika; 19th November 2018 at 11:57.
Old 19th November 2018, 18:40   #6487  |  Link
Barough
Registered User
 
Join Date: Feb 2007
Location: Sweden
Posts: 480
x265 v2.9+8-27d8424c799d (32 & 64-bit 8/10/12bit Multilib Windows Binaries)

Code:
https://bitbucket.org/multicoreware/x265/commits/branch/stable
Old 21st November 2018, 11:54   #6488  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
Quote:
Originally Posted by Wolfberry View Post
(64-bit GCC 8.2.0 8+10+12bit multilib / ICC 19.0 8/10/12 cli+shared)
The ICC binaries don't work on my Win10 (missing DLLs).
Old 22nd November 2018, 00:58   #6489  |  Link
Natty
Noob
 
Join Date: Mar 2017
Posts: 221
qcomp

Hi, I would like to know, in simple language, how qcomp works and what its impact on bitrate is when it is lowered or raised from its default value.
Old 22nd November 2018, 09:11   #6490  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,752
--qcomp <float>
Quote:
qComp sets the quantizer curve compression factor. It weights the frame quantizer based on the complexity of residual (measured by lookahead). It’s value must be between 0.5 and 1.0. Default value is 0.6. Increasing it to 1.0 will effectively generate CQP.
The default value 0.6 is a balance between a constant quantizer (regardless of the video content) and the complexity of the video content (degree of details and amount of motion) providing chances to spare bitrate by increasing the quantizer slightly in scenes where it may be sufficient to preserve enough quality with little noticeable loss.

IIRC, if you could decrease it to 0.0, the encoder would try its best to keep a constant bitrate (CBR), which would cause a widely varying amount of quality loss (I might be wrong here, for x265, though). Increasing it to 1.0 instead would cause a constant quantization, which would not take advantage of the possible ways to spare bitrate in scenes where adequate quality could already be achieved with less bitrate, at a coarser quantization than the target.

You may increase this value a little (e.g. towards 0.8) when you notice that there is too much loss of precision in areas with very little detail, e.g. darkness and smooth ramps, especially in cases when your target bitrate is rather low. On the other hand, there may be other (psycho-visual) options to let the encoder not spare too much bitrate.
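
The two extremes can be illustrated with a toy model. This is a simplified sketch, assuming the x264-style relation in which a frame's quantizer scale is its complexity raised to the power (1 - qcomp), divided by a rate factor. The function names, the bit estimate and the sample complexities are hypothetical, and real x265 rate control adds lookahead blurring, AQ, CU-tree and VBV on top. A qcomp of 0.0 is outside the documented range and appears here only to show the limit.

```python
def qscale(complexity, qcomp, rate_factor=1.0):
    """Simplified model: qscale = complexity ** (1 - qcomp) / rate_factor."""
    return complexity ** (1.0 - qcomp) / rate_factor


def relative_bits(complexity, qcomp, rate_factor=1.0):
    """Toy bit estimate: bits grow with complexity and shrink with qscale."""
    return complexity / qscale(complexity, qcomp, rate_factor)


if __name__ == "__main__":
    easy, hard = 100.0, 10000.0  # made-up complexities of a calm and a busy frame
    for qcomp in (0.0, 0.6, 1.0):
        print(qcomp,
              qscale(easy, qcomp), qscale(hard, qcomp),
              relative_bits(easy, qcomp), relative_bits(hard, qcomp))
```

In this model, qcomp 1.0 gives both frames the same quantizer scale, so the busy frame gets proportionally more bits, while qcomp 0.0 gives both frames the same number of bits, so the busy frame gets a much coarser quantizer. The default 0.6 sits in between.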
Old 22nd November 2018, 12:15   #6491  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
Quote:
Originally Posted by Wolfberry View Post
@Ma Test version available. If any of these works, some benchmarks will be appreciated.
x265_MT.exe works, thanks! I will run some tests with the '--no-asm' option to compare only the C++ compilers.
Old 22nd November 2018, 21:04   #6492  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
Test platform: Win10 64-bit home, i7 8700 + be quiet pure rock, 16 GB RAM DDR4 @ 3866
Command line (only 8-bit encoding):
x265 --no-asm --crf 20 ../Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m w.hevc

Results in fps (encoding speed, mean value from 2 runs):
8.98 fps -- ICC AVX2
8.41 fps -- GCC 9.0 AVX2 ucrt
7.36 fps -- GCC 8.2 AVX2
7.34 fps -- GCC 8.2 AVX2 ucrt
7.12 fps -- GCC 7.3 AVX2 ucrt
7.11 fps -- GCC 6.5 AVX2 ucrt
6.30 fps -- GCC 5.5 AVX2 ucrt
5.93 fps -- GCC 8.2 generic Barough build
5.57 fps -- GCC 4.9.4 AVX2 ucrt
5.06 fps -- VS 2017 AVX2
4.89 fps -- VS 2015 AVX2
4.72 fps -- GCC 4.8.5 AVX2 ucrt

ucrt means Universal CRT (the replacement for msvcrt.dll).
Results with asm were 29 to 30 fps for all contenders (full results in screen.txt).

ICC 19 is the clear winner, with GCC 9 in second place. VS 2017/2015 are really slow without asm (but with asm they are good, even the best).
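
To put the spread into numbers, the --no-asm results above can be normalized to the slowest build. The short Python sketch below does that; the fps values are copied from the list, and the function name is hypothetical.

```python
# --no-asm results in fps, copied from the list above.
RESULTS_FPS = {
    "ICC AVX2": 8.98,
    "GCC 9.0 AVX2 ucrt": 8.41,
    "GCC 8.2 AVX2": 7.36,
    "GCC 8.2 AVX2 ucrt": 7.34,
    "GCC 7.3 AVX2 ucrt": 7.12,
    "GCC 6.5 AVX2 ucrt": 7.11,
    "GCC 5.5 AVX2 ucrt": 6.30,
    "GCC 8.2 generic Barough build": 5.93,
    "GCC 4.9.4 AVX2 ucrt": 5.57,
    "VS 2017 AVX2": 5.06,
    "VS 2015 AVX2": 4.89,
    "GCC 4.8.5 AVX2 ucrt": 4.72,
}


def relative_to_slowest(results):
    """Return each build's speed as a multiple of the slowest build."""
    slowest = min(results.values())
    return {name: fps / slowest for name, fps in results.items()}


if __name__ == "__main__":
    for name, ratio in relative_to_slowest(RESULTS_FPS).items():
        print(f"{ratio:5.2f}x  {name}")
```

The fastest build comes out at roughly 1.9 times the slowest one, while the 29 to 30 fps reached with asm is more than three times the best --no-asm result.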
Attached Files
File Type: txt screen.txt (59.8 KB, 64 views)
Old 22nd November 2018, 23:55   #6493  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,345
Thanks Ma for those tests. Wow, that's a large % variation in speed
Old 23rd November 2018, 05:59   #6494  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,883
Quote:
Originally Posted by Ma View Post
Test platform: Win10 64-bit home, i7 8700 + be quiet pure rock, 16 GB RAM DDR4 @ 3866
Command line (only 8-bit encoding):
x265 --no-asm --crf 20 ../Bosphorus_1920x1080_120fps_420_8bit_YUV.y4m w.hevc

Results in fps (encoding speed, mean value from 2 runs):
8.98 fps -- ICC AVX2
8.41 fps -- GCC 9.0 AVX2 ucrt
7.36 fps -- GCC 8.2 AVX2
7.34 fps -- GCC 8.2 AVX2 ucrt
7.12 fps -- GCC 7.3 AVX2 ucrt
7.11 fps -- GCC 6.5 AVX2 ucrt
6.30 fps -- GCC 5.5 AVX2 ucrt
5.93 fps -- GCC 8.2 generic Barough build
5.57 fps -- GCC 4.9.4 AVX2 ucrt
5.06 fps -- VS 2017 AVX2
4.89 fps -- VS 2015 AVX2
4.72 fps -- GCC 4.8.5 AVX2 ucrt
.
Very interesting.
I knew that Intel Parallel Studio (and its compiler) was good, but what surprises me is that GCC has become better and better.
Visual Studio used to be good for AVX2, while GCC used to be better for SSE2/SSSE3/SSE4.1, but perhaps things have changed and GCC now totally outperforms Visual Studio.
Old 23rd November 2018, 09:15   #6495  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
Do keep in mind that these tests are absolutely disconnected from reality. No one is going to run something like x265 without ASM, so for any real-world use these numbers are meaningless.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 26th November 2018, 07:17   #6496  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,883
Quote:
Originally Posted by nevcairiel View Post
Do keep in mind that these tests are absolutely disconnected from reality. No one is going to run something like x265 without ASM, so for any real-world use these numbers are meaningless.
Well, of course.
Still, in an ideal world, compilers would be able to produce optimized assembly code as fast as manually written intrinsics, so there would be no need to write them by hand.

Unfortunately, that's still a utopia.

Anyway, for 10/12bit x265 on x86 32bit systems (for which there are no manually written intrinsics available and builds rely on compiler optimization only), compiler "speed tests" are kinda useful. ^_^
Old 26th November 2018, 10:38   #6497  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
Quote:
Originally Posted by FranceBB View Post
Still, in an ideal world, compilers would be able to produce optimized assembly code as fast as manually written intrinsics, so there would be no need to write them by hand.

Unfortunately, that's still a utopia.
And it will always remain nothing but a dream. A compiler does not have enough information about the restrictions and requirements of the algorithm to perform the same sort of optimization a developer can do when manually writing ASM - especially with advanced SIMD.

PS:
Anyone that runs on a 32-bit system deserves what they get. Upgrade already, stop wasting developers' and your own time. A simple change from 32-bit to 64-bit on the same hardware will yield a massive speedup already. And if you're encoding with x265 on hardware that's not even 64-bit compatible, then you should *really* upgrade.

Last edited by nevcairiel; 26th November 2018 at 10:40.
Old 26th November 2018, 18:42   #6498  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by nevcairiel View Post
And it will always remain nothing but a dream. A compiler does not have enough information about the restrictions and requirements of the algorithm to perform the same sort of optimization a developer can do when manually writing ASM - especially with advanced SIMD.
Yeah, QFT++. Autovectorization has been a dream for decades, and it’s never gotten anywhere near what hand assembly can do. Same with autoparallelization. A good compiler and the right code can maybe get 2-3x faster. Intel’s whole Itanium debacle was premised on, and failed because of, overly optimistic assumptions about compilers being able to do this stuff.

Apple switching to Intel was such a boon to the industry because it eliminated the need for media apps to implement on both SSEx and AltiVec. You couldn’t even port between them; often the whole algorithm had to be refactored to get decent performance.

As it is, x265 is probably some of the most advanced and complex SIMD code on the planet, with a lot of complex threading to boot. It’s likely about the worst case for a speed gap between compiler-generated SIMD and hand-coded SIMD.

Quote:
Anyone that runs on a 32-bit system deserves what they get. Upgrade already, stop wasting developers' and your own time. A simple change from 32-bit to 64-bit on the same hardware will yield a massive speedup already. And if you're encoding with x265 on hardware that's not even 64-bit compatible, then you should *really* upgrade.
Are people actually still doing this? I can’t imagine how slow x265 must be on pre-x64 hardware. I don’t think I’ve had a machine NOT running 64-bit since Windows 7 launched, and everything I was running when Win 7 launched was already 64-bit capable.

The joules per pixel on a pre-x64 system has to be a couple of orders of magnitude worse than what the latest Intel and AMD processors deliver. An upgrade would pay for itself quickly in lowered electricity & cooling costs alone!
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 26th November 2018, 20:24   #6499  |  Link
qyot27
...?
 
Join Date: Nov 2005
Location: Florida
Posts: 1,419
Quote:
Originally Posted by benwaggoner View Post
Are people actually still doing this? I can’t imagine how slow x265 must be on pre-x64 hardware. I don’t think I’ve had a machine NOT running 64-bit since Windows 7 launched, and everything I was running when Win 7 launched was already 64-bit capable.
It's been a long, long time since I ever tried, although I do keep a full set of 32-bit build instructions for all the pieces in my FFmpeg/mpv build guide. For posterity, mostly.

My guess would be that you could get decent encode times with x265 on a PIII only by encoding at most 480p under --preset ultrafast at 8bit (since the >8bit ASM has to be explicitly turned off to build for 32-bit). And possibly not even then, as I'm pretty sure we'd still be looking at maybe 5fps. At that point you're dealing with strictly academic 'because I can' types of things, and you'd absolutely get better framerates (at a more precise preset) by just using x264 instead.

Quote:
The joules per pixel on a pre-x64 system has to be a couple of orders of magnitude worse than what the latest Intel and AMD processors deliver. An upgrade would pay for itself quickly in lowered electricity & cooling costs alone!
Exactly. My Coppermine system was my main system until 2015, and it only held out that long because I had no income up until then (it is still alive, but now serves as a file archive).

Inexpensive mini-PCs have really filled the gap here, and get vastly better performance than an ancient system like that would get. Even with the power draw constraints and lack of AVX 1 or 2, Bay Trail-T (and now Apollo Lake, since I inadvertently fried the other one) could run circles around the Coppermine while being dead silent because the power consumption is so low they don't even need a cooling fan. Plus access to better SIMD - up to SSE4.2, plus the AES stuff - and multithreading. Running 64-bit OSes can be a bit of a task - Bay Trail-T era machines would normally ship only with 32-bit versions of Windows on tiny eMMC storage (and since they come with 32-bit UEFI, you can't use 64-bit Windows, although 64-bit Linux distros loaded from flash drives or external USB hard drives are an option), but by now 64-bit Windows installs seem common, along with the option of putting secondary SSDs into the system.

I've done 4K->4K and 4K->1080p transcodes on the Apollo Lake at about 5fps (ultrafast, 10bit, preserving HDR, crf 18), and at least Apollo Lake has a 10bit HEVC decoder in the GPU. Had I known the exact way to get that enabled in mpv at the time, I wouldn't have even bothered trying to transcode. But it let me get working figures, and I suppose 1080p 10bit HEVC would be less of a burden on the GPU than 4K would, so there's that.

Last edited by qyot27; 26th November 2018 at 20:30.
Old 27th November 2018, 17:24   #6500  |  Link
aegisofrime
Registered User
 
Join Date: Apr 2009
Posts: 478
Quote:
Originally Posted by Wolfberry View Post
x265 v2.9+8-27d8424c799d
(64-bit GCC 8.2.0 8+10+12bit multilib / ICC 19.0 8/10/12 cli+shared)
Apologies, I'm only seeing a v2.9+9 build on that, and that's built with GCC 8.2. I can't find an ICC build in that Google Drive, or am I blind?