Old 29th May 2019, 09:38   #1  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
Quality evaluation I cannot fully understand

Working from the general idea that quantization is the only lossy step, I thought encode quality depended on it alone. I thought prediction determined the encode size but not the quality.

A simple test proves me wrong.
I've run two encodes with different presets, --preset veryfast and --preset veryslow, with all other options the same (--tune psnr --qp 8 --ipratio 1.0 --pbratio 1.0) on 1224 frames. The encodes are 10-bit 1080p HDR from a 2160p HDR source, Blade Runner 2049, frames 217348-218574.

The encoding results are shown in the attachment.
I've evaluated the encodes' quality with the FFmpeg psnr filter, using sqrt(mse) as a more intuitive metric.
The test results show worse quality for --preset veryfast and a bigger file size for --preset veryslow.
Code:
                    SQRT(mse)                         PSNR
                    avg     y       u       v         avg      y
--preset veryfast   1.756   1.909   1.537   1.246     56.490   55.250
--preset veryslow   1.558   1.660   1.442   1.210     57.040   55.860
veryslow sqrt(y) / veryfast sqrt(y) ratio = 0.869, i.e. veryslow sqrt(y) is 13% lower

--preset veryslow: 144,751,026 bytes vs --preset veryfast: 122,153,342 bytes, so the --preset veryslow file is 18.5% bigger.

I can see these results but cannot explain them. Which options actually produce the lower sqrt(mse) value (the better quality) for --preset veryslow, and how?
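
For reference, a minimal sketch of how sqrt(mse) relates to PSNR for 10-bit video. It assumes a peak code value of 1023 and the usual PSNR definition; the function names are only illustrative. Averages of per-frame values do not convert exactly into each other, so this is not expected to reproduce the table figures digit for digit.

```python
import math


def rmse_from_mse(mse: float) -> float:
    """Square root of the mean squared error, in code values."""
    return math.sqrt(mse)


def psnr_from_rmse(rmse: float, bit_depth: int = 10) -> float:
    """PSNR in dB for a given RMSE, assuming peak = 2**bit_depth - 1."""
    peak = (1 << bit_depth) - 1
    return 20.0 * math.log10(peak / rmse)


# Luma sqrt(mse) values taken from the table in this post.
for label, rmse_y in (("veryfast", 1.909), ("veryslow", 1.660)):
    print(f"{label}: sqrt(mse) = {rmse_y:.3f} -> PSNR of about {psnr_from_rmse(rmse_y):.2f} dB")
```

A lower sqrt(mse) always maps to a higher PSNR, so both columns of the table rank the presets the same way.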
Attached Files
File Type: txt encoding -debug mode- veryfast vs veryslow qp=8.txt (7.3 KB, 10 views)
Old 29th May 2019, 17:19   #2  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by SpasV View Post
With a general idea the Quantization is the only processing that can be lossy, I thought the encode quality depends on it only. I thought the Prediction determines the encode size but not the quality.

Way more things happen than simple quantization! Whole different tools get turned on in higher presets that aren't available at lower presets. Psychovisual modeling and optimization are applied. These might have an indirect impact on quantization, but it certainly isn't linear.

At higher presets, the correlation between QP and subjective quality probably worsens, since so many other ways to improve subjective quality get applied.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 29th May 2019, 19:16   #3  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
Thanks.
x265 is complex software, and controlling the encoding process is not trivial.
By trial and error I've found the option --rdoq-level <0|1|2> to be crucial in this case.
Quote:
--rdoq-level
Specify the amount of rate-distortion analysis to use within quantization:
At level 2 rate-distortion cost is used to make decimate decisions on each 4x4 coding group, ...
I've set --rdoq-level 2 with --preset medium and I've got good encoder performance and quality.
Code:
                                   SQRT(mse)                         PSNR
                                   avg     y       u       v         avg      y
--preset veryfast                  1.756   1.909   1.537   1.246     56.490   55.250
--preset veryslow                  1.558   1.660   1.442   1.210     57.040   55.860
--preset medium + --rdoq-level 2   1.587   1.697   1.451   1.224     56.800   55.590
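
A small sketch of how the three encodes in the table could be described as x265 command lines. The file names are hypothetical, and it assumes the same --tune psnr --qp 8 --ipratio 1.0 --pbratio 1.0 options as in the first post; the helper only builds the argument list and does not run the encoder.

```python
from typing import List, Optional


def x265_args(src: str, dst: str, preset: str, qp: int,
              rdoq_level: Optional[int] = None) -> List[str]:
    """Build an x265 argument list for a constant-QP test encode."""
    args = ["x265", "--input", src, "--output", dst,
            "--preset", preset, "--tune", "psnr",
            "--qp", str(qp), "--ipratio", "1.0", "--pbratio", "1.0"]
    if rdoq_level is not None:
        # Override the preset's default RDOQ level only when asked to.
        args += ["--rdoq-level", str(rdoq_level)]
    return args


# Hypothetical input and output names, for illustration only.
for preset, rdoq in (("veryfast", None), ("veryslow", None), ("medium", 2)):
    print(" ".join(x265_args("clip.y4m", f"clip_{preset}.hevc", preset, 8, rdoq)))
```

Keeping everything except the preset and --rdoq-level fixed makes it easier to attribute a change in sqrt(mse) to that one option.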
Old 31st May 2019, 21:42   #4  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 3,530
Not that either of your metrics correlates particularly well with human perception of quality...

Have you tried using the slow preset? It is only one step slower than medium, and it is the fastest preset that enables rdoq-level 2 by default. I have found the presets to be very good for quality vs. speed tradeoffs.
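
As a rough sketch of the preset behaviour described in this thread (slow and slower presets default to rdoq-level 2, medium and faster presets to 0), the lookup below orders the ten x265 presets from fastest to slowest. The helper name is made up, and the defaults should be checked against the documentation of the x265 build in use.

```python
# The ten x265 presets, ordered from fastest to slowest.
PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
           "medium", "slow", "slower", "veryslow", "placebo"]


def default_rdoq_level(preset: str) -> int:
    """Default --rdoq-level as described in this thread (an assumption)."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    # "slow" is the fastest preset that turns on rdoq-level 2.
    return 2 if PRESETS.index(preset) >= PRESETS.index("slow") else 0


for name in ("veryfast", "medium", "slow", "veryslow"):
    print(name, default_rdoq_level(name))
```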
__________________
madVR options explained
Old 2nd June 2019, 14:34   #5  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
I'm not an encoding fan. When I do encode, my criterion is maximum quality at an acceptable size, which can be met by downsizing the source frame, for example from UHD to HD. At HD frame size and an average of 0.6 - 0.7 bpp, which is the usual BluRay quality, the encode size is pretty good.
For this reason I don't rely on human perception, because at high quality there are no differences for a human to see.
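
A minimal sketch of the bpp arithmetic, assuming bpp means encoded bits per pixel per frame. The 1920x1080 frame size and 24 fps in the example are assumptions for illustration, not figures from the encodes.

```python
def bits_per_pixel(size_bytes: int, frames: int, width: int, height: int) -> float:
    """Average encoded bits spent per pixel per frame."""
    return size_bytes * 8 / (frames * width * height)


def bitrate_for_bpp(bpp: float, width: int, height: int, fps: float) -> float:
    """Bitrate in bits per second needed to reach a target bpp."""
    return bpp * width * height * fps


# Assumed 1920x1080 at 24 fps: 0.65 bpp corresponds to roughly 32 Mbps.
print(f"{bitrate_for_bpp(0.65, 1920, 1080, 24.0) / 1e6:.1f} Mbps")
```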

And for this reason I use --tune psnr (disables adaptive quant, psy-rd, and cutree).

Nevertheless, what would you say about these specific options I've used:
Code:
--max-merge 4 --rdoq-level 2 --rd 4 --no-early-skip --rc-lookahead 40 --preset medium --tune psnr --qp 10 --ipratio 1.1 --pbratio 1.2 --no-deblock --no-sao
which were described as "beyond bad settings".
Thanks.

Last edited by SpasV; 2nd June 2019 at 14:56.
Old 2nd June 2019, 17:40   #6  |  Link
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 363
@SpasV, you should have some deblocking, at least --deblock -3:-3, because deblocking not only deblocks the image but also helps compress the video more.
Old 2nd June 2019, 23:23   #7  |  Link
sonnati
Registered User
 
Join Date: Jun 2008
Posts: 17
Quote:
Originally Posted by SpasV View Post
With a general idea the Quantization is the only processing that can be lossy, I thought the encode quality depends on it only. I thought the Prediction determines the encode size but not the quality.

The reason why "very slow" provides higher PSNR than "very fast" even at fixed QP is that recovery of information from previous frames (motion estimation and compensation) is better for slower presets. This means that the remaining delta signal carries less info and the given QP eliminates less info. Therefore the final amount of info (prediction + quantized delta) is higher for slower presets.

It is less understandable why slower presets require so much more data rate at the same QP if the delta signal is smaller... Part of it is probably due to higher signaling costs for more accurate motion estimation and compensation, but in your example there's something more.

Last edited by sonnati; 2nd June 2019 at 23:28.
Old 3rd June 2019, 08:45   #8  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
Quote:
Originally Posted by sonnati View Post
The reason why "very slow" provides higher PSNR than "very fast" even at fixed QP is because recovery of information from previous frames (motion estimation and compensation) is better for slower presets. This means that the remaining delta signal carries less info and the given QP eliminates less info. Therefore the final amount of info (prediction + quantized delta) is higher for slower presets.

It is less understandable why slower presets require so much more data rate at the same QP if delta signal is smaller...a part is probably due to higher signaling costs for more accurate motion estimation and compensation...but in your example there's something more
Thanks.
What I found was that the mse was better at --rdoq-level 2.
Quote:
--rdoq-level <0|1|2>
Specify the amount of rate-distortion analysis to use within quantization:

At level 0 rate-distortion cost is not considered in quant,
At level 1 rate-distortion cost is used to find optimal rounding values for each level,
At level 2 rate-distortion cost is used to make decimate decisions on each 4x4 coding group.
Level 2 is active at presets slower than medium (slow and up).
Level 0 is active at presets faster than slow (medium and below).
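
To make the idea of rate-distortion cost inside quantization concrete, here is a toy sketch. It is not x265's algorithm: the rate model is invented for illustration, whereas a real encoder estimates bits from its entropy coder, and level 2 additionally decides whether whole 4x4 coefficient groups are zeroed. The sketch only shows the general principle of picking the quantized level that minimizes distortion + lambda * rate instead of always rounding to the nearest level.

```python
import math


def plain_quant(coeff: float, step: float) -> int:
    """Round-to-nearest quantization, no rate-distortion analysis."""
    return int(math.copysign(math.floor(abs(coeff) / step + 0.5), coeff))


def toy_rate_bits(level: int) -> float:
    """Hypothetical rate model: larger levels cost more bits."""
    return 1.0 + 2.0 * math.log2(1 + abs(level))


def rdo_quant(coeff: float, step: float, lam: float) -> int:
    """Pick the level (zero, floor or floor+1) with the lowest D + lambda*R."""
    base = int(abs(coeff) // step)
    best_level, best_cost = 0, None
    for mag in sorted({0, base, base + 1}):
        dist = (abs(coeff) - mag * step) ** 2
        cost = dist + lam * toy_rate_bits(mag)
        if best_cost is None or cost < best_cost:
            best_level, best_cost = mag, cost
    return int(math.copysign(best_level, coeff))


# The same coefficient quantized with increasing weight on rate.
for lam in (0.0, 5.0, 100.0):
    print(lam, plain_quant(7.4, 2.0), rdo_quant(7.4, 2.0, lam))
```

With lambda at zero the choice matches plain rounding; as lambda grows, smaller levels (and eventually zero) win because they are cheaper to code.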

Probably, it is worth looking at the code.

Last edited by SpasV; 3rd June 2019 at 09:02.
Old 13th July 2019, 20:30   #9  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
x265: compression efficiency with CRF and CQP

Quote:
x265 has ten predefined --preset options that optimize the trade-off between encoding speed (encoded frames per second) and compression efficiency (quality per bit in the bitstream).
I've decided to look at the compression efficiency considering the two Rate Control Modes - CRF and CQP.

In my understanding, the native rate control for video compression is constant QP (CQP). It is a "pure" mathematical method of compressing video. CRF, along with AQ, --psy-rd and --psy-rdoq, aims at uniform quality and improved perceived visual quality in relatively low-quality encodes.

I've run simple comparison tests in order to get some estimates.

The setup is simplified. The source is the Mad Max: Fury Road (2015) UHD BluRay HDR; the encodes are 1080p 10-bit HDR in CRF and CQP modes.
Small clips, around 800 frames, --preset slower --no-cutree --ipratio 1.0 --pbratio 1.0 (so that all frames have the same QP).
In a probing attempt with CRF 15 --qcomp 0.9 --qpstep 1, I got (I-frames) Avg QP:12.18, (P-frames) Avg QP:12.12, (B-frames) Avg QP:12.09, so I used QP 12 for the CQP mode.
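
The QP choice above can be sketched as a one-liner. This assumes a plain unweighted mean over the three frame types (the frame counts per type are not given here), and the function name is only illustrative.

```python
from statistics import mean

# Average QP per frame type reported by the CRF 15 probe encode.
probe_avg_qp = {"I": 12.18, "P": 12.12, "B": 12.09}


def pick_cqp(avg_qp_by_type: dict) -> int:
    """Round the unweighted mean of the per-type average QPs."""
    return round(mean(list(avg_qp_by_type.values())))


print(pick_cqp(probe_avg_qp))
```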

I've encoded four clips. The clips are from regions where the 2160p BluRay source stream runs at 70 Mbps and 30 Mbps.



I've run VMAF, PSNR, SSIM and MS-SSIM tests with vfam-master\Release\vmafossexec.exe and Model\vmaf_v0.6.1.pkl.

A first look at the results.
Code:
Low Bitrate 30 Mbps
                       VMAF     PSNR     SSIM      MS-SSIM   Size (MB)   Less
Frames       CRF       99.590   54.036   0.99927   0.99911   106.876
38594-777    CQP       99.518   53.898   0.99917   0.99900    95.441
             CQP/CRF    0.999    0.997   0.99990   0.99988    89.30%     10.70%

Frames       CRF       96.853   54.304   0.99936   0.99923    67.881
43505-793    CQP       96.873   54.006   0.99902   0.99892    50.705
             CQP/CRF    1.000    0.995   0.99966   0.99969    74.70%     25.30%

High Bitrate 70 Mbps
                       VMAF     PSNR     SSIM      MS-SSIM   Size (MB)   Less
Frames       CRF       99.742   53.453   0.99924   0.99907   113.454
7066-772     CQP       99.724   53.325   0.99921   0.99900    95.441
             CQP/CRF    1.000    0.998   0.99996   0.99994    84.12%     15.88%

Frames       CRF       99.541   53.804   0.99944   0.99934    56.662
116425-786   CQP       99.547   54.159   0.99931   0.99925    52.005
             CQP/CRF    1.000    1.007   0.99987   0.99990    91.78%      8.22%
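
The size columns can be recomputed with a short sketch. The sizes are copied from the table; the helper name is only illustrative.

```python
# (CRF size in MB, CQP size in MB) per clip, copied from the table.
CLIPS = {
    "38594-777": (106.876, 95.441),
    "43505-793": (67.881, 50.705),
    "7066-772": (113.454, 95.441),
    "116425-786": (56.662, 52.005),
}


def cqp_vs_crf(crf_mb: float, cqp_mb: float) -> tuple:
    """Return (CQP size as % of CRF size, % saved by CQP)."""
    ratio = 100.0 * cqp_mb / crf_mb
    return round(ratio, 2), round(100.0 - ratio, 2)


for frames, (crf_mb, cqp_mb) in CLIPS.items():
    ratio, less = cqp_vs_crf(crf_mb, cqp_mb)
    print(f"{frames}: CQP is {ratio:.2f}% of CRF, {less:.2f}% less")
```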
A second look: the bits allocated to the frames.
The first chart shows the bits per frame, as reported by x265, for frames 116425-(786).
To the left of I-frame 500 (actually frame 116924 = 116424+500), that is, earlier in time, there is a region where the CRF frames get more bits than the CQP frames. Part of that region is shown below the first chart.

What follows are a couple of comparison screenshots for frame 499.
The screenshots are 8-bit color.
The coordinates of the compared pixels (marked with a black dot) and their 10-bit YUV code values are shown in the boxes at the bottom right.
(The pixel with coordinates X:1031 Y:231 is in the sunlight reflecting spot on the left eye.
The pixel with coordinates X:900 Y:400 is under the nose on the right.)




Last edited by SpasV; 13th July 2019 at 20:38.
Old 13th July 2019, 22:19   #10  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 3,530
This kind of comparison is of limited value, the sample is too short and the sizes are too different. What are you trying to learn? What is the purpose of looking at the single pixels?
Old 14th July 2019, 12:59   #11  |  Link
SpasV
Registered User
 
Join Date: Nov 2013
Location: Sofia, Bulgaria
Posts: 49
Quote:
Originally Posted by Asmodian View Post
This kind of comparison is of limited value, the sample is too short and the sizes are too different.
Yes, this is not a research paper.
Quote:
Originally Posted by Asmodian View Post
What are you trying to learn?
In essence: do I need the AQ and psy options when I aim for a high-quality encode?
Although the "comparison is of limited value", my impression is that I do not need these options. CQP seems perfect to me as long as I have chosen the QP and it stays unchanged.
Quote:
Originally Posted by Asmodian View Post
What is the purpose of looking at the single pixels?
The purpose is to give the reader an impression of how close the images are. Although the brain can perceive the whole 1920x800 = 1,536,000-pixel image in an instant, it cannot distinguish pixels, or even pixel areas, if their values are very close.
As to "the single pixels", well, let's look at the whole stream of 786 frames of 1,536,000 pixels each.
Here is a chart of RMSE for the stream.


RMSE stands for Root Mean Squared Error (the square root of the mean squared error), which is intuitively easier to understand.
The average RMSEs: CRF - 2.118, CQP - 2.054.
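
For clarity, a minimal sketch of the per-frame RMSE computation, assuming each frame plane is available as a flat sequence of 10-bit code values. The sample values are made up for illustration.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[int], encoded: Sequence[int]) -> float:
    """Root mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(encoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((r - e) ** 2 for r, e in zip(reference, encoded))
    return math.sqrt(squared / len(reference))


# Made-up 10-bit code values with differences of 1, 2, 0 and 3.
print(round(rmse([512, 600, 64, 940], [513, 598, 64, 943]), 3))
```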
It is difficult to deal with these numbers without knowing the distribution of the differences. Nevertheless, I'll try to give an understandable interpretation, assuming that every pixel's code value differs from the source and that no difference is larger than three.
In other words, I'm assuming the differences are 1, 2, and 3.
Here are possible distributions of such differences.

It seems likely to me that 2% of pixels differ by 3, 33.8% by 2 and 69.7% by 1.
In fact, the distribution doesn't matter. Such 10-bit color frames are practically indistinguishable (difference < 4) when shown as 8-bit color, because such differences shrink to 0 or at most 1 in 8-bit code values.
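
A small sketch of what happens to 10-bit differences below 4 when shown as 8-bit. It assumes the conversion is a plain truncation (right shift by 2); a real display chain may round or dither instead. Under that assumption two values that straddle a multiple of 4 still end up one 8-bit step apart, while all other pairs collapse to the same value.

```python
def to_8bit(value_10bit: int) -> int:
    """Convert a 10-bit code value to 8-bit by dropping the two low bits."""
    return value_10bit >> 2


def max_8bit_diff(max_diff_10bit: int) -> int:
    """Largest 8-bit difference over all 10-bit pairs within max_diff_10bit."""
    worst = 0
    for a in range(1024):
        for d in range(1, max_diff_10bit + 1):
            b = a + d
            if b > 1023:
                continue
            worst = max(worst, to_8bit(b) - to_8bit(a))
    return worst


print(to_8bit(512), to_8bit(515))  # same 8-bit value
print(to_8bit(511), to_8bit(512))  # adjacent 8-bit values
print(max_8bit_diff(3))
```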
Old Yesterday, 00:23   #12  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,874
Quote:
Originally Posted by SpasV View Post
I've decided to look at the compression efficiency considering the two Rate Control Modes - CRF and CQP...

The setup is simplified. The source - Mad Max: Fury Road 2015 UHD BluRay HDR, the encodes -1080p 10-bit HDR CRF and CQP.
I’ve run VMAF PSNR SSIM MS-SSIM tests with vfam-master\Release\vmafossexec.exe and Model\vmaf_v0.6.1.pkl.
None of those metrics have demonstrated good subjective correlation with HDR content. Specifically, VMAF doesn't even claim to produce accurate scores with HDR content.

Also, there are absolutely psychovisual optimizations that improve subjective quality while harming all objective metrics, even VMAF. This kind of comparison with HDR content really needs to be done subjectively at this point.