Doom9's Forum > Video Encoding > High Efficiency Video Coding (HEVC)

4th January 2020, 00:08   #21
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by froggy1
What's the point then of having a CPU with plenty of threads and then not being able to utilize it fully during encoding, for the sake of a quality improvement so minor that you won't notice it unless you use a magnifying glass to look at still pictures?
-F 1 --pmode will saturate 12+ cores at 1080p. The quality benefit from turning off frame threading is a lot less than it was a few years ago, however.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book

4th January 2020, 00:12   #22
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Forteen88
True, 10 bits is the way to go with x265. Weird, I thought that 10 bits was the default --output-depth value in x265, but I just looked it up and saw that it's not (at least not in the x265 build I use); the default is 8!
With an 8-bit source, there's no real reason to encode in 10-bit in x265. Encoding 8-bit will be significantly faster and the output will be more compatible with older mobile devices. Encoding 8-bit in 10-bit provided a measurable benefit with x264, but HEVC is designed to fix the issues in 8-bit that made that valuable with H.264.

You might want to add: --frame-threads 1
to the command line for better picture quality, but if you have many cores on your CPU, the encode will take longer (since x265 doesn't use all cores then).

Quote:
UPDATE: froggy1, OK, --frame-threads 2 might be a better value to set, for more CPU-core usage.
Yeah, 2 is pretty safe.

4th January 2020, 00:19   #23
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
I think there is a lot of boiling the ocean here with changing deblock, sao, bit depth, etcetera. Those really don't have nearly the same impact in x265 3.1+ compared to a few years ago.

I'd start with just --preset slower --crf 18 and see what comes out. Given the weakness of the original encode (only 1 ref frame, no b-frames, probably a HW encoder) I'd expect that x265 could deliver the same quality in less than half the bitrate. Find some interesting 5 min chunks and do test encodes of those. The big question is what CRF offers sufficient transparency to the source, which certainly already has visible compression artifacts: particularly detail loss, maybe some blocking as well. 5 Mbps for 1080p60 with that limited AVC feature set is WAY low. I'd want more like 80 with those settings to have a hope at reasonable source quality.
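
The test-encode approach described above could be scripted roughly as follows. This is only an illustrative sketch: the file names, the chunk start times and the CRF values other than 18 are made-up placeholders, and the source is assumed to have been decoded to Y4M first, since the x265 CLI reads raw YUV or Y4M input. --seek and --frames are the x265 options for the first frame to encode and the number of frames.

```python
# Sketch: build x265 command lines for short test encodes at several CRF values.
# File names, chunk start times and the CRF list are made-up placeholders.

FPS = 60                      # the source in this thread is 1080p60
CHUNK_SECONDS = 5 * 60        # "interesting 5 min chunks"


def chunk_commands(source, start_seconds, crfs, preset="slower"):
    """Return one x265 argv list per (chunk start, CRF) combination."""
    commands = []
    for start in start_seconds:
        for crf in crfs:
            out = f"test_s{start}_crf{crf}.hevc"
            commands.append([
                "x265", "--input", source,
                "--preset", preset, "--crf", str(crf),
                "--seek", str(start * FPS),             # first frame to encode
                "--frames", str(CHUNK_SECONDS * FPS),   # number of frames to encode
                "--output", out,
            ])
    return commands


cmds = chunk_commands("source.y4m", start_seconds=[600, 1800], crfs=[18, 20, 22])
for cmd in cmds:
    print(" ".join(cmd))
```

Each printed line is one test encode; comparing the outputs against the source by eye is still the part that decides which CRF is transparent enough.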

4th January 2020, 13:04   #24
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Quote:
Originally Posted by benwaggoner
-F 1 --pmode will saturate 12+ cores at 1080p. The quality benefit from turning off frame threading is a lot less than it was a few years ago, however.
So what is the recommended way to get higher video quality when encoding on ~6 cores? -F 1 --pmode or just -F 2?

Last edited by Forteen88; 4th January 2020 at 13:06.

4th January 2020, 15:49   #25
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Finland
Posts: 5,717
Quote:
Originally Posted by benwaggoner
-F 1 --pmode will saturate 12+ cores at 1080p.
That doesn't happen in my experience. I'm currently encoding a 1920x956 stream with --pmode -F 4 on my Ryzen 3900X (12c/24t), and the CPU usage is ~90%, and that also includes decoding and denoising in the VapourSynth script. Using -F 1 would drop the usage quite a bit lower.
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...

6th January 2020, 05:02   #26
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Boulder
That doesn't happen in my experience. I'm currently encoding a 1920x956 stream with --pmode -F 4 on my Ryzen 3900X (12c/24t), and the CPU usage is ~90%, and that also includes decoding and denoising in the VapourSynth script. Using -F 1 would drop the usage quite a bit lower.
I should have said 12 threads, not 12 cores.

And it depends on other parameters that also impact parallelization. Is -F 4 really materially faster than -F 3? There is threading overhead, so encoding fps is more important than saturation. Doing processing at the same time as encoding can serialize things and also negatively impact L3 caching.

6th January 2020, 05:03   #27
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Forteen88
So what is the recommended way to get higher video quality when encoding on ~6 cores? -F 1 --pmode or just -F 2?
What resolution? At higher presets at 1080p, just -F 1 is probably pretty close to optimal fps.

6th January 2020, 08:37   #28
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Quote:
Originally Posted by benwaggoner
What resolution? At higher presets at 1080p, just -F 1 is probably pretty close to optimal fps.
At 1080p, with --preset slower. Sometimes I do 720p too, but not nearly as often.

6th January 2020, 22:34   #29
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Forteen88
At 1080p, with --preset slower. Sometimes I do 720p too, but not nearly as often.
What fps and utilization are you seeing?

I again caution people to focus more on fps than on utilization. The fastest encoding options for a given quality may run at well below 100% utilization.
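
One way to follow that advice is to time each candidate command and rank the settings by fps alone. The following is a rough sketch, not a finished benchmark tool: the helper names are invented here, the fps figures in the example dictionary are hypothetical, and time_encode would need x265 on the PATH, so it is defined but not called.

```python
# Sketch: compare settings by encoding speed (fps), not by CPU utilization.
import subprocess
import time


def encode_fps(frames, seconds):
    """Frames encoded per second of wall-clock time."""
    return frames / seconds


def time_encode(cmd, frames):
    """Run one encode and return its fps. Needs x265 on PATH; not called here."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return encode_fps(frames, time.perf_counter() - start)


def fastest(results):
    """Pick the label with the highest fps from a {label: fps} mapping."""
    return max(results, key=results.get)


# Hypothetical fps numbers, for illustration only.
measured = {"-F 1": 1.9, "-F 2": 2.1, "-F 2 --pmode": 2.0}
print(fastest(measured))
```

CPU utilization does not appear anywhere in the comparison, which is the point: only throughput at the chosen quality settings is ranked.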

7th January 2020, 05:30   #30
Boulder
Pig on the wing
 
Join Date: Mar 2002
Location: Finland
Posts: 5,717
I did some tests, and in my case it looks like this:

Code:
1080p, ctu 32, F 1 - 1.55 fps - 15209.92 kbps
1080p, ctu 32, F 2 - 1.98 fps - 15160.69 kbps
1080p, ctu 32, F 3 - 2.00 fps - 15216.68 kbps
1080p, ctu 32, F 4 - 2.00 fps - 15218.20 kbps

1080p, ctu 64, F 1 - 0.96 fps - 15183.31 kbps
1080p, ctu 64, F 2 - 1.41 fps - 15195.30 kbps
1080p, ctu 64, F 3 - 1.58 fps - 15194.53 kbps
1080p, ctu 64, F 4 - 1.68 fps - 15176.67 kbps

720p, ctu 32, F 1 - 2.95 fps - 6035.36 kbps
720p, ctu 32, F 2 - 3.98 fps - 6038.02 kbps
720p, ctu 32, F 3 - 4.23 fps - 5942.17 kbps
720p, ctu 32, F 4 - 4.29 fps - 6038.01 kbps
I do use --limit-tu 0 and --limit-refs 1; otherwise the settings are pretty close to --preset slower. The VapourSynth part utilizes ~20-25% of the CPU.
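
To make the scaling in the table above easier to read, the fps figures can be turned into speedups relative to -F 1. The numbers are copied from the table; the helper name is just for this sketch.

```python
# Speedup of each -F setting relative to -F 1, using the fps figures from the table above.
results = {
    "1080p, ctu 32": {1: 1.55, 2: 1.98, 3: 2.00, 4: 2.00},
    "1080p, ctu 64": {1: 0.96, 2: 1.41, 3: 1.58, 4: 1.68},
    "720p, ctu 32": {1: 2.95, 2: 3.98, 3: 4.23, 4: 4.29},
}


def speedups(fps_by_threads):
    """Return {frame_threads: fps / fps at -F 1}, rounded to two decimals."""
    base = fps_by_threads[1]
    return {f: round(fps / base, 2) for f, fps in fps_by_threads.items()}


for label, fps_by_threads in results.items():
    print(label, speedups(fps_by_threads))
```

At 1080p with ctu 32 almost all of the gain comes from going to -F 2 (about 1.28x over -F 1), while with ctu 64 the speed keeps rising up to -F 4 (1.75x).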

7th January 2020, 21:59   #31
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Quote:
Originally Posted by benwaggoner
What fps and utilization are you seeing?

I again caution people to focus more on fps than on utilization. The fastest encoding options for a given quality may run at well below 100% utilization.
I just want to know in general which of those options I should use in x265 on a 6-core CPU (I quite often use a TemporalDegrain2(grainLevel=false) AviSynth script with it); I have never actually used pmode.
I care about power consumption when doing encodes, but I still want very high quality encodes.

froggy1. Yeah, I always set --limit-tu 0 when I set --preset slower.

EDIT: OK, thanks Waggoner.

Last edited by Forteen88; 9th January 2020 at 07:20. Reason: thanking

7th January 2020, 22:19   #32
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Forteen88
I just want to know in general which of those options I should use in x265 on a 6-core CPU (I quite often use a TemporalDegrain2 AviSynth script with it); I have never actually used pmode.
I care about power consumption when doing encodes, but I still want very high quality encodes.

froggy1. Yeah, I always set --limit-tu 0 when I set --preset slower.
--selective-sao 2 is a decent speedup without material quality impact. I am sure it'll be on by default in future presets.

--limit-tu 4 is a better quality/speed tradeoff in my experience.

If you are doing VBV-limited CRF encoding, you can get the same quality at a slightly higher ABR with --rd 4 --dynamic-rd 3. That'll give you full "slower" efficiency at the VBV peak while being faster and a little less efficient when quality isn't limited by VBV.

--pmode is definitely power-inefficient, since it is doing things in parallel that will get thrown away. And often it'll increase utilization AND reduce fps if there aren't enough free cores available. Same with --pme.
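
Put together, the suggestions above could be assembled into a command line along these lines. This is a sketch under assumptions, not a recommended preset: the CRF value, the VBV numbers and the file names are placeholders, and the --rd 4 --dynamic-rd 3 pair is only added when VBV limits are given, since that is the case it was suggested for.

```python
# Sketch: assemble an x265 command line from the options suggested above.
# The VBV numbers, CRF value and file names are placeholders, not recommendations.

def build_command(source, output, crf=18, vbv_maxrate=None, vbv_bufsize=None):
    """Return an x265 argv list based on --preset slower plus the suggested tweaks."""
    cmd = [
        "x265", "--input", source, "--output", output,
        "--preset", "slower", "--crf", str(crf),
        "--selective-sao", "2",   # suggested as a speedup without material quality impact
        "--limit-tu", "4",        # suggested as a better quality/speed tradeoff than 0
    ]
    if vbv_maxrate and vbv_bufsize:
        # Only for VBV-limited CRF: lower base RD level, raised where VBV constrains quality.
        cmd += [
            "--vbv-maxrate", str(vbv_maxrate), "--vbv-bufsize", str(vbv_bufsize),
            "--rd", "4", "--dynamic-rd", "3",
        ]
    return cmd


print(" ".join(build_command("source.y4m", "out.hevc")))
print(" ".join(build_command("source.y4m", "out.hevc", vbv_maxrate=8000, vbv_bufsize=16000)))
```

--pmode and --pme are deliberately left out, in line with the power-efficiency concern raised in this thread.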


8th January 2020, 10:00   #33
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 322
Quote:
Originally Posted by benwaggoner
With an 8-bit source, there's no real reason to encode in 10-bit in x265. Encoding 8-bit will be significantly faster and the output will be more compatible with older mobile devices. Encoding 8-bit in 10-bit provided a measurable benefit with x264, but HEVC is designed to fix the issues in 8-bit that made that valuable with H.264.
Do you have a link to any research done on this? I get noticeably less banding when using main10 on 8-bit content with x265. And those legacy devices aren't really an argument, because if you need it to be compatible with those devices, then you simply use 8-bit... And if not (i.e. if they're not the target devices for playback), why even think about them? 10-bit decoders are in the majority.

Last edited by excellentswordfight; 8th January 2020 at 10:23.

8th January 2020, 17:05   #34
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by excellentswordfight
Do you have a link to any research done on this? I get noticeably less banding when using main10 on 8-bit content with x265. And those legacy devices aren't really an argument, because if you need it to be compatible with those devices, then you simply use 8-bit... And if not (i.e. if they're not the target devices for playback), why even think about them? 10-bit decoders are in the majority.
How are you displaying your 10-bit video? Since true 10-bit panels are pretty rare, encoding in 10-bit could well trigger some dithering mode that 8-bit doesn't go through. That would be a difference in postprocessing instead of encoding, and would be quite dependent on the playback system. Other times 10-bit can look worse due to LSB truncation or other bad conversion.
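
The difference between truncation, rounding and dithering when a 10-bit sample is reduced to 8 bits can be shown with a few lines of integer arithmetic. This is a generic illustration with invented function names, not a description of what any particular player or renderer actually does.

```python
# Sketch: three ways to reduce a 10-bit sample (0..1023) to 8 bits (0..255).
import random


def truncate(v10):
    """Drop the two least significant bits (LSB truncation)."""
    return v10 >> 2


def round_nearest(v10):
    """Round to the nearest 8-bit value, clipped to 255."""
    return min((v10 + 2) >> 2, 255)


def dither(v10, rng):
    """Add random noise smaller than one 8-bit step before truncating, clipped to 255."""
    return min((v10 + rng.randrange(4)) >> 2, 255)


rng = random.Random(0)
ramp = range(512, 520)  # a shallow 10-bit gradient
print([truncate(v) for v in ramp])
print([round_nearest(v) for v in ramp])
print([dither(v, rng) for v in ramp])
```

Truncation always rounds down, so it shifts the picture slightly darker and turns a smooth ramp into hard steps; the random dither trades those steps for fine noise whose average matches the 10-bit value.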

If you're only playing back on devices that do better with 10-bit and don't care about the speed difference, of course using 10-bit for 8-bit can make perfect sense. My focus is on how to make content look good across thousands of different devices (thanks, Android), including future ones.


8th January 2020, 21:47   #35
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Quote:
Originally Posted by benwaggoner
...
--pmode is definitely power-inefficient, since it is doing things in parallel that will get thrown away. And often it'll increase utilization AND reduce fps if there aren't enough free cores available. Same with --pme.
Thanks. Is --frame-threads 1 more power-efficient compared to --frame-threads 2?

9th January 2020, 16:40   #36
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 322
Quote:
Originally Posted by benwaggoner
How are you displaying your 10-bit video? Since true 10-bit panels are pretty rare, encoding in 10-bit could well trigger some dithering mode that 8-bit doesn't go through. That would be a difference in postprocessing instead of encoding, and would be quite dependent on the playback system. Other times 10-bit can look worse due to LSB truncation or other bad conversion.
I'm aware that 10-bit panels are rare, and that all file-based SDR media I watch is converted to 8-bit RGB either way. Still, in the encoding I've done with x265, using main10 has given better results.

So here is a grab from ToS, one from an 8-bit encode and one from a 10-bit encode at 3 Mbps, decoded with AviSynth (FFMS2 source, converted to 8-bit RGB with ConvertToRGB24):
8bit
https://ibb.co/7Qbxxx5
10bit
https://ibb.co/ChLFrTB
source
https://ibb.co/zxC48x6

Quote:
Originally Posted by benwaggoner
If you're only playing back on devices that do better with 10-bit and don't care about the speed difference, of course using 10-bit for 8-bit can make perfect sense. My focus is on how to make content look good across thousands of different devices (thanks, Android), including future ones.
And of course it makes sense to have greater compatibility when encoding for thousands of different devices... But I assume from the TS's request that he isn't re-encoding for Amazon.

Last edited by excellentswordfight; 9th January 2020 at 16:43.

9th January 2020, 19:31   #37
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by Forteen88
Thanks. Is --frame-threads 1 more power-efficient compared to --frame-threads 2?
I'd think so. It's a somewhat complex and empirical question, because a computer uses X power just doing nothing. More threading is generally more overhead and thus MIPS/watt goes down. But that'd be delta watts from the minimum power state. So the best fps/watt is going to depend on specifics of settings, content, and system.

The answer will also be different between running a single encode and multiple encodes at the same time. Reducing threading with multiple instances is more beneficial.
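
The total-versus-delta distinction can be made concrete with a small calculation. Every wattage and fps figure below is invented for illustration; real values would have to be measured on the system in question.

```python
# Sketch: fps per watt, counting either total power or only power above idle.
# All wattage and fps figures below are invented for illustration.

def fps_per_watt(fps, watts, idle_watts=0.0):
    """Encoding speed per watt; pass idle_watts to count only the delta above idle."""
    return fps / (watts - idle_watts)


IDLE = 50.0
runs = {
    "--frame-threads 1": {"fps": 1.5, "watts": 110.0},
    "--frame-threads 2": {"fps": 2.0, "watts": 140.0},
}

for name, r in runs.items():
    total = fps_per_watt(r["fps"], r["watts"])
    delta = fps_per_watt(r["fps"], r["watts"], IDLE)
    print(f"{name}: {total:.4f} fps/W total, {delta:.4f} fps/W above idle")
```

With these made-up numbers the ranking flips depending on the baseline: --frame-threads 2 wins on total power, --frame-threads 1 wins on power above idle, which is why the answer depends on whether the machine would be running anyway.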

9th January 2020, 19:37   #38
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by excellentswordfight
I'm aware that 10-bit panels are rare, and that all file-based SDR media I watch is converted to 8-bit RGB either way. Still, in the encoding I've done with x265, using main10 has given better results.

So here is a grab from ToS, one from an 8bit encode and one 10bit at 3Mbps, decoded with avisynth (FFMS2 source and converted to 8bit RGB with ConvertToRGB24)
8bit
https://ibb.co/7Qbxxx5
10bit
https://ibb.co/ChLFrTB
source
https://ibb.co/zxC48x6
Yeah, I do see a little more detail here in the 10-bit. Of course, ToS is available in 16-bit RGB; which version did you use? Even an 8-bit RGB source has more information than 8-bit limited range YUV, as it is 4:4:4 0-255 going to 4:2:0 16-235. I was really talking about encoding from 8-bit limited range YUV being fine with 8-bit encoding.
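
The "more information" point can be put into rough numbers by counting code values and samples per pixel. This is a back-of-the-envelope sketch; the 16-240 chroma range is the standard limited-range figure and is not stated in the post.

```python
# Sketch: rough per-pixel comparison of 8-bit full-range RGB 4:4:4
# and 8-bit limited-range YUV 4:2:0.

def levels(lo, hi):
    """Number of code values in an inclusive range."""
    return hi - lo + 1


rgb_levels = levels(0, 255)       # 256 values per R, G and B sample
luma_levels = levels(16, 235)     # 220 values for limited-range luma
chroma_levels = levels(16, 240)   # 225 values for limited-range chroma (standard range, assumed here)

rgb_samples_per_pixel = 3.0               # 4:4:4: R, G and B for every pixel
yuv420_samples_per_pixel = 1 + 2 * 0.25   # 4:2:0: full luma, each chroma plane at quarter resolution

print(rgb_levels, luma_levels, chroma_levels)
print(rgb_samples_per_pixel, yuv420_samples_per_pixel)
```

So the RGB source carries twice as many samples per pixel and somewhat more code values per sample, which is extra precision that a 10-bit encode could preserve and an 8-bit limited range encode could not.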

Is the difference visible in motion?

10th January 2020, 09:56   #39
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 322
Quote:
Originally Posted by benwaggoner
Yeah, I do see a little more detail here in the 10-bit. Of course, ToS is available in 16-bit RGB; which version did you use? Even an 8-bit RGB source has more information than 8-bit limited range YUV, as it is 4:4:4 0-255 going to 4:2:0 16-235. I was really talking about encoding from 8-bit limited range YUV being fine with 8-bit encoding.

Is the difference visible in motion?
The source is 8-bit limited range: a 1080p Blu-ray compatible version that I encoded from the 4K y4m version.

13th January 2020, 21:30   #40
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by excellentswordfight
The source is 8-bit limited range: a 1080p Blu-ray compatible version that I encoded from the 4K y4m version.
Interesting and intriguing! That's a well designed test you did. I'll try to run some of my own with other content and see if I can replicate.