Old 14th January 2020, 18:36   #7341  |  Link
Barough
Registered User
 
 
Join Date: Feb 2007
Location: Sweden
Posts: 483
x265 v3.2+34-8e6db24c1517 (32 & 64-bit 8/10/12bit Multilib Windows Binaries) (GCC 9.2.0)
Code:
https://bitbucket.org/multicoreware/x265/commits/branch/default
Old 14th January 2020, 22:21   #7342  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by utack View Post
I am surprised how much difference "slower" makes compared to "slow"
It is probably expected due to all the extra bframes giving a quality gain and much more lookahead making different choices what to prioritize, but was not fully aware of it before testing.
I've kinda thought of slower as the first "real" x265 preset, where most of the stuff that makes HEVC better is in play. The quality gap between slower and placebo is smaller than that between slow and slower.

Except for lossless, where placebo has a pretty significant efficiency gain over even veryslow.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 15th January 2020, 06:22   #7343  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
I'd say "Slow" is a good compromise, but yeah, "Slower" is really where all the magic happens
Old 15th January 2020, 14:35   #7344  |  Link
Magik Mark
Registered User
 
Join Date: Dec 2014
Posts: 666
I have a 16-thread CPU. Is there a way to instruct x265 to use only 14?

Need the rest for other apps
__________________
Asus ProArt Z790 - 13th Gen Intel i9 - RTX 3080 - DDR5 64GB Predator - LG OLED C9 - Yamaha A3030 - Windows 11 x64 - PotPlayerr - Lav - MadVR
Old 15th January 2020, 15:36   #7345  |  Link
Atak_Snajpera
RipBot264 author
 
 
Join Date: May 2006
Location: Poland
Posts: 7,810
Quote:
Originally Posted by Magik Mark View Post
I have a 16-thread CPU. Is there a way to instruct x265 to use only 14?

Need the rest for other apps
I would just run x265.exe in IDLE priority. You may also use Process Explorer to change affinity to only 14 logical processors.
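
For anyone who would rather script this than click through Process Explorer, here is a rough Python sketch of the same idea. It assumes Windows cmd.exe, whose built-in start command takes /low (IDLE priority class) and /affinity with a hex mask. The helper names, file names, and encoder arguments are placeholders of mine, not anything from x265 or RipBot264.

```python
def affinity_mask(n_cpus: int) -> int:
    """Bitmask with the lowest n_cpus bits set (one bit per logical processor)."""
    if n_cpus < 1:
        raise ValueError("need at least one logical processor")
    return (1 << n_cpus) - 1


def start_command(n_cpus: int, encoder_args: str) -> str:
    """Build a cmd.exe line that launches x265 at low priority, pinned to n_cpus.

    The empty "" is the window title that `start` expects before its switches.
    """
    mask_hex = format(affinity_mask(n_cpus), "X")
    return f'start "" /low /affinity {mask_hex} x265.exe {encoder_args}'


# Placeholder encoder arguments; substitute your own settings and files.
print(start_command(14, "--preset slow --crf 18 --input in.y4m -o out.hevc"))
```

For 14 logical processors the mask is 3FFF (the 14 lowest bits set), which leaves the top two of a 16-thread CPU free for other apps.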

Last edited by Atak_Snajpera; 15th January 2020 at 15:38.
Old 15th January 2020, 18:03   #7346  |  Link
vpupkind
Registered User
 
Join Date: Jul 2007
Posts: 62
Quote:
Originally Posted by benwaggoner View Post
It would be counterproductive in fact. IIRC, hdr10-opt adjusts delta qp for chroma based on luma levels to better match the characteristics of the PQ EOTF. By default x265 is more optimized for gamma. hdr10-opt would thus be bad for 709 or HLG. Dolby Vision, except in profiles that have a backwards compatible PQ base layer, doesn't even use Y'CbCr, and in Profile 5 does some crazy dynamic range adjustments, so not helpful there.

There's probably fruitful research to be done for how to adapt chroma qp relative to luma based on each macroblock's luma levels in a psychovisual model.
ISO/IEC 23008-14 does that. I haven't had too much luck with their recommendations -- they destroyed textures in very bright scenes (think icy mountains).
Old 15th January 2020, 23:55   #7347  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by Magik Mark View Post
I have a 16-thread CPU. Is there a way to instruct x265 to use only 14?
--pools "14" should exactly limit it to 14 cores.

https://x265.readthedocs.io/en/defau...l#thread-pools
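
If you want to derive the number instead of hard-coding it, a small sketch like this works. It assumes os.cpu_count() reports your logical processors and that a single number passed to --pools sets the size of one pool, as I read the thread-pools docs. pools_args and its reserve parameter are made-up names for illustration.

```python
import os
from typing import List, Optional


def pools_args(reserve: int = 2, total: Optional[int] = None) -> List[str]:
    """Return x265 arguments that leave `reserve` logical processors unused.

    `total` defaults to what the OS reports; at least one worker is always kept.
    """
    if total is None:
        total = os.cpu_count() or 1
    workers = max(1, total - reserve)
    return ["--pools", str(workers)]


# On a 16-thread CPU, reserving 2 gives the "--pools 14" from above.
print(pools_args(reserve=2, total=16))
```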
Old 16th January 2020, 07:15   #7348  |  Link
Rousseau
Registered User
 
Join Date: Jun 2017
Posts: 8
Quote:
Originally Posted by Barough View Post
x265 v3.2+34-8e6db24c1517 (32 & 64-bit 8/10/12bit Multilib Windows Binaries) (GCC 9.2.0)
Code:
https://bitbucket.org/multicoreware/x265/commits/branch/default

--multi-pass-opt-analysis / --multi-pass-opt-distortion causes a crash in this build.
Old 16th January 2020, 11:21   #7349  |  Link
Magik Mark
Registered User
 
Join Date: Dec 2014
Posts: 666
Quote:
Originally Posted by benwaggoner View Post
--pools "14" should exactly limit it to 14 cores.



https://x265.readthedocs.io/en/defau...l#thread-pools

Thanks a lot!
Old 16th January 2020, 23:54   #7350  |  Link
jlpsvk
Registered User
 
Join Date: Dec 2014
Posts: 240
Is there any way to fully utilise x265 with a single process (without distributed encoding) on a Ryzen 9 3950X (16c/32t) without quality loss?
__________________
AMD Ryzen 9 5950X, 32GB DDR4-3200 CL16, RTX 3060, 2TB NVMe PCIE4.0, NAS with 8x16TB HDD

Last edited by jlpsvk; 17th January 2020 at 00:06.
Old 17th January 2020, 00:59   #7351  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by jlpsvk View Post
Is there any way to fully utilise x265 with a single process (without distributed encoding) on a Ryzen 9 3950X (16c/32t) without quality loss?
What resolution are you encoding to?

--preset slower --pmode --pme should saturate most systems. Although it's often faster to leave out --pmode and --pme if the cores/pixels ratio isn't that high. The goal is to maximize fps, not CPU load. PMode is typically a lot more useful than PME, which I've only seen to be helpful at SD resolutions and below.

PMode and PME don't have any negative quality impact, and pmode can actually help a trifle, potentially.

On an 18-core/36-thread system, 2160p encoding is ~30% faster without --pmode than with it, even though CPU load is about 50% versus ~90%.
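
To put rough numbers on why CPU load is the wrong target, here is a back-of-the-envelope calculation using only the figures above (30% faster, about 50% versus about 90% load). Treating fps divided by CPU load as a crude efficiency measure is my own simplification, not anything x265 reports.

```python
def fps_per_load(relative_fps: float, cpu_load: float) -> float:
    """Crude efficiency measure: relative throughput per unit of CPU load."""
    if cpu_load <= 0:
        raise ValueError("cpu_load must be positive")
    return relative_fps / cpu_load


# Speed is normalised so the --pmode run is 1.0; loads are fractions of the CPU.
with_pmode = fps_per_load(1.00, 0.90)
without_pmode = fps_per_load(1.30, 0.50)

print(f"with --pmode:    {with_pmode:.2f}")
print(f"without --pmode: {without_pmode:.2f}")
print(f"ratio:           {without_pmode / with_pmode:.2f}x")
```

By this measure the run without --pmode gets roughly 2.3 times as much encoding done per unit of CPU load, so the extra load with --pmode is mostly overhead at this resolution and core count.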
Old 17th January 2020, 05:54   #7352  |  Link
Boulder
Pig on the wing
 
 
Join Date: Mar 2002
Location: Finland
Posts: 5,729
--pmode can have a negative impact in CRF mode, so be warned. I just experienced that: it produced a lower bitrate than without --pmode at the same settings at CRF 18, and I happened to check a scene with a heavily noisy picture. The dancing of the noise was much uglier than in a test encode without --pmode.
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
Old 17th January 2020, 12:21   #7353  |  Link
Atak_Snajpera
RipBot264 author
 
 
Join Date: May 2006
Location: Poland
Posts: 7,810
Quote:
Originally Posted by jlpsvk View Post
Is there any way to fully utilise x265 with a single process (without distributed encoding) on a Ryzen 9 3950X (16c/32t) without quality loss?
--ctu 32
Old 17th January 2020, 19:48   #7354  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by Atak_Snajpera View Post
--ctu 32
Right! There are other ways to increase parallelism, at the risk of some hits to compression efficiency. However, with a whole lot of cores, quality @ perf might have a different optimal configuration, like using more frame threads and a higher RD level. Other parameters in the same bucket:

-F >4 (this had quality hits in older versions; not sure of the impact in 3.x)
--lookahead-slices >8 (It seems like this shouldn't have a quality hit, but it is 1 in the slowest presets)

Another thing that might help (haven't tested) without impacting quality is --selective-sao 2. SAO itself limits parallelism somewhat per https://x265.readthedocs.io/en/default/threading.html. So turning it off for B-frames could help. Although it's also possible that the latency hit is fixed whenever SAO is on at all, so --no-sao might help, although with efficiency hits, particularly at lower bitrates. Worth testing.
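
If you want to test these one at a time, a sketch like the following builds one command line per variant so each change can be timed and compared against a baseline. The specific values (-F 6, --lookahead-slices 12), the base settings, and the file names are placeholder assumptions, not recommendations. Only the option names come from this thread and the x265 docs.

```python
import shlex

# Placeholder base settings and source file; adjust to your own encode.
COMMON = ["--preset", "slower", "--crf", "18"]
SOURCE = "in.y4m"

# One entry per knob mentioned above. Values are arbitrary picks for testing.
VARIANTS = {
    "baseline": [],
    "ctu32": ["--ctu", "32"],
    "frame-threads": ["-F", "6"],
    "lookahead-slices": ["--lookahead-slices", "12"],
    "selective-sao": ["--selective-sao", "2"],
    "no-sao": ["--no-sao"],
}


def build_commands(variants=VARIANTS):
    """Return {variant name: argv list}, each writing to its own output file."""
    return {
        name: ["x265", *COMMON, *extra, "--input", SOURCE, "-o", f"{name}.hevc"]
        for name, extra in variants.items()
    }


# Print the command lines; run and time them yourself on a short test clip.
for name, argv in build_commands().items():
    print(shlex.join(argv))
```

Changing one option per run keeps the comparison honest: any fps or bitrate difference can be pinned on that option alone.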
Old 17th January 2020, 19:49   #7355  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by Boulder View Post
--pmode can have a negative impact in CRF mode, so be warned. I just experienced that: it produced a lower bitrate than without --pmode at the same settings at CRF 18, and I happened to check a scene with a heavily noisy picture. The dancing of the noise was much uglier than in a test encode without --pmode.
Oh, good tip! That isn't supposed to happen in theory. You should provide a bug report and repro to MultiCoreWare so they can look at this deeper.
Old 17th January 2020, 20:51   #7356  |  Link
jlpsvk
Registered User
 
Join Date: Dec 2014
Posts: 240
Quote:
Originally Posted by benwaggoner View Post
What resolution are you encoding to?

--preset slower --pmode --pme should saturate most systems. Although it's often faster to leave out --pmode and --pme if the cores/pixels ratio isn't that high. The goal is to maximize fps, not CPU load. PMode is typically a lot more useful than PME, which I've only seen to be helpful at SD resolutions and below.

PMode and PME don't have any negative quality impact, and pmode can actually help a trifle, potentially.

On an 18-core/36-thread system, 2160p encoding is ~30% faster without --pmode than with it, even though CPU load is about 50% versus ~90%.
I am encoding 2160p, with CRF16.
Old 17th January 2020, 21:07   #7357  |  Link
Boulder
Pig on the wing
 
 
Join Date: Mar 2002
Location: Finland
Posts: 5,729
Quote:
Originally Posted by benwaggoner View Post
--lookahead-slices >8 (It seems like this shouldn't have a quality hit, but it is 1 in the slowest presets)
This probably affects frame type decision, at least that's what I observed when using four lookahead slices compared to just one.


Quote:
Quote:
--pmode can have a negative impact in CRF mode, so be warned. I just experienced that: it produced a lower bitrate than without --pmode at the same settings at CRF 18, and I happened to check a scene with a heavily noisy picture. The dancing of the noise was much uglier than in a test encode without --pmode.
Oh, good tip! That isn't supposed to happen in theory. You should provide a bug report and repro to MultiCoreWare so they can look at this deeper.
It's probably due to the encoder doing deeper analysis as some performance related early skips are avoided. Nevertheless, I got a clip of the scene for my testing toolkit and I can definitely create a ticket for the issue. Maybe there is something there to fix.
Old 18th January 2020, 09:07   #7358  |  Link
jlpsvk
Registered User
 
Join Date: Dec 2014
Posts: 240
I tested, and it's faster without --pme and --pmode.
Old 18th January 2020, 09:35   #7359  |  Link
microchip8
ffx264/ffhevc author
 
 
Join Date: May 2007
Location: /dev/video0
Posts: 1,844
Quote:
Originally Posted by benwaggoner View Post
Right! There are other ways to increase parallelism, at the risk of some hits to compression efficiency. However, with a whole lot of cores, quality @ perf might have a different optimal configuration, like using more frame threads and a higher RD level. Other parameters in the same bucket:

-F >4 (this had quality hits in older versions; not sure of the impact in 3.x)
--lookahead-slices >8 (It seems like this shouldn't have a quality hit, but it is 1 in the slowest presets)

Another thing that might help (haven't tested) without impacting quality is --selective-sao 2. SAO itself limits parallelism somewhat per https://x265.readthedocs.io/en/default/threading.html. So turning it off for B-frames could help. Although it's also possible that the latency hit is fixed whenever SAO is on at all, so --no-sao might help, although with efficiency hits, particularly at lower bitrates. Worth testing.
I have tested a few samples with SAO and selective-sao set to 2. I find the result looking too soft. Without it, there's more detail present. But if others like it, who am I to stop them?
__________________
ffx264 || ffhevc || ffxvid || microenc
Old 19th January 2020, 19:56   #7360  |  Link
jlpsvk
Registered User
 
Join Date: Dec 2014
Posts: 240
Quote:
Originally Posted by froggy1 View Post
I have tested a few samples with SAO and selective-sao set to 2. I find the result looking too soft. Without it, there's more detail present. But if others like it, who am I to stop them?
Exactly. This is my current 2160p HDR10 preset. Any suggestions?

Code:
--preset slow --profile main10 --level-idc 5.1 --output-depth 10 --crf 16 --ctu 64 --aq-mode 4 --merange 57 --amp --no-rskip --qg-size 8 --vbv-bufsize 160000 --vbv-maxrate 160000 --bframes 8
--rc-lookahead 48 --gop-lookahead 30 --hdr10 --hdr10-opt --repeat-headers --no-info --no-deblock --no-sao --selective-sao 0 --allow-non-conformance --no-strong-intra-smoothing --high-tier --chromaloc 2
--fades --hme --hme-search umh,umh,star