Old 28th January 2020, 20:15   #21  |  Link
blublub
Registered User
 
Join Date: Jan 2015
Posts: 118
Quote:
Originally Posted by MeteorRain View Post
For a high-core-count CPU I'd reduce frame-threads to 1 and pools to 12, and do chunked encoding (half and half) or parallel encoding (2 at a time).

I believe fewer threads means less waste on threading for x265, so I always use fewer threads and run more processes.

Besides, I already have the software infrastructure to do chunked encoding and mux the chunks later. Encoded GOPs can be joined afterwards, so chunked encoding is basically zero cost to me.
Higher frame-threads values will degrade quality; however, the x265 default is 6 for HCC (high-core-count) CPUs, so anything below that won't hurt quality significantly.

From my understanding the "pools" parameter has no impact on quality. By default it is set to the maximum number of available logical (hyperthreaded) cores.
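
A minimal sketch of the "half and half" chunked setup discussed above, assuming the source is a raw file x265 can seek in and that the chunks are joined by a separate tool afterwards. The function name, file names and frame count are hypothetical; --seek, --frames, --frame-threads and --pools are regular x265 CLI options. It only builds the command lines and does not run anything.

```python
def chunk_commands(source, total_frames, chunks=2, pools=12, frame_threads=1):
    """Build one x265 argument list per chunk (nothing is executed here)."""
    base, extra = divmod(total_frames, chunks)
    commands, start = [], 0
    for i in range(chunks):
        # Spread any remainder over the first chunks so every frame is covered.
        count = base + (1 if i < extra else 0)
        commands.append([
            "x265", "--input", source,
            "--seek", str(start), "--frames", str(count),
            "--frame-threads", str(frame_threads),
            "--pools", str(pools),
            "--output", f"chunk{i:02d}.hevc",  # hypothetical output name
        ])
        start += count
    return commands


# Hypothetical 1001-frame source split into two halves.
cmds = chunk_commands("input.y4m", total_frames=1001)
for cmd in cmds:
    print(" ".join(cmd))
```

Each list could then be handed to its own process, which is the "fewer threads, more processes" idea from the quoted post.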
Old 29th January 2020, 17:06   #22  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,738
Quote:
Originally Posted by blublub View Post
Higher frame-threads values will degrade quality; however, the x265 default is 6 for HCC (high-core-count) CPUs, so anything below that won't hurt quality significantly.

From my understanding the "pools" parameter has no impact on quality. By default it is set to the maximum number of available logical (hyperthreaded) cores.
Well, setting --pools to limit the encode to a single NUMA socket will reduce the threads available, and so may reduce frame-threads. And it generally doesn't slow things down that much; in my testing, the overhead of using more cores seems to cut pretty significantly into the speed benefit of using them. Even at 8K, using the second socket on a 2x18-core/36-thread Xeon system is maybe 20% faster than using just one, so I generally just set up a different encode for each socket (à la --pools "+,-" and "-,+").
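
As an illustration of the per-socket setup, here is a small sketch that generates one --pools string per NUMA node, using "+" for "all threads on this node" and "-" for "none". The helper name is hypothetical, and it assumes one encode per node.

```python
def pools_for_node(node, node_count):
    """Return a --pools value that enables only the given NUMA node."""
    return ",".join("+" if i == node else "-" for i in range(node_count))


# One encode per socket on a two-socket system.
for node in range(2):
    print(f'--pools "{pools_for_node(node, 2)}"')
```

On a two-socket system this yields the two values mentioned in the post, one for each encode.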
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 29th January 2020, 17:16   #23  |  Link
blublub
Registered User
 
Join Date: Jan 2015
Posts: 118
Hi
I only have 1 socket in my TR 3960X.
I currently don't know whether x265 detects it as 1 or 2 NUMA nodes.
However, increasing pools from the default/auto value of 48 to 64 does help with CPU utilization and speed. It's not pegged at 100%, but it sits at 85 to 95%, which is better than the 75% average with the auto setting.

In the end it all comes down to:
Scaling isn't optimal above 12-core CPUs, and
if one encodes just one job, a 16c/32t CPU such as the 3950X is the best bang for the buck, as the higher cost of a 24c CPU isn't worth it at the moment.

A 24c CPU makes sense if one encodes more than one job in parallel or uses distributed encoding.
Old 29th January 2020, 18:01   #24  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,803
Quote:
I currently don't know if x265 detects it as 1 or 2 Numa nodes.
Most likely your 3960X is seen as a single NUMA node (run EncodingServer.exe and look for the line [SYSTEM] NUMA Nodes = 1). Only the 3990X will be divided into two virtual NUMA nodes by the OS.
Old 29th January 2020, 19:25   #25  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 894
Quote:
Originally Posted by blublub View Post
Higher frame-threads values will degrade quality; however, the x265 default is 6 for HCC (high-core-count) CPUs, so anything below that won't hurt quality significantly.

From my understanding the "pools" parameter has no impact on quality. By default it is set to the maximum number of available logical (hyperthreaded) cores.
Just so you know, I wasn't talking about loss of quality but about wasted computing resources. A single-threaded application works more efficiently than a multi-threaded one in terms of work per computing resource (i.e. 2x single-threaded apps work more efficiently than 1x 2-thread app). So I like to balance the thread count and the process count so that I introduce neither too much trouble nor too much waste.

In my personal use case I found it a good balance to run 4-6 threads for x265 and another 2-4 threads for AviSynth.
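
To make the balancing concrete, a rough sketch of how many processes fit on a machine under this scheme. It assumes, purely for illustration, that each job simply occupies its x265 threads plus its AviSynth threads; the function name and the 48-thread example (the 3960X mentioned earlier in the thread) are not from the post.

```python
def parallel_jobs(logical_threads, x265_threads, avisynth_threads):
    """Number of encode processes that fit if each one uses a fixed thread budget."""
    per_job = x265_threads + avisynth_threads
    return max(1, logical_threads // per_job)


# Heaviest and lightest per-job budgets from the 4-6 / 2-4 range above.
heavy = parallel_jobs(48, x265_threads=6, avisynth_threads=4)
light = parallel_jobs(48, x265_threads=4, avisynth_threads=2)
print(heavy, light)
```

Real scheduling is less tidy than this additive budget, so the result is only a starting point for testing.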
__________________
Projects
x265 - Yuuki-Asuna-mod Download / GitHub
TS - ADTS AAC Splitter | LATM AAC Splitter | BS4K-ASS
Neo AviSynth+ filters - F3KDB | FFT3D | DFTTest | MiniDeen | Temporal Median
Old 29th January 2020, 19:33   #26  |  Link
MeteorRain
結城有紀
 
Join Date: Dec 2003
Location: NJ; OR; Shanghai
Posts: 894
Also, don't forget that half of the CPU threads are SMT. So if you wisely use half of the CPU threads (i.e. 16 threads, one on each of 16 physical cores), you'll see 50% CPU utilization, but underneath you are already using 80% of your CPU capacity. Pushing it to 100% CPU utilization will get you at most 25% more speed than at 50% utilization, before any extra threading loss.
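
The 25% figure follows directly from the 80% estimate: going from 80% of capacity to 100% is a factor of 1/0.8 = 1.25. A tiny sketch of that arithmetic, with a hypothetical helper name and the 80% value taken as the post's own rule of thumb rather than a measured number:

```python
def smt_headroom(capacity_at_half_threads):
    """Maximum extra speed left once one thread per physical core is busy."""
    return 1.0 / capacity_at_half_threads - 1.0


# 0.80 is the rule-of-thumb capacity at 50% reported utilization.
gain = smt_headroom(0.80)
print(f"at most {gain:.0%} more speed")
```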
Old 2nd February 2020, 17:02   #27  |  Link
sonyzz
Registered User
 
Join Date: Apr 2019
Posts: 4
My current settings look more like this, for 1080p video on a 4c/8t i7 CPU:
wpp / ctu=32 / min-cu-size=8 / max-tu-size=32 / tu-intra-depth=2 / tu-inter-depth=2 / me=3 / subme=5 / merange=57 / rect / no-amp / max-merge=3 / temporal-mvp / no-early-skip / rskip / rdpenalty=0 / no-tskip / no-tskip-fast / strong-intra-smoothing / no-lossless / no-cu-lossless / no-constrained-intra / no-fast-intra / open-gop / no-temporal-layers / interlace=0 / keyint=250 / min-keyint=23 / scenecut=40 / rc-lookahead=30 / lookahead-slices=4 / bframes=8 / bframe-bias=0 / b-adapt=2 / ref=4 / limit-refs=2 / limit-modes / weightp / weightb / aq-mode=3 / qg-size=32 / aq-strength=0.80 / cbqpoffs=0 / crqpoffs=0 / rd=4 / psy-rd=0.70 / rdoq-level=2 / psy-rdoq=1.00 / log2-max-poc-lsb=8 / limit-tu=0 / no-rd-refine / signhide / deblock=1:1 / no-sao / no-sao-non-deblock / b-pyramid / cutree / no-intra-refresh / rc=crf / crf=22.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ipratio=1.40 / pbratio=1.30

Last edited by sonyzz; 2nd February 2020 at 17:05.
Tags
3900x, 3950x, hevc, ryzen, x265
