Doom9's Forum > Video Encoding > High Efficiency Video Coding (HEVC)

Old 19th February 2017, 22:52   #4781  |  Link
pingfr
Registered User
 
Join Date: May 2015
Posts: 185
Quote:
Originally Posted by aymanalz View Post
Yes, but with his i7-6700, I think he should definitely be getting faster encodes.

@pingfr : Are you sure there is nothing bottlenecking your system? Have you checked out your CPU usage during encoding? Is it always above 90%?

Reducing the maximum CU size to 32 can give a big speed boost. For resolutions less than 1080p, the difference in quality may not be significant. Same with merange.

Edit: I noticed that you are already using CTU 32. I'm pretty sure something is off, if you are only getting the speeds you say. Is hyperthreading turned on, are you using the appropriate values for threads/pools etc?
Everything seems normal to me: http://imgur.com/a/fYSy4

Also note that this is a regular non-K, non-unlocked i7-6700.

According to ARK, its base frequency is 3.4GHz with an alleged Turbo Boost to 4.00GHz, but for some reason it throttles to 3.7GHz under 100% load. It can, however, reach 4.0GHz on a single core.

https://ark.intel.com/products/88196...up-to-4_00-GHz

Last edited by pingfr; 19th February 2017 at 22:55.
Old 20th February 2017, 02:39   #4782  |  Link
Dclose
Registered User
 
Join Date: Aug 2014
Posts: 50
Quote:
Originally Posted by Leo 69 View Post
I never saw that switching on rectangular and asymmetric partitions would improve visual quality. I can't tell the difference whether they're on or off on any encodes.
I have. I never turn rectangular off after doing some tests with it. Sometimes I turn asymmetric off to increase speed, since asymmetric has a worse quality/speed ratio than rectangular, but when I'm going for better quality I turn it on, since it does help tighten up the picture a little more.

I easily noticed the difference again the other day when doing some tests. I was doing quality comparisons by watching the video play from four feet away on a 40" screen. Some people may think that's too close, while some people around here do their comparisons by zooming in on screenshots and analyzing pixels.

The biggest speed boost is probably Early Skip. The quality difference is generally quite obvious to me, but the speed increase is even more obvious. I'd say the higher the resolution (and bitrate), the less Early Skip hurts quality.

The more bitrate used, the less the quality settings matter. I usually drop the bitrate below what most people would think is acceptable, so differences between quality settings tend to show up more. That is why I rarely use speed boosts like Early Skip, and why things like Max Merge can make a very noticeable difference in video quality.

Last edited by Dclose; 2nd March 2017 at 10:23.
Old 20th February 2017, 04:45   #4783  |  Link
pradeeprama
Registered User
 
Join Date: Sep 2015
Posts: 48
Quote:
Originally Posted by pingfr View Post
Could anyone here tell me which of the parameters included in the --preset veryslow compared to the --preset slow is the "culprit" dropping encoding speeds from 4fps to 0.30fps?

The idea is to retain most of the "quality improvements" from the veryslow preset while keeping the "merely acceptable" encoding speeds yielded by the slow preset. There has to be some kind of middle ground between both presets.

Thanks.
I suspect the top culprits behind the fps dropping so significantly with veryslow, compared to the slow preset, are the increased rd-level, increased subme, +1 reference frames, and more TU search (--tu-inter 3 --tu-intra 3 in veryslow instead of the --tu-inter 1 --tu-intra 1 that slow has).

I would recommend increasing --limit-refs, enabling --limit-modes and --limit-tu 4, and checking whether that helps contain the drop in fps. There is another feature called --dynamic-rd that dynamically increases rd-level to 5 when you start with 4, but this works only when VBV parameters are used and they clip the quality due to excessive bits.
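To make that kind of experiment repeatable, here is a minimal Python sketch that builds an x265 argument list from a base preset plus a few overrides. The helper name `build_x265_args` and the chosen values are assumptions for illustration only, not tested recommendations; it only assembles the arguments and does not run the encoder.

```python
from typing import Dict, List, Union

OptionValue = Union[bool, int, float, str]


def build_x265_args(preset: str, overrides: Dict[str, OptionValue]) -> List[str]:
    """Return an x265 argument list: a base preset plus per-option overrides.

    Boolean options become --name / --no-name; anything else is --name value.
    """
    args = ["--preset", preset]
    for name, value in overrides.items():
        if isinstance(value, bool):
            args.append(f"--{name}" if value else f"--no-{name}")
        else:
            args.extend([f"--{name}", str(value)])
    return args


# Hypothetical experiment: keep veryslow, but switch on the search limiters
# mentioned in the post. The values are illustrative, not recommendations.
veryslow_limited = build_x265_args(
    "veryslow",
    {"limit-refs": 3, "limit-modes": True, "limit-tu": 4},
)
print(" ".join(veryslow_limited))
```

Changing one override at a time and comparing fps against the plain slow and veryslow runs should show which option costs the most on a given source.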
Old 20th February 2017, 07:13   #4784  |  Link
aymanalz
Registered User
 
Join Date: May 2015
Posts: 68
@pingfr : Maybe I'm missing it, but I cannot see a value for CPU load in the CPU-Z screenshot you provided. I only see info about the CPU itself. Open Task Manager (Ctrl+Shift+Esc) to see how much of your CPU is being used. If the x265 executable is not using more than 90% all the time, you can be sure that there is a bottleneck somewhere.

Your rd-level is 6; that is definitely a major factor in speed. Reducing it to 4 can improve speed considerably. (Whether the quality loss will be tolerable, I cannot say - you will have to test it out.) If I'm not mistaken, limiting reference depth and CU modes only works for rd-levels below 5. Those options give a significant speed boost with little quality loss, but they are automatically disabled because your rd-level is too high.
Old 20th February 2017, 10:53   #4785  |  Link
Midzuki
Unavailable
 
Join Date: Mar 2009
Location: offline
Posts: 1,480
x265 2.3+8-cfaff341e350

Code:
SAO: avoid negative indexes in 'x265_lambda2_tab' table
http://www.mediafire.com/file/0857vp...faff341e350.7z
Old 20th February 2017, 20:09   #4786  |  Link
ShogoXT
Registered User
 
Join Date: Dec 2011
Posts: 95
I think I am going to be getting one of the new 8-core Ryzen CPUs.

From the rumors I've been hearing, its main weakness will be AVX.

Would it be worthwhile to benchmark and compare its speed on different instruction sets vs Intel CPUs, using x265's --asm option?

Thanks
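One way such a comparison could be set up is sketched below in Python. It only builds the command lines, one per SIMD level, and leaves running and timing them to you. The file names, the frame count, the preset and the selection of levels are all hypothetical; the level names follow the strings x265 documents for --asm.

```python
from typing import List, Optional

# SIMD sets to compare. None means "no --asm override" (x265 auto-detects).
# This selection is an assumption for illustration.
ASM_LEVELS: List[Optional[str]] = [None, "AVX2", "AVX", "SSE4", "SSE2"]


def benchmark_commands(source: str, frames: int = 500) -> List[List[str]]:
    """Build one x265 argument list per SIMD level, without running anything."""
    commands = []
    for level in ASM_LEVELS:
        label = level.lower() if level else "auto"
        cmd = ["x265", "--input", source, "--frames", str(frames),
               "--preset", "medium"]
        if level is not None:
            cmd += ["--asm", level]
        # Hypothetical output name, one file per tested level.
        cmd += ["--output", f"bench_{label}.hevc"]
        commands.append(cmd)
    return commands


for cmd in benchmark_commands("sample.y4m"):
    print(" ".join(cmd))
```

Running the same clip with the same settings on both CPUs and comparing the fps reported for each level would show how much of the gap comes from AVX2 rather than from core count or clock speed.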
Old 21st February 2017, 16:10   #4787  |  Link
Midzuki
Unavailable
 
Join Date: Mar 2009
Location: offline
Posts: 1,480
According to commit 820f4327ddac... ,

Code:
CLI: Remove redundant cli option 'capture-csp'
OK, but now, when is MCW finally going to correct the information shown below?

Code:
--bframes <integer>  Maximum number of consecutive b-frames (now it only enables B GOP structure) Default 4
Old 21st February 2017, 20:54   #4788  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
Quote:
Originally Posted by Midzuki View Post
[...] when is MCW finally going to correct the information shown below?

Code:
--bframes <integer>  Maximum number of consecutive b-frames (now it only enables B GOP structure) Default 4
Would this change be OK?
Code:
diff -r 820f4327ddac source/x265cli.h
--- a/source/x265cli.h	Mon Feb 20 17:18:53 2017 +0530
+++ b/source/x265cli.h	Tue Feb 21 20:48:09 2017 +0100
@@ -391,7 +391,7 @@
     H0("   --rc-lookahead <integer>      Number of frames for frame-type lookahead (determines encoder latency) Default %d\n", param->lookaheadDepth);
     H1("   --lookahead-slices <0..16>    Number of slices to use per lookahead cost estimate. Default %d\n", param->lookaheadSlices);
     H0("   --lookahead-threads <integer> Number of threads to be dedicated to perform lookahead only. Default %d\n", param->lookaheadThreads);
-    H0("   --bframes <integer>           Maximum number of consecutive b-frames (now it only enables B GOP structure) Default %d\n", param->bframes);
+    H0("-b/--bframes <0..16>             Maximum number of consecutive b-frames. Default %d\n", param->bframes);
     H1("   --bframe-bias <integer>       Bias towards B frame decisions. Default %d\n", param->bFrameBias);
     H0("   --b-adapt <0..2>              0 - none, 1 - fast, 2 - full (trellis) adaptive B frame scheduling. Default %d\n", param->bFrameAdaptive);
     H0("   --[no-]b-pyramid              Use B-frames as references. Default %s\n", OPT(param->bBPyramid));
Old 21st February 2017, 23:25   #4789  |  Link
Midzuki
Unavailable
 
Join Date: Mar 2009
Location: offline
Posts: 1,480
Quote:
Originally Posted by Ma View Post
Would this change be OK?
Code:
diff -r 820f4327ddac source/x265cli.h
--- a/source/x265cli.h	Mon Feb 20 17:18:53 2017 +0530
+++ b/source/x265cli.h	Tue Feb 21 20:48:09 2017 +0100
@@ -391,7 +391,7 @@
     H0("   --rc-lookahead <integer>      Number of frames for frame-type lookahead (determines encoder latency) Default %d\n", param->lookaheadDepth);
     H1("   --lookahead-slices <0..16>    Number of slices to use per lookahead cost estimate. Default %d\n", param->lookaheadSlices);
     H0("   --lookahead-threads <integer> Number of threads to be dedicated to perform lookahead only. Default %d\n", param->lookaheadThreads);
-    H0("   --bframes <integer>           Maximum number of consecutive b-frames (now it only enables B GOP structure) Default %d\n", param->bframes);
+    H0("-b/--bframes <0..16>             Maximum number of consecutive b-frames. Default %d\n", param->bframes);
     H1("   --bframe-bias <integer>       Bias towards B frame decisions. Default %d\n", param->bFrameBias);
     H0("   --b-adapt <0..2>              0 - none, 1 - fast, 2 - full (trellis) adaptive B frame scheduling. Default %d\n", param->bFrameAdaptive);
     H0("   --[no-]b-pyramid              Use B-frames as references. Default %s\n", OPT(param->bBPyramid));
Thanks for asking.
Well, that's almost good. THIS is the right way to do it.
Code:
H0("-b/--bframes <0..16>             Maximum number of consecutive B-frames. Default %d\n", param->bframes);

H1("   --bframe-bias <integer>       Bias towards B-frame decisions. Default %d\n", param->bFrameBias);

H0("   --b-adapt <0..2>              0 - none, 1 - fast, 2 - full (trellis) adaptive B-frame scheduling. Default %d\n", param->bFrameAdaptive);

H0("   --[no-]b-pyramid              Use B-frames as references. Default %s\n", OPT(param->bBPyramid));
Because consistency matters.

Last edited by Midzuki; 21st February 2017 at 23:28. Reason: typo
Old 22nd February 2017, 07:01   #4790  |  Link
youli
Registered User
 
Join Date: Mar 2015
Location: Ukraine
Posts: 23
--aq-motion and --dynamic-rd

Options test: --aq-motion and --dynamic-rd.
Bitrate decreased by about 3.5%.
Encoded 178944 frames in 107297.17s (1.67 fps), 18252.68 kb/s, Avg QP:24.23
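As a quick sanity check on that summary line, the reported fps can be recomputed from the frame count and the elapsed time, which also shows the encode took roughly 29.8 hours. A minimal Python sketch follows; the regular expression and the `parse_summary` helper are my own assumptions based on this one line, not an official log format.

```python
import re

# Pattern written against the single summary line quoted in the post.
SUMMARY_RE = re.compile(
    r"Encoded (?P<frames>\d+) frames in (?P<seconds>[\d.]+)s "
    r"\((?P<fps>[\d.]+) fps\), (?P<kbps>[\d.]+) kb/s"
)


def parse_summary(line: str) -> dict:
    """Extract the numbers from an x265 summary line and recompute fps."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an x265 summary line")
    frames = int(match["frames"])
    seconds = float(match["seconds"])
    return {
        "frames": frames,
        "seconds": seconds,
        "reported_fps": float(match["fps"]),
        "kbps": float(match["kbps"]),
        "computed_fps": frames / seconds,  # frames divided by wall time
        "hours": seconds / 3600.0,         # total encode time in hours
    }


summary = parse_summary(
    "Encoded 178944 frames in 107297.17s (1.67 fps), 18252.68 kb/s, Avg QP:24.23"
)
print(f"{summary['computed_fps']:.2f} fps over {summary['hours']:.1f} hours")
```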

MediaInfo:
Writing library : x265 2.2+36-9b975fec584a:[Windows][GCC 6.2.0][64 bit] 10bit
Encoding settings : cpuid=1050111 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x2160 / interlace=0 / total-frames=178944 / level-idc=50 / high-tier=1 / uhd-bd=0 / ref=1 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / bframes=4 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=40 / lookahead-slices=2 / scenecut=40 / no-intra-refresh / ctu=32 / min-cu-size=8 / no-rect / no-amp / max-tu-size=16 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=4.00 / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / no-strong-intra-smoothing / max-merge=2 / limit-refs=0 / no-limit-modes / me=3 / subme=7 / merange=25 / temporal-mvp / weightp / weightb / no-analyze-src-pics / no-deblock / no-sao / no-sao-non-deblock / rd=3 / early-skip / no-rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=3.00 / no-rd-refine / analysis-mode=0 / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=24.0 / qcomp=0.80 / qpstep=1 / stats-write=0 / stats-read=0 / vbv-maxrate=100000 / vbv-bufsize=100000 / vbv-init=0.9 / crf-max=0.0 / crf-min=0.0 / ipratio=1.10 / pbratio=1.10 / aq-mode=3 / aq-strength=0.60 / no-cutree / zone-count=0 / no-strict-cbr / qg-size=16 / no-rc-grain / qpmax=51 / qpmin=0 / sar=16 / overscan=0 / videoformat=5 / range=0 / colorprim=1 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / max-cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / opt-qp-pps / opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion / no-hdr
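A settings string like this is easier to compare between encodes once it is split into key/value pairs. The sketch below is a small Python parser; `parse_encoding_settings` is a hypothetical helper, and it assumes the " / " separator and the "no-" prefix convention seen in the line above. The sample is an excerpt of that line.

```python
def parse_encoding_settings(settings: str) -> dict:
    """Split a MediaInfo 'Encoding settings' string into a dict.

    key=value tokens keep their value as a string; bare flags map to True
    and 'no-' prefixed flags map to False under the unprefixed name.
    """
    parsed = {}
    for token in settings.split(" / "):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            parsed[key] = value
        elif token.startswith("no-"):
            parsed[token[3:]] = False
        else:
            parsed[token] = True
    return parsed


# Excerpt of the settings string quoted in the post.
sample = "ctu=32 / no-rect / no-amp / rd=3 / early-skip / dynamic-rd=4.00 / aq-motion"
settings = parse_encoding_settings(sample)
for name in ("ctu", "rect", "amp", "rd", "early-skip", "dynamic-rd", "aq-motion"):
    print(name, "=", settings[name])
```

Diffing two such dicts makes it obvious which options actually changed between two test encodes.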

Bitrate distribution: [chart not preserved]

Screenshot comparison: [images not preserved]
Source BD3D (left eye) and Blu-ray rip, top-bottom (left eye first), at 4914 seconds, with a bitrate of 76592 kbps (the maximum for this video).
Old 23rd February 2017, 18:47   #4791  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
I was told by an Intel fanboy and RyZen 7 reviewer that you need a Core i7 7700K@4.8GHz to match the performance of a RyZen 7@4.0GHz in the x265 2nd pass, or you could see the same thing reversed.

We'll see...
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 23rd February 2017, 19:09   #4792  |  Link
pingfr
Registered User
 
Join Date: May 2015
Posts: 185
Quote:
Originally Posted by NikosD View Post
I was told by an Intel fanboy and RyZen 7 reviewer that you need a Core i7 7700K@4.8GHz to match the performance of a RyZen 7@4.0GHz in the x265 2nd pass, or you could see the same thing reversed.

We'll see...
Would love to see some substantial evidence.
Old 23rd February 2017, 19:57   #4793  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,810
Quote:
Originally Posted by NikosD View Post
I was told by an Intel fanboy and RyZen 7 reviewer that you need a Core i7 7700K@4.8GHz to match the performance of a RyZen 7@4.0GHz in the x265 2nd pass, or you could see the same thing reversed.

We'll see...
Sounds reasonable. AVX2 is about 1.6x faster than AVX. Thanks to 2xAVX256+FMA, Intel can still compete against an 8c/16t CPU.

Old 23rd February 2017, 20:00   #4794  |  Link
pingfr
Registered User
 
Join Date: May 2015
Posts: 185
@Atak_Snajpera: Did you get the screenshots I sent you by PM earlier this week?
Old 23rd February 2017, 20:12   #4795  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
Quote:
Originally Posted by Atak_Snajpera View Post
Sounds reasonable. AVX2 is about 1.6x faster than AVX. Thanks to 2xAVX256+FMA, Intel can still compete against an 8c/16t CPU.

FMA has nothing to do with x265, because AFAIK x265 uses integers and FMA is for floating-point numbers.

Integer AVX2 makes the difference; we'll see how much.
Old 23rd February 2017, 20:36   #4796  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,810
x264 and x265 use FMA3. See the encoder's 'using cpu capabilities' output.
Old 23rd February 2017, 20:41   #4797  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
It doesn't matter if it lists CPU capabilities.

What really matters is exactly which instructions x265 can use.

It would be a huge surprise if it could use FMA3 to a large extent, or at all.
Old 23rd February 2017, 21:05   #4798  |  Link
pingfr
Registered User
 
Join Date: May 2015
Posts: 185
Quote:
Originally Posted by NikosD View Post
It doesn't matter if it lists CPU capabilities.

What really matters is exactly which instructions x265 can use.

It would be a huge surprise if it could use FMA3 to a large extent, or at all.
See encoder's output 'using cpu capabilities'

Using:

verb (used with object), used, using.
1.
to employ for some purpose; put into service; make use of:
to use a knife.

Source: http://www.dictionary.com/browse/using?s=t
Old 23rd February 2017, 21:08   #4799  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
Age ? 15 ?
Old 23rd February 2017, 21:14   #4800  |  Link
pingfr
Registered User
 
Join Date: May 2015
Posts: 185
Quote:
Originally Posted by NikosD View Post
Age ? 15 ?
*shrug*