24th September 2019, 18:26 | #7041 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,730
|
And my question that I posted to the x265 mailing list still has not been approved so I don't know if any dev saw the question regarding the motion search methods.
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
24th September 2019, 19:27 | #7042 | Link |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
I did a quick test to try to reproduce this here (no filtering, just decoded with ffmpeg and piped to x265). For me too, umh was slower than star, but the speeds were similar.
--me star:
Code:
"I:\Hybrid\64bit\x265.exe" --preset medium --input - --output-depth 10 --y4m --profile main10 --me star --limit-modes --no-early-skip --no-open-gop --opt-ref-list-length-pps --lookahead-slices 0 --crf 18.00 --opt-qp-pps --cbqpoffs -2 --crqpoffs -2 --limit-refs 0 --ssim-rd --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 10.00 --aq-mode 0 --deblock=-1:-1 --limit-sao --no-repeat-headers --range limited --colormatrix bt709 --output "E:\Temp\19_56_10_6810_02.265"
--me umh:
Code:
"I:\Hybrid\64bit\x265.exe" --preset medium --input - --output-depth 10 --y4m --profile main10 --me umh --limit-modes --no-early-skip --no-open-gop --opt-ref-list-length-pps --lookahead-slices 0 --crf 18.00 --opt-qp-pps --cbqpoffs -2 --crqpoffs -2 --limit-refs 0 --ssim-rd --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 10.00 --aq-mode 0 --deblock=-1:-1 --limit-sao --no-repeat-headers --range limited --colormatrix bt709 --output "E:\Temp\20_03_51_9010_02.265"
I used:
Code:
x265 [info]: HEVC encoder version 3.1+20-913823aa15cd
x265 [info]: build info [Windows][GCC 9.2.0][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
I ran the calls a second time; this time I got:
--me star: "encoded 2868 frames in 320.13s (8.96 fps), 4606.13 kb/s, Avg QP:23.78"
--me umh: "encoded 2868 frames in 322.27s (8.90 fps), 4605.07 kb/s, Avg QP:23.78"
-> So here too, umh is slower, but not by much, so this might still be source dependent. |
25th September 2019, 05:39 | #7043 | Link | |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,730
|
Quote:
https://xevc.wordpress.com/2014/01/2...of-hm-encoder/
|
25th September 2019, 15:51 | #7045 | Link |
Registered User
Join Date: Feb 2007
Location: Sweden
Posts: 483
|
x265 v3.2+3-fdd69a766881 (32 & 64-bit 8/10/12bit Multilib Windows Binaries) (GCC 9.2.0)
Code:
https://bitbucket.org/multicoreware/x265/commits/branch/default |
26th September 2019, 07:35 | #7047 | Link |
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,781
|
The speed of "far" motion search algorithms depends on the average amount of motion. Some of them can terminate early if the optimum is found close to an intermediate step. Disclaimer: If I understood these algorithms correctly...
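To make the early-termination point concrete, here is a minimal sketch. It is not x265's code: it is a generic small-diamond search over a toy cost function, and `MotionVector`, `SearchResult`, `diamondSearch` and `toyCost` are hypothetical names made up for this illustration. The search stops as soon as no neighbour beats the current centre, so the number of cost evaluations grows with how far the best vector lies from the starting point.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>

struct MotionVector { int x; int y; };

struct SearchResult { MotionVector best; int evaluations; };

// Hypothetical small-diamond search: test the 4 neighbours of the current
// centre, move to the cheapest one, and stop early when none is cheaper.
SearchResult diamondSearch(const std::function<int(MotionVector)>& cost, int maxSteps)
{
    assert(maxSteps >= 0);
    MotionVector center{0, 0};
    int bestCost = cost(center);
    int evaluations = 1;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    for (int step = 0; step < maxSteps; ++step)
    {
        MotionVector next = center;
        for (int i = 0; i < 4; ++i)
        {
            MotionVector cand{center.x + dx[i], center.y + dy[i]};
            int c = cost(cand);
            ++evaluations;
            if (c < bestCost)
            {
                bestCost = c;
                next = cand;
            }
        }
        if (next.x == center.x && next.y == center.y)
            break; // early termination: the centre is already the local optimum
        center = next;
    }
    return {center, evaluations};
}

// Toy stand-in for a block-matching cost (assumption): Manhattan distance
// between the candidate and a known "true" displacement.
int toyCost(MotionVector mv, MotionVector truth)
{
    return std::abs(mv.x - truth.x) + std::abs(mv.y - truth.y);
}
```

With this toy cost, a static block (true displacement 0,0) ends after 5 evaluations, while a displacement of (5,3) needs 37, which is the kind of source dependence described above. Real encoders use far more elaborate patterns and real pixel costs, so treat this only as an intuition aid.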
|
26th September 2019, 21:52 | #7048 | Link | |
Registered User
Join Date: Feb 2015
Posts: 326
|
Quote:
|
|
27th September 2019, 04:43 | #7049 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,730
|
This is what I've used for that one:
Code:
--input - --y4m --input-depth 16 --dither --sar 1:1 --profile main10 --rc-lookahead 120 --min-keyint 5 --keyint 480 --splitrd-skip --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --preset slower --rd-refine --subme 4 --ctu 32 --qg-size 16 --limit-refs 1 --limit-tu 3 --bframes 16 --deblock -2:-1 --no-sao --cbqpoffs -3 --crqpoffs -3 --hme --hme-search umh,umh,star --merange 26 --qcomp 0.7 --max-merge 2 --aq-mode 3 --aq-strength 0.6 --crf 18.5
|
28th September 2019, 17:57 | #7053 | Link |
Registered User
Join Date: Feb 2010
Location: Spain
Posts: 549
|
Hi,
Possible bug: Commit 21db162 (https://bitbucket.org/multicoreware/...4a59eb7326b46a) causes a slowdown even when aq-mode 4 is not used, and the output is identical. Command line used:
Code:
"T:\TEST\x265\3108\x265_x64.exe" - --y4m --frames 1000 --crf 20.0 --preset "medium" --aq-mode 3 --keyint 240 --no-open-gop --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --sar 1:1 --output "T:\TEST\encode.265"
Code:
dec [info]: Intel Quick Sync: API LEVEL 1.29, HW
dec [info]: 1920x1080, YV12, 24000/1001 fps, 1000 frames
y4m [info]: 1920x1080 fps 24000/1001 i420p8 sar 1:1 unknown frame count
raw [info]: output file: T:\TEST\encode.265
x265 [info]: HEVC encoder version 3.1+7-147fb92c5ed5
x265 [info]: build info [Windows][GCC 9.2.0][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 8 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 3 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 23 / 240 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-20.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
x265 [info]: frame I: 11, Avg QP:16.20 kb/s: 6958.02
x265 [info]: frame P: 275, Avg QP:20.60 kb/s: 3482.34
x265 [info]: frame B: 714, Avg QP:25.57 kb/s: 356.63
x265 [info]: Weighted P-Frames: Y:10.9% UV:6.5%
x265 [info]: consecutive B-frames: 22.4% 4.5% 4.5% 38.1% 30.4%
encoded 1000 frames in 17.71s (56.47 fps), 1288.81 kb/s, Avg QP:24.10
Code:
dec [info]: Intel Quick Sync: API LEVEL 1.29, HW
dec [info]: 1920x1080, YV12, 24000/1001 fps, 1000 frames
y4m [info]: 1920x1080 fps 24000/1001 i420p8 sar 1:1 unknown frame count
raw [info]: output file: T:\TEST\encode.265
x265 [info]: HEVC encoder version 3.1+8-21db162c8622
x265 [info]: build info [Windows][GCC 9.2.0][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 8 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 3 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 23 / 240 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-20.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
x265 [info]: frame I: 11, Avg QP:16.20 kb/s: 6958.02
x265 [info]: frame P: 275, Avg QP:20.60 kb/s: 3482.34
x265 [info]: frame B: 714, Avg QP:25.57 kb/s: 356.63
x265 [info]: Weighted P-Frames: Y:10.9% UV:6.5%
x265 [info]: consecutive B-frames: 22.4% 4.5% 4.5% 38.1% 30.4%
encoded 1000 frames in 19.56s (51.13 fps), 1288.81 kb/s, Avg QP:24.10 |
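For anyone who wants to put a number on the gap between the two logs in post #7053, the arithmetic can be sketched as below. The helper names `fpsDropPercent` and `timeIncreasePercent` are made up for this example; only the fps and time figures come from the logs.

```cpp
#include <cassert>

// Percentage drop in throughput when going from fpsBefore to fpsAfter.
double fpsDropPercent(double fpsBefore, double fpsAfter)
{
    assert(fpsBefore > 0.0);
    return (fpsBefore - fpsAfter) / fpsBefore * 100.0;
}

// Percentage increase in wall-clock encode time.
double timeIncreasePercent(double secondsBefore, double secondsAfter)
{
    assert(secondsBefore > 0.0);
    return (secondsAfter - secondsBefore) / secondsBefore * 100.0;
}
```

Calling `fpsDropPercent(56.47, 51.13)` gives roughly 9.5 %, and `timeIncreasePercent(17.71, 19.56)` roughly 10.4 %, for byte-identical output (1288.81 kb/s in both runs). By the same formula, the star versus umh difference in post #7042 (8.96 vs 8.90 fps) is under 1 %.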
29th September 2019, 15:50 | #7056 | Link |
Registered User
Join Date: Feb 2010
Location: Spain
Posts: 549
|
I found the cause:
Some code related to AQ mode 4 is always executed. This patch restores the previous performance and does not break anything (I think). In file slicetype.cpp, line 481, replace
Code:
#define AQ_EDGE_BIAS 0.5
#define EDGE_INCLINATION 45
uint32_t numCuInHeight = (maxRow + param->maxCUSize - 1) / param->maxCUSize;
int maxHeight = numCuInHeight * param->maxCUSize;
intptr_t stride = curFrame->m_fencPic->m_stride;
pixel *edgePic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
pixel *gaussianPic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
pixel *thetaPic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
memset(edgePic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
memset(gaussianPic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
memset(thetaPic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
if (param->rc.aqMode == X265_AQ_EDGE)
    edgeFilter(curFrame, edgePic, gaussianPic, thetaPic, stride, maxRow, maxCol);
int blockXY = 0, inclinedEdge = 0;
with
Code:
#define AQ_EDGE_BIAS 0.5
#define EDGE_INCLINATION 45
pixel *edgePic = NULL;
pixel *gaussianPic = NULL;
pixel *thetaPic = NULL;
if (param->rc.aqMode == X265_AQ_EDGE)
{
    uint32_t numCuInHeight = (maxRow + param->maxCUSize - 1) / param->maxCUSize;
    int maxHeight = numCuInHeight * param->maxCUSize;
    intptr_t stride = curFrame->m_fencPic->m_stride;
    edgePic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
    gaussianPic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
    thetaPic = X265_MALLOC(pixel, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)));
    memset(edgePic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
    memset(gaussianPic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
    memset(thetaPic, 0, stride * (maxHeight + (curFrame->m_fencPic->m_lumaMarginY * 2)) * sizeof(pixel));
    edgeFilter(curFrame, edgePic, gaussianPic, thetaPic, stride, maxRow, maxCol);
}
int blockXY = 0, inclinedEdge = 0;
Then replace
Code:
pixel *edgeImage = edgePic + curFrame->m_fencPic->m_lumaMarginY * stride + curFrame->m_fencPic->m_lumaMarginX;
pixel *edgeTheta = thetaPic + curFrame->m_fencPic->m_lumaMarginY * stride + curFrame->m_fencPic->m_lumaMarginX;
with
Code:
pixel *edgeImage = edgePic + curFrame->m_fencPic->m_lumaMarginY * curFrame->m_fencPic->m_stride + curFrame->m_fencPic->m_lumaMarginX;
pixel *edgeTheta = thetaPic + curFrame->m_fencPic->m_lumaMarginY * curFrame->m_fencPic->m_stride + curFrame->m_fencPic->m_lumaMarginX;
And finally replace
Code:
X265_FREE(edgePic);
X265_FREE(gaussianPic);
X265_FREE(thetaPic);
with
Code:
if (param->rc.aqMode == X265_AQ_EDGE)
{
    X265_FREE(edgePic);
    X265_FREE(gaussianPic);
    X265_FREE(thetaPic);
}
And a binary with this patch applied: x265 v3.2 patched x64 GCC 9.2.0 |
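The idea behind the patch in post #7056 (only pay for the scratch buffers when the feature that needs them is on) can be shown in isolation. This is a standalone sketch, not x265 code: `AqMode`, `allocPlane`, `processFrame` and the allocation counter are all hypothetical, and `std::calloc` stands in for the malloc plus memset pair in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical two-value stand-in for the encoder's AQ mode setting.
enum class AqMode { Other, Edge };

// Counts allocations so the effect of the guard can be observed.
static int g_allocCount = 0;

// Allocates a zero-filled plane (calloc replaces malloc + memset here).
std::uint8_t* allocPlane(std::size_t bytes)
{
    ++g_allocCount;
    return static_cast<std::uint8_t*>(std::calloc(bytes, 1));
}

// Per-frame work: the three scratch planes are only allocated and cleared
// when edge-based AQ is selected. Returns the number of allocations made.
int processFrame(AqMode mode, std::size_t planeBytes)
{
    assert(planeBytes > 0);
    const int before = g_allocCount;
    std::uint8_t* edgePic = nullptr;
    std::uint8_t* gaussianPic = nullptr;
    std::uint8_t* thetaPic = nullptr;

    if (mode == AqMode::Edge)
    {
        edgePic = allocPlane(planeBytes);
        gaussianPic = allocPlane(planeBytes);
        thetaPic = allocPlane(planeBytes);
        // The edge filter would fill the planes here.
    }

    // ... the rest of the per-frame work would go here ...

    std::free(edgePic);     // std::free(nullptr) is a no-op
    std::free(gaussianPic);
    std::free(thetaPic);
    return g_allocCount - before;
}
```

In this sketch the frees need no guard because `std::free` accepts a null pointer. Whether `X265_FREE` tolerates NULL is not shown in this thread, so guarding the frees as the patch does is the cautious choice.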
30th September 2019, 07:52 | #7057 | Link |
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,781
|
You should send your patch to the x265 Developer mailing list. They prefer it in "diff" format and in the mail body.
x265 3.2+3-fdd69a766881 (MSYS2/MinGW, GCC 9.2.0)
Last edited by LigH; 30th September 2019 at 15:27. |
30th September 2019, 18:14 | #7059 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,730
|
I couldn't replicate that difference between the two versions on my 3700X. I used the VS 2019 AVX2 builds from http://www.msystem.waw.pl/x265/
With 3.1+7, encoded 2000 frames in 461.37s (4.33 fps), 3467.58 kb/s, Avg QP:20.97
With 3.1+8, encoded 2000 frames in 459.33s (4.35 fps), 3467.58 kb/s, Avg QP:20.97
|
30th September 2019, 18:18 | #7060 | Link |
Registered User
Join Date: Mar 2015
Posts: 8
|
x265 not encoding the whole video
I have been having this issue: I give x265 a movie and it seems to choose how much of it to encode. I give it a 45-minute show and it does 30 minutes. I give it a 1:40-hour movie and it decides not to do the last 10 minutes. I say "decides" because it does not report an error. It literally finishes, muxes into a container and says "Done!" Why does it do this? I never had this problem before, but for maybe the last 6 months it has been happening randomly, off and on. Any ideas?
|