Old 16th April 2016, 09:29   #1  |  Link
kotuwa
Registered User
 
Join Date: May 2012
Posts: 66
x265 performance of different builds! (GCC,ICC,VS/VC)

Regarding Windows binary builds of x265:

GCC, ICC, VS / VC++

0. Are there any other builds?

1. What are the speed/performance based differences?

2. Which build suits which system best?

3. Are there any quality/size based differences too?

!?
Old 16th April 2016, 11:38   #2  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 696
The question is a little imprecise. Nothing is said about chroma subsampling, i420 or i444. With i444, HEVC looks miserable at high compression on I-frames. What kind of CPU do you have?
I have an old i5 2500:
Ad 0. I don't know.
Ad 1. There are differences, at the expense of quality. VC++ comes out slowest. Even slower is a build with all encoders in one (8+10+12 bit), i.e. as in Hybrid.
Ad 2. I use GCC builds on Windows 10. ICC builds are also tolerable.
Ad 3. Yes, there are. You should check for yourself. I checked with version 1.7.
Old 16th April 2016, 11:47   #3  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,336
Quote:
Originally Posted by kotuwa View Post
3. Are there any quality/size based differences too?
If the compiler impacts the output of an encoder, that sounds like a bug, and you should report that to the developers.

So in general, no, there should be no differences in output no matter how you build it, assuming all builds use the same configuration.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 16th April 2016, 12:27   #4  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
0. In theory it is possible to build x265 for Windows with clang.

1. In my speed tests, GCC 6 builds are fastest if your CPU doesn't have SSE4; otherwise VS 2015 builds are.

2. For current CPUs (AVX/AVX2) the VS 2015 builds are best. The Windows version doesn't matter as long as it is 64-bit Windows 7 or newer.

3. Yes, there are (but very small). In the file source/encoder/sao.cpp some decisions are made from floating-point computations whose results depend on the compiler and its optimization options.
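
To illustrate point 3 in general terms: floating-point addition is not associative, so a compiler that reorders the same arithmetic can get a result that differs in the last bit, and a comparison near a threshold can then go the other way. The sketch below is plain Python, not code from sao.cpp; the threshold and the `accept` helper are made up for the illustration.

```python
# Illustration only: this is NOT code from x265's sao.cpp.
# The same three terms are summed in two different orders, as an
# optimizing compiler might legally do under relaxed floating-point rules.

THRESHOLD = 0.6  # hypothetical decision threshold


def accept(cost: float) -> bool:
    """Hypothetical encoder decision: accept a mode if its cost is low enough."""
    return cost <= THRESHOLD


left_to_right = (0.1 + 0.2) + 0.3   # one evaluation order
right_to_left = 0.1 + (0.2 + 0.3)   # another evaluation order

print(repr(left_to_right), accept(left_to_right))
print(repr(right_to_left), accept(right_to_left))
```

The two sums differ only in the last bit, yet the decision flips, which is how tiny numeric differences between builds can end up as slightly different bitstreams.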
Old 17th April 2016, 16:50   #5  |  Link
kotuwa
Registered User
 
Join Date: May 2012
Posts: 66
Quote:
Originally Posted by Jamaika View Post
Ad 3. Yes, there are. You should check for yourself. I checked with version 1.7.
I checked small samples. Couldn't check x264 info/statistics, though!
File sizes were almost the same; the difference was only several bytes, which I thought was due to the build info string...
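
A rough way to go beyond comparing file sizes is a byte-level comparison of the two outputs. This is only a sketch: the function name and the returned fields are my own, and it does not try to skip the build info string, so two files that differ only in that string will still be reported as different.

```python
import hashlib
from pathlib import Path


def compare_outputs(path_a: str, path_b: str) -> dict:
    """Compare two encoder output files byte by byte (hypothetical helper)."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()

    # Offset of the first differing byte within the common length, if any.
    first_diff = next(
        (i for i, (x, y) in enumerate(zip(a, b)) if x != y), None
    )
    # Same common prefix but different lengths: they diverge where the
    # shorter file ends.
    if first_diff is None and len(a) != len(b):
        first_diff = min(len(a), len(b))

    return {
        "size_a": len(a),
        "size_b": len(b),
        "size_delta": len(b) - len(a),
        "identical": a == b,
        "first_diff_offset": first_diff,
        "sha256_a": hashlib.sha256(a).hexdigest(),
        "sha256_b": hashlib.sha256(b).hexdigest(),
    }
```

If the first difference sits near the start of the file and the rest matches apart from a shift, a header string is a plausible cause; differences spread through the stream would point at the encoder decisions themselves.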


Quote:
Originally Posted by nevcairiel View Post
If the compiler impacts the output of an encoder, that sounds like a bug, and you should report that to the developers.

So in general, no, there should be no differences in output no matter how you build it, assuming all builds use the same configuration.
Are you sure? The other two replies say otherwise!


Do the instruction sets used (SSE4, AVX2, etc.) have an impact on size/quality?
And does the build have an effect on those?


Another question: what kind of systems benefit from using ICC builds?
Old 17th April 2016, 22:48   #6  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,336
Quote:
Originally Posted by kotuwa View Post

Are you sure? The other two replies say otherwise!
Minor floating-point differences really shouldn't result in a noticeable quality difference. If they do, that should still be investigated by the developers.
Old 22nd April 2016, 20:54   #7  |  Link
Motenai Yoda
Registered User
 
Join Date: Jan 2010
Posts: 709
Well, I tested VS 2015 against GCC 5.2.0, and on my Bloomfield the GCC builds were faster.
__________________
powered by Google Translator
Old 22nd April 2016, 21:17   #8  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
Quote:
Originally Posted by Motenai Yoda View Post
Well, I tested VS 2015 against GCC 5.2.0, and on my Bloomfield the GCC builds were faster.
For VS 2015 it helps to set this before building:
set CXXFLAGS=/GS- /GL

Did you use the /GS- /GL options?
Old 24th April 2016, 18:44   #9  |  Link
Motenai Yoda
Registered User
 
Join Date: Jan 2010
Posts: 709
With set CXXFLAGS=/GS- /GL it throws an error saying the front end and back end don't target the same CPU???

Using the included batch file, on my CPU:

Code:
fps             8bit     10bit     12bit
MSVC 1800       9.02     7.05      5.19
GCC 5.3.0       9.02     7.06      4.67

Last edited by Motenai Yoda; 24th April 2016 at 21:03.
Old 24th April 2016, 19:53   #10  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
So the problem is only at 10bit encoding.

My emulation of 10-bit encoding at your CPU's level (SSE4.2) on an i5 3450S. x265- is compiled without the /GS- /GL options, x265 with them:
Code:
i:\speed\1.9+141>x265- --asm=SSE4.2 ../ducks_take_off_1080p50.y4m w.hevc
y4m  [info]: 1920x1080 fps 50/1 i420p8 sar 1:1 frames 0 - 499 of 500
raw  [info]: output file: w.hevc
x265 [info]: HEVC encoder version 1.9+141-02d79be487d7
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
x265 [info]: Main 10 profile, Level-4.1 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features       : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut       : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / on / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 signhide tmvp strong-intra-smoothing
x265 [info]: tools: lslices=6 deblock sao
x265 [info]: frame I:      2, Avg QP:35.89  kb/s: 36347.20
x265 [info]: frame P:    123, Avg QP:36.65  kb/s: 27657.64
x265 [info]: frame B:    375, Avg QP:39.35  kb/s: 4933.84
x265 [info]: Weighted P-Frames: Y:18.7% UV:12.2%
x265 [info]: consecutive B-frames: 0.8% 0.0% 0.0% 96.8% 2.4%

encoded 500 frames in 85.48s (5.85 fps), 10649.55 kb/s, Avg QP:38.67

i:\speed\1.9+141>x265 --asm=SSE4.2 ../ducks_take_off_1080p50.y4m w.hevc
y4m  [info]: 1920x1080 fps 50/1 i420p8 sar 1:1 frames 0 - 499 of 500
raw  [info]: output file: w.hevc
x265 [info]: HEVC encoder version 1.9+141-02d79be487d7
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
x265 [info]: Main 10 profile, Level-4.1 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features       : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut       : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt        : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / on / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 signhide tmvp strong-intra-smoothing
x265 [info]: tools: lslices=6 deblock sao
x265 [info]: frame I:      2, Avg QP:35.89  kb/s: 36347.20
x265 [info]: frame P:    123, Avg QP:36.65  kb/s: 27657.64
x265 [info]: frame B:    375, Avg QP:39.35  kb/s: 4933.84
x265 [info]: Weighted P-Frames: Y:18.7% UV:12.2%
x265 [info]: consecutive B-frames: 0.8% 0.0% 0.0% 96.8% 2.4%

encoded 500 frames in 84.64s (5.91 fps), 10649.55 kb/s, Avg QP:38.67
You can download VS 2015 builds compiled with the /GS- /GL options from www.msystem.waw.pl/x265
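
For anyone who wants to compare such runs without reading the logs by eye, here is a minimal sketch that pulls the numbers out of the final summary line and computes the difference. The regular expression is written against the two summary lines shown above; the function and variable names are my own, and other x265 versions may format the line differently.

```python
import re

# Matches the final summary line as it appears in the two logs above.
SUMMARY_RE = re.compile(
    r"encoded (\d+) frames in ([\d.]+)s \(([\d.]+) fps\), ([\d.]+) kb/s"
)


def parse_summary(line: str) -> dict:
    """Extract frames, elapsed seconds, fps and bitrate from a summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not an x265 summary line: {line!r}")
    return {
        "frames": int(match.group(1)),
        "seconds": float(match.group(2)),
        "fps": float(match.group(3)),
        "kbps": float(match.group(4)),
    }


# The two summary lines from the logs above.
without_flags = parse_summary(
    "encoded 500 frames in 85.48s (5.85 fps), 10649.55 kb/s, Avg QP:38.67"
)
with_flags = parse_summary(
    "encoded 500 frames in 84.64s (5.91 fps), 10649.55 kb/s, Avg QP:38.67"
)

# Relative speed gain of the /GS- /GL build, based on elapsed time.
speedup_percent = (without_flags["seconds"] / with_flags["seconds"] - 1.0) * 100.0
same_bitrate = without_flags["kbps"] == with_flags["kbps"]

print(f"speedup: {speedup_percent:.2f}%  same bitrate: {same_bitrate}")
```

On these two runs the gain is about 1%, with identical bitrate and average QP, so the flags changed speed only.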
Old 24th April 2016, 21:02   #11  |  Link
Motenai Yoda
Registered User
 
Join Date: Jan 2010
Posts: 709
Wait... the included batch file compiles with VS 2013... (but I don't have VS 2013)

Tried with VS 2015 and set CXXFLAGS=/GS- /GL:
Code:
Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:02:44.06
        1 file(s) moved.
Microsoft (R) Library Manager Version 14.00.23918.0
Copyright (C) Microsoft Corporation.  All rights reserved.

x265-static-main.lib(analysis.obj) : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /LTCG to the link command line to improve linker performance
Microsoft (R) Library Manager Version 14.00.23918.0
Copyright (C) Microsoft Corporation.  All rights reserved.

fatal error C1905: Front end and back end not compatible (must target same processor).
LINK : fatal error LNK1257: code generation failed
With VS 2015 and set CXXFLAGS:
8bit 9.02 / 10bit 6.99 / 12bit 3.70

Last edited by Motenai Yoda; 24th April 2016 at 21:42.
Old 24th April 2016, 21:21   #12  |  Link
Ma
Registered User
 
Join Date: Feb 2015
Posts: 326
I see you are building the multilib version. The compilation is OK, and there should be an x265.exe that works correctly for 8- and 10-bit encoding (and incorrectly for 12-bit).

The error comes from this part:
:: combine static libraries (ignore warnings caused by winxp.cpp hacks)
move Release\x265-static.lib x265-static-main.lib
LIB.EXE /ignore:4006 /ignore:4221 /OUT:Release\x265-static.lib x265-static-main.lib x265-static-main10.lib x265-static-main12.lib

which is not important. You can use the compiled x265.exe without problems (just avoid 12-bit encoding with a multilib version compiled with LTO -- there are bugs in the x265 source code).
Old 16th April 2017, 16:24   #13  |  Link
Sagittaire
Testeur de codecs
 
Sagittaire's Avatar
 
Join Date: May 2003
Location: France
Posts: 2,484
Can someone make builds for:

-x264 GCC 7.0 "None"
-x264 GCC 7.0 "SSE4"
-x264 ICC 17
-x264 VS 2017 "AVX2"

-x265 GCC 7.0 "None"
-x265 GCC 7.0 "SSE4"
-x265 ICC 17
-x265 VS 2017 "AVX2"

I will produce an automated benchmark script that compares all the x264 and x265 builds, then picks the best one for your CPU and runs a complete benchmark.
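
A minimal sketch of what such a script could look like, assuming each build is just a command line that encodes the same clip. The function names, the build labels and the idea of taking the best of several runs are my own choices, not anything posted in this thread; the demo uses the Python interpreter itself as a stand-in command so it runs without any encoder installed.

```python
import subprocess
import sys
import time


def time_command(cmd: list[str]) -> float:
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start


def rank_builds(
    builds: dict[str, list[str]], frames: int, runs: int = 3
) -> list[tuple[str, float]]:
    """Return (build name, fps) pairs, fastest first.

    Each build is timed `runs` times and the best time is kept, to reduce
    noise from other activity on the machine.
    """
    results = []
    for name, cmd in builds.items():
        best = min(time_command(cmd) for _ in range(runs))
        results.append((name, frames / best))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stand-in command; replace with real encoder command lines, e.g.
    # ["x265-gcc.exe", "../ducks_take_off_1080p50.y4m", "w.hevc"]
    # (the executable name here is hypothetical).
    stand_in = [sys.executable, "-c", "pass"]
    ranking = rank_builds({"build-a": stand_in, "build-b": stand_in}, frames=500, runs=1)
    for name, fps in ranking:
        print(f"{name}: {fps:.2f} fps")
```

Wall-clock timing includes process start-up and file I/O, so for short clips the encoder's own reported fps may be the better number to compare.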

I can find some builds here:
http://msystem.waw.pl/x265/

but not:

-x264 GCC 7.0 "None"
-x264 GCC 7.0 "SSE4"
-x264 ICC 17
-x264 VS 2017 "AVX2"

-x265 ICC 17

THX
__________________
Le Sagittaire ... ;-)

1- Ateme AVC or x264
2- VP7 or RV10 only for anime
3- XviD, DivX or WMV9

Last edited by Sagittaire; 16th April 2017 at 16:29.
Old 16th April 2017, 20:28   #14  |  Link
easyfab
Registered User
 
Join Date: Jan 2002
Posts: 332
And it won't be the "best" for each CPU.
For example, for my i7-2600K with GCC I use -march=native, -Ofast, PGO and other aggressive settings, plus extra flags like -frename-registers, -funroll-loops ... That gives me another 5 to 10% speed boost. It is even better on my CPU than the profiled VS 2017 version from Ma. But you must do this for each CPU.
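
As a rough sketch of how the two PGO phases differ in flags: the base flags are the ones named above, and -fprofile-generate / -fprofile-use are GCC's standard options for the instrumented build and the final build. The helper name is my own, and how the string is passed to the x265 build is not stated in this thread, so treat that part as an assumption.

```python
# Flags named in the post above.
BASE_FLAGS = ["-Ofast", "-march=native", "-funroll-loops", "-frename-registers"]

# GCC's standard PGO options: first an instrumented build that records a
# profile while encoding sample clips, then a rebuild that uses the profile.
PGO_FLAGS = {
    "generate": ["-fprofile-generate"],
    "use": ["-fprofile-use"],
}


def cxxflags_for(phase: str) -> str:
    """Return a CXXFLAGS string for one PGO phase (hypothetical helper)."""
    if phase not in PGO_FLAGS:
        raise ValueError(f"unknown PGO phase: {phase!r}")
    return " ".join(BASE_FLAGS + PGO_FLAGS[phase])


for phase in ("generate", "use"):
    print(f"{phase}: {cxxflags_for(phase)}")
```

Because -march=native and the recorded profile are both tied to the machine doing the build, the result is tuned for that CPU only, which is the point made above.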
Old 16th April 2017, 20:34   #15  |  Link
easyfab
Registered User
 
Join Date: Jan 2002
Posts: 332
I'm really curious to see what a profiled and optimized build can do on Ryzen. Perhaps a bigger speed boost than on Intel?
Even better if some code can be rewritten for the Ryzen architecture.