#1 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
x265 benchmark on a 7975WX Threadripper
Hello, I've finally finished building my rig and ran some tests.

CPU: 7975WX (AVX512 & HT enabled in the BIOS) => 32 physical cores (64 logical).
Motherboard: ASUS Pro WS WRX90E-SAGE SE
Memory: CORSAIR CMA128GX5M8B5600C40 (Ver 5.43.01), 8x 16GB

I've set up a Windows 10/Windows 11 dual boot. The PC is totally offline with a lot of services disabled, so it's doing "almost" nothing else and there is "almost" no wasted time.

I used x265 4.1.0.104, built with LLVM 20.1.4.
Zen4 is built with: -Ofast -march=znver4
Zen4s is built with: -Ofast -msse2 -mavx -mavx2 -mfma -mtune=znver4
(So Zen4s is tuned for Zen4 but built without using AVX512 instructions.)

I ran a lot of different tests on a small 128-frame 4K 10-bit file. The command line used is (avg bitrate is 40000):
Code:
SET E_SRC=%8%1.avs
SET E_DST=%5%1.hevc
SET CHAPTERS=%8%7
SET STAT_FILE=%8%1.stats
SET LOG_FILE_1=%8%1_log_1.txt
SET LOG_FILE_2=%8%1_log_2.txt
SET LOG_FILE_3=%8%1_log_3.txt
SET BITRATE=%2
SET TUNING=%6
SET MCLL=%3
SET MDISPLAY=%4
REM First run (--pass 1): creates the stats file, output discarded.
x265_x64 --asm avx512 --preset slower --vbv-maxrate 90000 --vbv-bufsize 70000 --bitrate %BITRATE% --stats %STAT_FILE% --level 5.1 --profile main10 --high-tier --level-idc 51 --hist-scenecut --fades --aq-mode 4 --aq-auto 6 --weightb --rc-lookahead 72 --tskip --tskip-fast --no-rect --me hex --subme 2 --b-intra --no-sao --deblock -1,-1 --psy-rd 2.5 --psy-rdoq 4 --multi-pass-opt-analysis --multi-pass-opt-distortion --video-signal-type-preset BT2100_PQ_YCC -D 10 --max-cll %MCLL% --master-display %MDISPLAY% --hdr10-opt --qpfile %CHAPTERS% --input %E_SRC% --pass 1 -o NUL 2> %LOG_FILE_1%
REM Second run (--pass 3): intermediate pass, rewrites the stats file, output discarded.
x265_x64 --asm avx512 --preset slower --vbv-maxrate 90000 --vbv-bufsize 70000 --bitrate %BITRATE% --stats %STAT_FILE% --level 5.1 --profile main10 --high-tier --level-idc 51 --hist-scenecut --fades --aq-mode 4 --aq-auto 6 --weightb --rc-lookahead 72 --tskip --tskip-fast --rect --no-amp --me umh --subme 3 --b-intra --no-sao --deblock -1,-1 --psy-rd 2.5 --psy-rdoq 4 --multi-pass-opt-analysis --multi-pass-opt-distortion --video-signal-type-preset BT2100_PQ_YCC -D 10 --max-cll %MCLL% --master-display %MDISPLAY% --hdr10-opt --qpfile %CHAPTERS% --input %E_SRC% --pass 3 -o NUL 2> %LOG_FILE_2%
REM Third run (--pass 2): final pass, writes the .hevc output.
x265_x64 --asm avx512 --preset slower --vbv-maxrate 90000 --vbv-bufsize 70000 --bitrate %BITRATE% --stats %STAT_FILE% --level 5.1 --profile main10 --high-tier --level-idc 51 --hist-scenecut --fades --aq-mode 4 --aq-auto 6 --weightb --rc-lookahead 72 --tskip --tskip-fast --rect --amp --b-intra --no-sao --deblock -1,-1 --psy-rd 2.5 --psy-rdoq 4 --scenecut-aware-qp 3 --multi-pass-opt-analysis --multi-pass-opt-distortion --video-signal-type-preset BT2100_PQ_YCC -D 10 --max-cll %MCLL% --master-display %MDISPLAY% --hdr10-opt --qpfile %CHAPTERS% --input %E_SRC% --pass 2 -o %E_DST% 2> %LOG_FILE_3%

Of course, every encode uses the same source file.

Zen4
Windows 10
Pass 1: encoded 128 frames in 45.09s (2.84 fps)
Pass 2: encoded 128 frames in 73.77s (1.74 fps)
Pass 3: encoded 128 frames in 62.94s (2.03 fps)
Windows 11
Pass 1: encoded 128 frames in 43.15s (2.97 fps)
Pass 2: encoded 128 frames in 111.11s (1.15 fps)
Pass 3: encoded 128 frames in 93.05s (1.38 fps)

Zen4s
Windows 10
Pass 1: encoded 128 frames in 44.51s (2.88 fps)
Pass 2: encoded 128 frames in 73.16s (1.75 fps)
Pass 3: encoded 128 frames in 62.09s (2.06 fps)
Windows 11
Pass 1: encoded 128 frames in 43.24s (2.96 fps)
Pass 2: encoded 128 frames in 111.24s (1.15 fps)
Pass 3: encoded 128 frames in 93.54s (1.37 fps)

Results:
- Zen4 & Zen4s have (almost) identical results.
- The first pass is a little faster on Windows 11, but Windows 11 is significantly slower on Pass 2 & 3!

Zen4s without --asm AVX512 (the bracketed figure is the fps gain of the AVX512 run above).
Windows 10
Pass 1: encoded 128 frames in 48.72s (2.63 fps) [AVX512 +9.5%]
Pass 2: encoded 128 frames in 107.07s (1.20 fps) [AVX512 +45.8%]
Pass 3: encoded 128 frames in 96.89s (1.32 fps) [AVX512 +56.1%]
Windows 11
Pass 1: encoded 128 frames in 54.68s (2.34 fps) [AVX512 +26.5%]
Pass 2: encoded 128 frames in 111.77s (1.15 fps) [AVX512 +0.0%]
Pass 3: encoded 128 frames in 94.69s (1.35 fps) [AVX512 +1.5%]

Results: On Windows 10 the difference is large, but on Windows 11... It's as if Windows 11 is not using AVX512 on Pass 2 & Pass 3.

I noticed in the task manager that even though x265 creates 64 threads (it's reported in the log), total CPU usage was under 50%. So I tried adding --pools 32 while keeping --asm AVX512.

Zen4s
Windows 10
Pass 1: encoded 128 frames in 44.25s (2.89 fps)
Pass 2: encoded 128 frames in 92.07s (1.39 fps)
Pass 3: encoded 128 frames in 76.46s (1.67 fps)

Results: A little slower (expected), but not by much.

So... I said to myself: as I have a lot of memory, starting 2 encodes at the same time with --pools 32 might be interesting.
Encodes are made from 2 identical files on 2 different HDDs.
From the first test, 1 file at full speed (64 threads):
Windows 10: 181.80s => 2 encodes take 363.60s
Windows 11: 248.02s => 2 encodes take 496.04s

Now there are 2 encodes in parallel, with --pools 32 & --asm AVX512.

Windows 10
File 1:
Pass 1: encoded 128 frames in 50.92s (2.51 fps)
Pass 2: encoded 128 frames in 114.36s (1.12 fps)
Pass 3: encoded 128 frames in 94.96s (1.35 fps)
=> Total of 260.18s
File 2:
Pass 1: encoded 128 frames in 51.39s (2.49 fps)
Pass 2: encoded 128 frames in 84.04s (1.52 fps)
Pass 3: encoded 128 frames in 68.29s (1.87 fps)
=> Total of 203.73s
=> 2 files encoded in 260.18s instead of 363.60s => -28%.
But at some point one file got slower; the load was not equal between the files.

Windows 11
File 1:
Pass 1: encoded 128 frames in 49.73s (2.57 fps)
Pass 2: encoded 128 frames in 116.22s (1.10 fps)
Pass 3: encoded 128 frames in 96.31s (1.33 fps)
=> Total of 262.26s
File 2:
Pass 1: encoded 128 frames in 49.66s (2.58 fps)
Pass 2: encoded 128 frames in 115.14s (1.11 fps)
Pass 3: encoded 128 frames in 96.03s (1.33 fps)
=> Total of 260.83s
=> 2 files encoded in 262.26s instead of 496.04s => -47%.

The % gain is better than on Windows 10 and the load is balanced, but the end result is nevertheless the same as with Windows 10. If this slowdown of Pass 2 & Pass 3 between Windows 10 & Windows 11 could be explained and solved, my guess is that Windows 11 would be better than Windows 10, but for now that's not the case.
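For anyone who wants to redo the arithmetic from their own logs, here is a minimal Python sketch. It assumes the summary lines look exactly like the ones quoted in this post; the function names `total_seconds` and `parallel_gain` are made up for illustration and are not part of x265.

```python
import re

# Matches x265 summary lines as quoted in the post, e.g.
# "encoded 128 frames in 45.09s (2.84 fps)"
SUMMARY = re.compile(r"encoded (\d+) frames in ([\d.]+)s \(([\d.]+) fps\)")

def total_seconds(log_text):
    """Sum the elapsed time of every summary line found in log_text."""
    return sum(float(m.group(2)) for m in SUMMARY.finditer(log_text))

def parallel_gain(sequential_one_file, parallel_totals):
    """Relative wall-time change when all files are encoded side by side.

    sequential_one_file: seconds for one file using the whole CPU.
    parallel_totals: seconds each file took when run in parallel.
    """
    sequential = sequential_one_file * len(parallel_totals)
    wall = max(parallel_totals)  # the batch ends when the slowest file ends
    return (wall - sequential) / sequential

# Windows 11, Zen4s, single encode on 64 threads (numbers from the post).
win11_single = """
Pass 1: encoded 128 frames in 43.24s (2.96 fps)
Pass 2: encoded 128 frames in 111.24s (1.15 fps)
Pass 3: encoded 128 frames in 93.54s (1.37 fps)
"""
single = total_seconds(win11_single)
gain = parallel_gain(single, [262.26, 260.83])
print(f"{single:.2f}s per file, parallel gain {gain:+.1%}")
```

The wall time of the parallel batch is taken as the slower of the two files, which is why the Windows 10 figure uses File 1's total and not File 2's.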
__________________
My github. |
#3 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
Non-deterministic just means that running the exact same encode twice will not produce the exact same output file; it doesn't mean the encoding time will change drastically. As I only care about encoding time and not the resulting file, I don't think this remark is relevant, and I don't agree with it.
But... since I also think the test results are relevant, this evening when I'm back home I'll run the exact same test 4 times (on both Windows 10 & Windows 11) and see whether encoding time differs significantly between runs. If it does, I was wrong; if not, you were wrong.
__________________
My github. Last edited by jpsdr; 10th June 2025 at 08:46. |
#5 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
When I'm back home I'll run the same test several times and see whether the times change.
If they don't, it means the tests are relevant. If they do, I'll run the same test with the clip duplicated 10 times in the avs script, giving 1280 frames and encoding times of 10 to 20 minutes... and redo some of the tests (in that case, probably not that many). I must say, to save time, I hope the result will be that the time doesn't change between runs...
__________________
My github. |
#6 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
Back home, here are the results of the encoding time consistency test.

Windows 10
#1
Pass 1: encoded 128 frames in 44.16s (2.90 fps)
Pass 2: encoded 128 frames in 72.95s (1.75 fps)
Pass 3: encoded 128 frames in 61.92s (2.07 fps)
#2
Pass 1: encoded 128 frames in 44.20s (2.90 fps)
Pass 2: encoded 128 frames in 72.52s (1.76 fps)
Pass 3: encoded 128 frames in 62.15s (2.06 fps)
#3
Pass 1: encoded 128 frames in 44.25s (2.89 fps)
Pass 2: encoded 128 frames in 72.82s (1.76 fps)
Pass 3: encoded 128 frames in 62.12s (2.06 fps)
#4
Pass 1: encoded 128 frames in 44.43s (2.88 fps)
Pass 2: encoded 128 frames in 72.64s (1.76 fps)
Pass 3: encoded 128 frames in 62.10s (2.06 fps)
Results
Pass 1: varies from 44.16s to 44.43s => 0.6%
Pass 2: varies from 72.52s to 72.95s => 0.6%
Pass 3: varies from 61.92s to 62.15s => 0.4%

Windows 11
#1
Pass 1: encoded 128 frames in 43.13s (2.97 fps)
Pass 2: encoded 128 frames in 111.58s (1.15 fps)
Pass 3: encoded 128 frames in 93.35s (1.37 fps)
#2
Pass 1: encoded 128 frames in 44.20s (2.90 fps)
Pass 2: encoded 128 frames in 111.47s (1.15 fps)
Pass 3: encoded 128 frames in 93.70s (1.37 fps)
#3
Pass 1: encoded 128 frames in 43.11s (2.97 fps)
Pass 2: encoded 128 frames in 111.42s (1.15 fps)
Pass 3: encoded 128 frames in 93.64s (1.37 fps)
#4
Pass 1: encoded 128 frames in 43.10s (2.97 fps)
Pass 2: encoded 128 frames in 111.52s (1.15 fps)
Pass 3: encoded 128 frames in 93.88s (1.36 fps)
Results
Pass 1: varies from 43.10s to 44.20s => 2.6%
Pass 2: varies from 111.42s to 111.58s => 0.1%
Pass 3: varies from 93.35s to 93.88s => 0.6%

Obviously my results are a lot of things, but NOT subject to a high margin of error! This confirms my statement in post #3.
Nevertheless, I'll try just the 32 threads/2 simultaneous encodes case, looping my small file 10 times in the avs script (so 1280 frames), just once each on Windows 10 and Windows 11, to see if the CPU load balance is better on a larger file.
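The spread percentages above can be reproduced with a few lines of Python. This is only a sketch of the calculation as I understand it, (slowest - fastest) / fastest; the helper name `spread` is hypothetical.

```python
def spread(times):
    """Return (max - min) / min for a list of run times in seconds."""
    fastest, slowest = min(times), max(times)
    return (slowest - fastest) / fastest

# Times copied from the four Windows 10 runs above.
win10 = {
    "Pass 1": [44.16, 44.20, 44.25, 44.43],
    "Pass 2": [72.95, 72.52, 72.82, 72.64],
    "Pass 3": [61.92, 62.15, 62.12, 62.10],
}
for name, times in win10.items():
    print(f"{name}: {min(times):.2f}s to {max(times):.2f}s => {spread(times):.1%}")
```

With only four runs this is a range, not a standard deviation, so it says how far apart the extremes were and nothing more.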
__________________
My github. Last edited by jpsdr; 10th June 2025 at 18:53. |
#7 | Link |
Registered User
Join Date: Aug 2024
Posts: 576
|
Then maybe the hypervisor? Windows 11 is very stubborn about keeping that enabled, along with other things... for security (not sure about that).
I mean, it's really strange to me. I think Windows 11 should still be very similar to Windows 10 (down in the kernel); without the online bloat running, how can it perform so differently?
#8 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
@Z2697
What's odd is that Windows 11 is a little faster on Pass 1, but a lot slower only on Pass 2 & Pass 3. And according to the speed test without AVX512, it looks like AVX512 is disabled on Windows 11 just for Pass 2 & Pass 3. But it makes no sense... Why would AVX512 be disabled on Pass 2 & Pass 3 on Windows 11 and not on Windows 10...
=================================
Otherwise, here is the test of a 1280-frame file: AVX512, 32 threads, 2 files encoded at the same time.

Windows 10
File 1
Pass 1: encoded 1280 frames in 489.71s (2.61 fps)
Pass 2: encoded 1280 frames in 881.18s (1.45 fps)
Pass 3: encoded 1280 frames in 709.66s (1.80 fps)
=> Total of 2080.55s
File 2
Pass 1: encoded 1280 frames in 490.34s (2.61 fps)
Pass 2: encoded 1280 frames in 883.15s (1.45 fps)
Pass 3: encoded 1280 frames in 709.87s (1.80 fps)
=> Total of 2083.36s

Result: As I suspected, for this specific test the small file is not accurate enough to check the CPU load balance; it's too short. With a bigger file, the result shows an excellent CPU balance, with almost identical times for each file, giving a total of 2083.36s for encoding 2 files.
If I make a quick computation based on the speed of a single encode:
Pass 1 -> 2.89 fps => 442.91s
Pass 2 -> 1.75 fps => 731.43s
Pass 3 -> 2.06 fps => 621.36s
=> Total of 1795.70s -> 3591.40s for 2 files
Encoding time: -42%

Windows 11
File 1
Pass 1: encoded 1280 frames in 437.46s (2.93 fps)
Pass 2: encoded 1280 frames in 1065.31s (1.20 fps)
Pass 3: encoded 1280 frames in 938.72s (1.36 fps)
=> Total of 2441.49s
File 2
Pass 1: encoded 1280 frames in 578.49s (2.21 fps)
Pass 2: encoded 1280 frames in 1028.20s (1.24 fps)
Pass 3: encoded 1280 frames in 897.66s (1.43 fps)
=> Total of 2504.35s

Result: CPU balance is good, but not as good as on Windows 10, and the total time is longer.
Winner: Windows 10
At least for now...

For the record, I have a 'tuned' install of Windows 11 with a lot of crap disabled by default. I also disabled a lot of things I'm not using, like the firewall, Defender and a lot of network services, as the PC is totally offline. Also for the record, I ran "Windows Update" on both of them when I installed the OS last weekend, before taking them totally offline, so they should normally be "up to date".
@Z2697 I've checked on my Windows 11: Hyper-V is totally disabled in the Programs and Features dialog. The same goes for my Windows 10 after checking.
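The quick computation for the sequential Windows 10 case can be sketched in Python as follows. It assumes, as the post does, that a 1280-frame encode would run at the same fps as the 128-frame one; `estimated_seconds` is a made-up helper name.

```python
def estimated_seconds(frames, fps_per_pass):
    """Estimate per-pass and total encode time from measured fps."""
    per_pass = [frames / fps for fps in fps_per_pass]
    return per_pass, sum(per_pass)

# fps of the single 64-thread Windows 10 encode, as used in the post.
per_pass, one_file = estimated_seconds(1280, [2.89, 1.75, 2.06])
two_sequential = 2 * one_file
two_parallel = 2083.36  # slower of the two parallel totals measured above
change = (two_parallel - two_sequential) / two_sequential
print([round(t, 2) for t in per_pass], round(two_sequential, 2), f"{change:+.0%}")
```

Since fps is only logged to two decimals, the estimate is approximate, and summing unrounded values can differ from the hand-computed total by about a hundredth of a second.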
__________________
My github. Last edited by jpsdr; 10th June 2025 at 21:58. |
#9 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
This morning I thought of a test I'll run this evening when I'm back home.
I've tested with an LLVM build; I'll test with a GCC build. It shouldn't change things, in theory, but at this point...
__________________
My github. |
#10 | Link |
Registered User
Join Date: Aug 2024
Posts: 576
|
Some security settings will enable the hypervisor regardless of the checkboxes in the features dialog.
In fact, I can't find a sane way to disable the hypervisor in Windows 11 24H2 (of course, disabling SVM in the BIOS does the trick). You can run msinfo32 to check whether the hypervisor is running (it will say a hypervisor has been detected in the bottom row). Although virtualization should be pretty efficient, there may still be some edge cases.
Last edited by Z2697; 11th June 2025 at 16:24. |
#11 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
So...
I've deactivated SVM in the BIOS and followed all the guides to disable Hyper-V in Windows. Result: encode speed is a little faster, but Pass 2 & 3 are still a lot slower on Windows 11 than on Windows 10... Also, there is no speed difference between the LLVM and GCC builds.
Also, if you know how to permanently disable Defender in Windows 11, I'll take it!!! It seems I've been able to do it under Windows 10, but Windows 11 is...
__________________
My github. |
#12 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
For now, the only clue I see is this result (Zen4s without --asm AVX512):
Windows 11
Pass 1: encoded 128 frames in 54.68s (2.34 fps) [AVX512 +26.5%]
Pass 2: encoded 128 frames in 111.77s (1.15 fps) [AVX512 +0.0%]
Pass 3: encoded 128 frames in 94.69s (1.35 fps) [AVX512 +1.5%]
I don't know how it could be possible, but "Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth". So, however improbable (but not impossible) it may be, the only conclusion I have for now is that something in the code prevents the AVX512 path from being used specifically under Windows 11 and not Windows 10, linked to one of the settings I use in Pass 2 and Pass 3. I don't know if people from MulticoreWare read this forum and could possibly provide their insights.
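One cheap check is to compare what each pass's log file says about the detected instruction sets on both systems. The Python sketch below assumes the log contains a line of the form "using cpu capabilities: ..." as x265 normally prints at startup; the sample line and the helper name `cpu_capabilities` are illustrative, not taken from the actual logs.

```python
def cpu_capabilities(log_text):
    """Return the capability tokens from an x265 log, or [] if absent."""
    marker = "using cpu capabilities:"
    for line in log_text.splitlines():
        if marker in line:
            return line.split(marker, 1)[1].split()
    return []

# Hypothetical log excerpt; the real line may list different tokens.
sample = (
    "x265 [info]: using cpu capabilities: "
    "MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512\n"
)
caps = cpu_capabilities(sample)
print("AVX512 reported:", "AVX512" in caps)
```

This only shows what the encoder reported at startup. It cannot prove which code paths actually ran during a given pass, so identical capability lines on both systems would not rule out the hypothesis above.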
__________________
My github. Last edited by jpsdr; 11th June 2025 at 19:27. |
#13 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
I have a little time, so here are the results with Hyper-V deactivated (the percentages are relative to the first test):
Windows 10
Pass 1: encoded 128 frames in 43.33s (2.95 fps) => +4.0%
Pass 2: encoded 128 frames in 70.98s (1.80 fps) => +3.9%
Pass 3: encoded 128 frames in 60.47s (2.12 fps) => +4.1%
The speed increase is very stable; it's small, but always good to take. The value is just high enough not to be considered "noise".
Windows 11
Pass 1: encoded 128 frames in 42.84s (2.99 fps) => +0.7%
Pass 2: encoded 128 frames in 110.45s (1.16 fps) => +0.6%
Pass 3: encoded 128 frames in 92.55s (1.38 fps) => +0.5%
The speed increase is also very stable, but... so small that it could be "noise".
__________________
My github. Last edited by jpsdr; 12th June 2025 at 13:34. |
#14 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
@Z2697
I haven't tried these yet, but they look interesting:
https://winaerotweaker.com/
https://github.com/ionuttbara/windows-defender-remover
I also found this (still not tested): https://github.com/TairikuOokami/Win...%20Disable.bat
__________________
My github. Last edited by jpsdr; 12th June 2025 at 14:28. |
#15 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
The GCC mcf version is supposed to have a threading model optimised for Windows 10, so I've tested a GCC mcf build vs the LLVM build.
AVX512, 64 threads.

Windows 10 - LLVM
Pass 1: encoded 128 frames in 43.30s (2.96 fps)
Pass 2: encoded 128 frames in 70.82s (1.81 fps)
Pass 3: encoded 128 frames in 60.53s (2.11 fps)
Windows 10 - GCC mcf
Pass 1: encoded 128 frames in 44.20s (2.90 fps) => -2.0%
Pass 2: encoded 128 frames in 72.88s (1.76 fps) => -2.8%
Pass 3: encoded 128 frames in 61.88s (2.07 fps) => -2.2%
LLVM wins.

Windows 11 - LLVM
Pass 1: encoded 128 frames in 42.80s (2.99 fps)
Pass 2: encoded 128 frames in 111.23s (1.15 fps)
Pass 3: encoded 128 frames in 93.47s (1.37 fps)
Windows 11 - GCC mcf
Pass 1: encoded 128 frames in 43.97s (2.91 fps) => -2.7%
Pass 2: encoded 128 frames in 112.88s (1.13 fps) => -1.5%
Pass 3: encoded 128 frames in 95.16s (1.35 fps) => -1.8%
LLVM wins.

Around 2% is not a big deal, but as I said, everything is good to take.
__________________
My github. |
#16 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,822
|
You'll want to try znver2 instead of znver4 and enable AVX512 separately if that's possible. Znver3 and znver4 are both broken in LLVM and actually produce slower binaries than znver2.
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
#19 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,450
|
No, I can't try Linux.
I'll try znver2 with the AVX512 compile options enabled when I have time, but even if it's broken, LLVM is still faster than GCC.
__________________
My github. Last edited by jpsdr; Yesterday at 20:26. |
#20 | Link |
Registered User
Join Date: May 2009
Posts: 347
|
I am sure many would like to see you run the advanced benchmark sagittare created..
https://forum.doom9.org/showthread.php?t=185855 |