#6661 | Link |
Registered User
Join Date: Jun 2016
Posts: 116
|
@excellentswordfight
I've come to the same conclusion as well. merange should be based on the CTU size. If you leave CTU at 64 then merange should also be reduced. It'll be searching outside of its original block. Not to say that is necessarily bad, though. I've seen certain "high quality" encodes that use meranges larger than the CTU. I'm not 100% sold that it provides tangible differences, though. I personally drop merange to 26 when using CTU 32. Since I never plan on using hex search I should probably change merange to 58 when using CTU 64... Best way to find out is to try it for yourself. |
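The numbers in the post above (26 for CTU 32, 58 for CTU 64 without hex search) line up with how the x265 documentation explains the default merange of 57: the CTU size (64), minus the luma interpolation half-length (4), minus the maximum subpel distance (2), minus one extra pixel in case hex search is used. A minimal Python sketch of that arithmetic follows. The function and constant names are made up for illustration and are not part of x265.

```python
# Illustrative only: the constants follow the x265 documentation's
# explanation of the default --merange of 57 for a 64x64 CTU.
LUMA_INTERP_HALF_LENGTH = 4  # half-length of the luma interpolation filter
MAX_SUBPEL_DISTANCE = 2      # extra reach of subpel refinement
HEX_SEARCH_MARGIN = 1        # one extra pixel reserved for hex search


def max_merange(ctu_size: int, uses_hex_search: bool = True) -> int:
    """Largest merange that avoids needing another CTU row of reference latency."""
    margin = LUMA_INTERP_HALF_LENGTH + MAX_SUBPEL_DISTANCE
    if uses_hex_search:
        margin += HEX_SEARCH_MARGIN
    return ctu_size - margin


if __name__ == "__main__":
    for ctu in (64, 32, 16):
        with_hex = max_merange(ctu)
        without_hex = max_merange(ctu, uses_hex_search=False)
        print(f"CTU {ctu}: merange {with_hex} (hex), {without_hex} (no hex)")
```

Whether a larger merange is worth the extra latency is still something to test per source, as the post says. This only shows where the 57/58 and 25/26 figures come from.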
#6662 | Link | |
Registered User
Join Date: Jun 2016
Posts: 116
|
Quote:
Intel's top-of-the-line 18-core CPU only has 2 AVX512 units on it. Imagine a Zen 2 CPU with up to 8 fused AVX512 units!!! I know there is an extra cycle or two when doing a fused operation, but up to 8 AVX512 operations will still be nice! |
|
#6663 | Link | ||
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 6,206
|
Quote:
Quote:
Questions are:
a. Is '--aq-strength' an option for both '--aq-mode' and '--hevc-aq' or just '--aq-mode'?
b. Is '--aq-adaption-range' an option for both '--aq-mode' and '--hevc-aq' or just '--hevc-aq'?
Cu Selur |
||
#6664 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,044
|
Quote:
256-bit AVX on current Ryzen isn't that much faster than 128-bit SSE due to that. In any case, there have been zero hints about AVX512 support.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders |
|
#6665 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,536
|
Quote:
Generally the value of AVX? instructions has improved over time, as microarchitecture improvements help with thermal throttling and other bottlenecks. |
|
#6666 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,536
|
Quote:
c. --aq-strength is a parameter for --aq-mode, and --aq-adaption-range is a parameter for --hevc-aq, and neither is used when the other AQ type is used. |
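A rough Python sketch of the rule in the answer above, for anyone building command lines from a script. The helper `aq_args` is hypothetical, the flag names are copied as written in this thread and not checked against a particular x265 build, and treating `--hevc-aq` as a plain on/off switch is an assumption of the sketch.

```python
from typing import List, Optional


def aq_args(use_hevc_aq: bool,
            aq_mode: int = 1,  # arbitrary example value
            aq_strength: Optional[float] = None,
            aq_adaption_range: Optional[float] = None) -> List[str]:
    """Return AQ-related arguments, dropping the one the chosen AQ type ignores.

    Hypothetical helper: flag names come from the thread, and --hevc-aq
    being a bare switch is an assumption.
    """
    if use_hevc_aq:
        args = ["--hevc-aq"]
        # Per the answer above, only the adaption range applies to --hevc-aq.
        if aq_adaption_range is not None:
            args += ["--aq-adaption-range", str(aq_adaption_range)]
    else:
        args = ["--aq-mode", str(aq_mode)]
        # Per the answer above, only the strength applies to --aq-mode.
        if aq_strength is not None:
            args += ["--aq-strength", str(aq_strength)]
    return args


if __name__ == "__main__":
    print(aq_args(True, aq_strength=1.0, aq_adaption_range=2.0))
    print(aq_args(False, aq_mode=2, aq_strength=0.8, aq_adaption_range=2.0))
```

Passing both parameters is harmless here: the one that does not belong to the selected AQ type is simply left off the command line.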
|
#6667 | Link | |
Pig on the wing
Join Date: Mar 2002
Location: Hollola, Finland
Posts: 4,990
|
Quote:
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
|
#6668 | Link | |
Registered User
Join Date: Dec 2002
Location: Region 0
Posts: 1,379
|
Quote:
The whole FPU got supersized in Zen 2 (vs. Zen 1):
2x wider datapath (256-bit, up from 128-bit)
2x wider EUs (256-bit FMAs, up from 128-bit FMAs)
2x wider LSU (2x256-bit L/S, up from 128-bit)
from: https://en.wikichip.org/wiki/amd/mic...tectures/zen_2 |
|
#6669 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Another important advantage of the Zen 2 architecture, as implemented in the Ryzen 3000 series, will be the CPU clock during heavy execution of AVX2 instructions, such as when running x265.
If AMD has been interpreted correctly, we will see no performance penalty from lower clocks during x265 AVX2 code execution. So far, Intel CPUs of every architecture run at lower clocks when executing x265's AVX2 code. I think Ryzen 3000 (and Threadripper, EPYC) based on the Zen 2 architecture has everything it needs to be a lot faster than any Intel CPU ever released with the same number of cores. And since AMD CPUs nowadays offer more cores than Intel's, x265 could be a killer app for AMD, like Cinebench.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
#6670 | Link | |
Registered User
Join Date: Dec 2014
Posts: 8
|
Quote:
|
|
#6671 | Link |
Registered User
Join Date: Mar 2004
Posts: 1,011
|
HEVC licensing info article: http://www.streamingmedia.com/Articl...te-129386.aspx
|
#6672 | Link | |
Registered User
Join Date: Aug 2016
Posts: 60
|
Quote:
|
|
#6673 | Link |
Registered User
Join Date: Feb 2007
Location: Sweden
Posts: 364
|
x265 v3.0_RC+13-ae085e5cd8a2 (32 & 64-bit 8/10/12bit Multilib Windows Binaries) (32bit : GCC 7.4.0 / 64bit : GCC 8.2.1)
Code:
https://bitbucket.org/multicoreware/x265/commits/branch/default
I checked with Pradeep (@MulticoreWare) about why the default branch hasn't been pushed to v3.0 'Stable', and this is the reply I got: "Our plan is to continue to use 3.0_RC on the default branch and have completed tags only on the stable branch. So we don't intend to merge back." |
#6674 | Link | ||
Registered User
Join Date: Nov 2009
Posts: 345
|
Quote:
On Skylake-X/WS and Cannon Lake, AVX-512 only ever increases performance. It will presumably also be the case on Ice Lake.
Quote:
This is only true on server CPUs. There is no separate frequency for AVX on client CPUs (7700K, 8700K, 9900K, etc.). The only way you may see "lower clocks" on non-server parts is if you are encoding faster and running into the 65/95 W power limit, and in that case AMD will be no different.
Last edited by Stephen R. Savage; 30th January 2019 at 06:29. |
||
#6675 | Link | |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
The 7nm process could probably help too, by keeping AVX2 clocks the same as for all other instruction sets.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
|
#6676 | Link | |
Registered User
Join Date: Jul 2018
Posts: 164
|
Quote:
AVX2 - 7.08 fps
AVX512 - 7.87 fps
Another 1080p encoding with the same preset slower + ctu 32:
AVX2 - 7.68 fps
AVX512 - 8.08 fps
Also:
AVX2 ~ 290W
AVX512 ~ 250W
I'm using adaptive offset for vcore. So my vcore is 1.24v for @4800 (non-avx) and when encoding with AVX2 my core speed is @4500 but vcore remains the same 1.24v. When encoding with AVX512 my core speed is @4500 and vcore is 1.13v. |
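To put the figures in the post above into relative terms, here is a small Python sketch that computes the percentage change from AVX2 to AVX512. The numbers are copied from the post. The helper name and the "first encode" label are only for illustration, since the settings of the first encode are not shown here.

```python
def percent_change(baseline: float, value: float) -> float:
    """Relative change of value versus baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# (AVX2, AVX512) pairs copied from the post above.
fps_results = {
    "first encode": (7.08, 7.87),
    "preset slower + ctu 32": (7.68, 8.08),
}
power_watts = (290.0, 250.0)  # approximate values as reported

if __name__ == "__main__":
    for label, (avx2, avx512) in fps_results.items():
        print(f"{label}: {percent_change(avx2, avx512):+.1f}% fps")
    print(f"power: {percent_change(*power_watts):+.1f}%")
```

That works out to roughly 11% and 5% more fps with AVX512, at roughly 14% less power, for this one overclocked system.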
|
#6677 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,044
|
Quote:
You wouldn't notice this problem with x264 or x265, because their AVX512 usage is pretty "light", but if you run some heavy AVX512 tasks on all cores and don't configure an appropriate offset, the chip just crashes. The energy density of the AVX512 units is just too high for running at full turbo clocks, never mind overclocked.
If x265 is the only AVX512 workload you ever run, and you want to risk it, sure, you can disable the offset and hope that it never happens. But I prefer to know that my system is stable no matter what the software does. But be careful, and do know that you cannot judge the requirement from one pretty lightweight workload.
The only way the offset is getting lower is when the cores get more efficient, which they really only do on a process shrink. So hopefully that'll significantly reduce the AVX512 offset, even if I don't expect it to go away quite yet.
This is easily testable by anyone with such a chip. For example, recent versions of the Intel LINPACK floating-point benchmark will put enough AVX-512 stress on the CPU to cause this.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Last edited by nevcairiel; 30th January 2019 at 10:25. |
|
#6678 | Link | |
Lost my old account :(
Join Date: Jul 2017
Posts: 170
|
Quote:
And his statement is true for Xeons. I did some tests on a Xeon Gold 6126 and even got lower performance for 2160p preset slow: clock speeds dropped almost 20%, while running AVX512 gained maybe 10%. OC'd X299 platforms are a niche (although maybe not here).
Last edited by excellentswordfight; 30th January 2019 at 16:47. |
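The Xeon result above is what simple arithmetic would predict. A minimal Python sketch, assuming encode throughput scales linearly with clock speed and that the "maybe 10%" gain is measured at equal clocks (both are simplifications, and the function name is hypothetical):

```python
def net_speed_factor(clock_factor: float, simd_gain: float) -> float:
    """Combined speed relative to the non-AVX512 baseline.

    Assumes throughput is proportional to clock speed (a simplification).
    """
    return clock_factor * simd_gain


clock_factor = 0.80  # "clock speeds dropped almost 20%"
simd_gain = 1.10     # "gained maybe 10%"

if __name__ == "__main__":
    factor = net_speed_factor(clock_factor, simd_gain)
    print(f"net speed: {factor:.2f}x of baseline")
```

Under those assumptions the net result is about 0.88x, that is, roughly 12% slower with AVX512 enabled. The instruction-level gain has to exceed the clock drop before it pays off.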
|
![]() |
Registered User
Join Date: Nov 2009
Posts: 345
|
Quote:
Quote:
Last edited by Stephen R. Savage; 30th January 2019 at 18:10. |
||
#6680 | Link | |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
Is this really the first time you've heard that most non-SIMD code can't utilize a modern CPU the way SIMD code can? The only way to reach the TDP limit of a modern CPU is with optimized SIMD code. Non-AVX code hits other limits before it reaches the power limit.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
|