#3622 | Link |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
New build.
Avisynth r4507 https://github.com/pinterf/AviSynthP...3.7.6pre-r4507
Full change log: https://avisynthplus.readthedocs.io/...gelist376.html
Code:
20260213 3.7.5.r4507 (pre 3.7.6)
--------------------------------
Fix Layer "add" 8 bit, regression in r4504
20260212 3.7.5.r4504 (pre 3.7.6)
--------------------------------
* Fix: inaccurate ColorBarsHD 10+ bit values. Now they are derived from the 32-bit float
RGB definitions instead of upscaling a 8 bit precalculated YUV value.
Add Ramp section the lead-in-lead-out.
* Fix: GreyScale + SSE2 + RGB32 + matrix="RGB" overflow.
Rare usage; "RGB" matrix (Identity) uses a 1.0 coefficient which exceeds the signed 16-bit
SIMD limit of 32767 at 15-bit precision. Added bounds checking to fallback to C-code for any
coefficients >= 1.0 or < −1.0.
* Fix: YUV->RGB limited range matrix accuracy for 10-16 bits.
* Use a different rounding in matrix coefficient's integer approximation.
* "ConvertToPlanarRGB": ``bits`` parameter: on-the-fly bit-depth conversions to YUV->RGB conversion.
- Full range target: 8-16 bits internal calculation is in 32-bit float.
- Limited range target: a quicker, bit accuracy optimized integer calculation path.
* Not Fixed: Speed degradation when in-constructor GetFrame(0) (e.g. frame-property getter)
is used. Disable internal Cache object creation. Does not work in complex scripts, preparation
is 5-10 min instead of <1 sec. Investigation continues (Issue #476: https://github.com/AviSynth/AviSynthPlus/issues/476)
* Avoid MTGuard and CacheGuard creation if filter returns one of its clip parameters unaltered.
* Add some avx2 stuff to Layer and Invert
* Optimization: Overlay "Blend": aarch64 NEON optimization
20260203 3.7.5.r4483 (pre 3.7.6)
--------------------------------
* rst documentation update: RGBAdjust https://avisynthplus.readthedocs.io/en/latest/avisynthdoc/corefilters/adjust.html
* rst documentation update: ColorYUV https://avisynthplus.readthedocs.io/en/latest/avisynthdoc/corefilters/coloryuv.html
* optimization: add AVX2 TurnLeft/TurnRight/Turn180 (R/L: 1,5-3x speed).
* optimization: ConvertBits AVX2 integer->float
* optimization: ConvertToPlanarRGB(A): YUV->RGB add AVX2 (2-3x speed)
* optimization: ConvertToPlanarRGB(A): YUV->RGB 16 bit: a quicker way (1,5x)
* Fix: C version of 32-bit ConvertToPlanarRGB YUV->RGB to not clamp output RGB values.
* ConvertToPlanarRGB(A): add bits parameter to alter target bit-depth.
* ConvertToPlanarRGB(A): from YUV->RGB full range output: optimized in-process when bits=32, other cases call ConvertBits internally.
* Fix: Packed RGB conversions altering the bit-depth (e.g. rgb32->ConvertToRGB64()) always worked in full range.
* Add more AVX512 resampler code. (WIP)
* Add more AVX512_BASE code paths (Resamplers)
* Build: add _avx512b.cpp/hpp pattern in CMake to detect source to compile with base (F,CD,BW,DQ,VL) flags.
However AVX512_BASE itself is set only when AVX512_FAST found.
For pre-Ice Lake (older AVX512) systems you can enable it with SetMaxCPU("avx512base+") and get the optimized AVX512_BASE functions.
* Build: add new architecture z/Architecture
Last edited by pinterf; 14th February 2026 at 00:42. Reason: r4507 with hotfix |
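A minimal sketch of the GreyScale overflow described in the r4504 notes, assuming a matrix coefficient is scaled by 2^15 and stored in a signed 16-bit SIMD lane. The helper names (`to_fixed`, `fits_int16`, `choose_path`) are made up for illustration and are not taken from the AviSynth+ source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scale a floating-point matrix coefficient to fixed point with
// `frac_bits` fractional bits (15 in the changelog entry).
inline int32_t to_fixed(double coeff, int frac_bits) {
    return static_cast<int32_t>(std::lround(coeff * (1 << frac_bits)));
}

// True when the scaled coefficient fits a signed 16-bit lane.
inline bool fits_int16(double coeff, int frac_bits) {
    const int32_t v = to_fixed(coeff, frac_bits);
    return v >= INT16_MIN && v <= INT16_MAX;
}

// Hypothetical dispatcher: fall back to plain C as soon as one
// coefficient of the 3x3 matrix cannot be stored in 16 bits.
enum class Path { Simd16, PlainC };

inline Path choose_path(const double (&matrix)[9], int frac_bits = 15) {
    for (double c : matrix) {
        if (!fits_int16(c, frac_bits)) return Path::PlainC;
    }
    return Path::Simd16;
}
```

With an identity matrix the check picks the C path, because 1.0 * 32768 = 32768 is one above the int16 maximum of 32767, while -1.0 maps to -32768 and still fits. That is consistent with the ">= 1.0 or < -1.0" bounds quoted in the note.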
#3623 | Link | |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
Quote:
https://avisynthplus.readthedocs.io/...n_objects.html
EDIT: or in function parameter https://avisynthplus.readthedocs.io/...nalfilter.html
and in Avisynth source (runtime functions accept function objects) https://github.com/AviSynth/AviSynth...tional.cpp#L61
Last edited by pinterf; 12th February 2026 at 16:03.
#3624 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
https://github.com/pinterf/AviSynthP...dc7e3c00b321c8
I don't know why these minor DEBUG fixes were added. They break the build:
Code:
cache.cpp:745:146: error: macro "_RPT3" passed 6 arguments, but takes just 5
745 | _RPT3(0, "CacheGuard::SetCacheHints called. cache=%p hint=%d (%s) frame_range=%d\n", (void*)this, cachehints, hintname.c_str(), frame_range); // P.F.
| ^
In file included from cache.h:38,
from cache.cpp:35:
avisynth.h:157: note: macro "_RPT3" defined here
157 | #define _RPT3(a,b,c,d,e) ((void)0)
|
cache.cpp: In member function 'virtual int CacheGuard::SetCacheHints(int, int)':
cache.cpp:745:7: error: '_RPT3' was not declared in this scope
745 | _RPT3(0, "CacheGuard::SetCacheHints called. cache=%p hint=%d (%s) frame_range=%d\n", (void*)this, cachehints, hintname.c_str(), frame_range); // P.F.
| ^~~~~
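The error above is an arity mismatch: the call passes a report type, a format string and four values (six arguments), while the five-parameter `_RPT3` only has room for three values. A small sketch of the same situation follows, using stand-in macros whose names (`SKETCH_RPT3`, `SKETCH_RPT4`) are invented here. The MSVC CRT offers `_RPT4` for four values; whether the fallback definitions in avisynth.h include a six-parameter variant is an assumption to check against the header. The pointer argument is replaced by an integer id so the output is predictable.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats the report into a string so the result can be inspected.
template <typename... Args>
std::string format_report(const char* fmt, Args... args) {
    char buf[256];
    std::snprintf(buf, sizeof(buf), fmt, args...);
    return std::string(buf);
}

// Hypothetical stand-ins: the digit is the number of values after the
// format string, so RPT3 has 5 parameters and RPT4 has 6.
#define SKETCH_RPT3(type, fmt, a1, a2, a3)     format_report(fmt, a1, a2, a3)
#define SKETCH_RPT4(type, fmt, a1, a2, a3, a4) format_report(fmt, a1, a2, a3, a4)

inline std::string report_cache_hints(int cache_id, int cachehints,
                                      const std::string& hintname,
                                      int frame_range) {
    // Four values follow the format string, so the 4-value macro is needed.
    // Using SKETCH_RPT3 here would fail to compile in the same way as the
    // log above ("passed 6 arguments, but takes just 5").
    return SKETCH_RPT4(0, "cache=%d hint=%d (%s) frame_range=%d",
                       cache_id, cachehints, hintname.c_str(), frame_range);
}
```

Because the non-debug macro expands to nothing, the mismatch only shows up on compilers that take the fallback definition, which would explain why it went unnoticed in a regular MSVC build.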
#3628 | Link | ||
|
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,823
|
Quote:
Code:
unresolved external symbol "public: class PFunction __cdecl AVSValue::AsFunction(void)const " (?AsFunction@AVSValue@@QEBA?AVPFunction@@XZ)
Is it something to do with this?
Quote:
Code:
PFunction AsFunction() const; // internal use only
Last edited by wonkey_monkey; 13th February 2026 at 19:59.
#3629 | Link | |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
Quote:
Code:
if (real_name == nullptr) {
  // if name is not given, evaluate expression to get the function
  eval_result = func->Evaluate(env);
  if (!eval_result.IsFunction()) {
    env->ThrowError(
      "Script error: '%s' cannot be called. Give me a function!",
      GetAVSTypeName(eval_result));
  }
  //auto& func = eval_result.AsFunction(); // c++ strict conformance: cannot Convert PFunction to PFunction&
  const PFunction& func = eval_result.AsFunction();
  real_name = func->GetLegacyName();
  real_func = func->GetDefinition();
}
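The commented-out line in the snippet above touches a general C++ rule: a value returned by a function is a temporary, and a temporary cannot bind to a non-const lvalue reference, although it can bind to a `const` reference (which also extends its lifetime). MSVC accepts the non-const form as an extension unless conformance mode is enabled. A minimal sketch follows, with `Handle` and `Value` as invented stand-ins rather than the real `PFunction` and `AVSValue`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a handle type that is returned by value.
struct Handle {
    std::string name;
};

// Hypothetical stand-in for the object handing out the handle.
struct Value {
    Handle AsHandle() const { return Handle{"demo"}; }  // yields a temporary
};

// A non-const lvalue reference cannot be initialised from a temporary...
static_assert(!std::is_constructible<Handle&, Handle>::value,
              "Handle& must not bind to a temporary");
// ...but a const reference can.
static_assert(std::is_constructible<const Handle&, Handle>::value,
              "const Handle& may bind to a temporary");

inline std::string name_via_const_ref(const Value& v) {
    // auto& h = v.AsHandle();       // ill-formed in standard C++
    const Handle& h = v.AsHandle();  // OK: the temporary lives as long as h
    return h.name;
}
```

Taking the result by value (`auto h = v.AsHandle();`) would work as well; the `const&` form just avoids an extra copy on compilers that do not elide it.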
#3630 | Link |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
Meanwhile, a hotfix for the 8-bit Layer "add" mask problems.
New build, sorry for that.
Avisynth r4507 https://github.com/pinterf/AviSynthP...3.7.6pre-r4507
#3631 | Link | |
|
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,384
|
Quote:
x86-xp build
Tested and working on Windows XP Professional x86.
x64-win7-19.44.35221-17.14 build
Tested and working on Windows Server 2008 R2, but there seems to be a problem with the assembly optimizations somewhere in x64.
Code:
SetMaxCPU("none")
video=LWLibavVideoSource("M2991374.mxf")
ch1=LWLibavAudioSource("M2991374.mxf", stream_index=1, fill_agaps=1)
ch2=LWLibavAudioSource("M2991374.mxf", stream_index=2, fill_agaps=1)
audio=MergeChannels(ch1, ch2)
AudioDub(video, audio)
propClearAll()
ConvertBits(16)
Limiter(min_luma=4096, max_luma=60160, min_chroma=4096, max_chroma=60160)
SinPowerResize(1024, 576)
Info()
However, if I remove the
Code:
SetMaxCPU("none")
line, the result is clearly wrong. Even a simple SetMaxCPU("SSE2") produces the same wrong result, which means that only the C++ code is "fine" and there's a problem somewhere in the hand-written SIMD intrinsics.
Digging a bit further, this seems to be related to 4:2:2 10bit planar and can be reproduced with a simple:
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(10)
Info()
In other words:
8bit planar 4:2:2 with assembly optimizations is fine
Code:
ColorBars(848, 480, pixel_type="YV16")
Info()
32bit float 4:2:2 with assembly optimizations is fine
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(32)
Info()
10bit, 12bit, 14bit, 16bit planar 4:2:2 with assembly optimizations from SSE2 onwards are not
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(10)
Info()
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(12)
Info()
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(14)
Info()
Code:
ColorBars(848, 480, pixel_type="YV16")
ConvertBits(16)
Info()
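For readers wondering where the Limiter values in the first script of this post come from: they look like the usual 8-bit limited-range bounds shifted up to 16 bits. A small sketch of that arithmetic follows, assuming the conventional 16..235 luma and 16..240 chroma bounds and a plain left shift; the struct and function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Limited-range bounds at a given integer bit depth.
struct LimitedRange {
    uint32_t min_luma, max_luma, min_chroma, max_chroma;
};

// Scale the 8-bit bounds (16..235 luma, 16..240 chroma) by a left shift.
// Valid for integer depths of 8 to 16 bits; float clips are not covered.
inline LimitedRange limited_range(int bits) {
    const int shift = bits - 8;
    return {16u << shift, 235u << shift, 16u << shift, 240u << shift};
}
```

At 16 bits this gives 4096 and 60160 for luma, matching the script. The conventional chroma maximum would be 61440, so the script's max_chroma=60160 is a tighter limit chosen by the poster.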
Last edited by FranceBB; 15th February 2026 at 12:10. |
#3633 | Link | ||
|
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,384
|
Quote:
Jokes aside, I triple-checked and I can confirm that it only happens on x64 in the Windows 7 build. x86-xp isn't affected.
Quote:
Tomorrow morning I'll try the x64-xp build.
EDIT: I did, final results:
Avisynth_3.7.6_20260213_tst_r4507
- x64-xp
- x64-win7-19.44.35221-17.14
8bit planar and 32bit float 4:2:2 ok. Problem with 10bit, 12bit, 14bit, 16bit planar 4:2:2 when assembly optimizations are used (SSE2 onwards). SetMaxCPU("none") forces C++, which is error free.
- x86-xp
All ok (8bit, 10bit, 12bit, 14bit, 16bit planar and 32bit float) 4:2:2. Everything is fine both with and without assembly optimizations (tested up to SSE4.2, which is the maximum XP supports).
Last edited by FranceBB; 16th February 2026 at 09:41.
#3634 | Link |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
Thanks FranceBB. I thought it would be super-easy, but I was not able to reproduce it.
EDIT: It's the display, which uses ConvertToRGB32() after that.
EDIT: fixed; G and B were exchanged in the bit-depth-changing YUV->RGB full-scale code.
Last edited by pinterf; 16th February 2026 at 15:28.
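To illustrate why an exchanged G/B store is easy to miss, here is a small sketch that writes pixels into planar RGB twice, once correctly and once with G and B swapped, and compares the planes. It assumes planar RGB is held as three planes in G, B, R order and uses a simple full-range 8-bit to N-bit scaling; all names are hypothetical and none of this is the actual AviSynth+ SIMD code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Pixel8 { uint8_t r, g, b; };

// Planar RGB held as three separate planes (assumed order: G, B, R).
struct PlanarRGB { std::vector<uint16_t> g, b, r; };

// Full-range scaling of an 8-bit value to `bits` bits, rounded.
inline uint16_t scale_full(uint8_t v, int bits) {
    const uint32_t max_out = (1u << bits) - 1u;
    return static_cast<uint16_t>((v * max_out + 127u) / 255u);
}

// Store pixels into planes; swap_gb = true reproduces an exchanged store.
inline PlanarRGB store_planar(const std::vector<Pixel8>& px, int bits,
                              bool swap_gb) {
    PlanarRGB out;
    for (const Pixel8& p : px) {
        const uint16_t r = scale_full(p.r, bits);
        const uint16_t g = scale_full(p.g, bits);
        const uint16_t b = scale_full(p.b, bits);
        out.g.push_back(swap_gb ? b : g);
        out.b.push_back(swap_gb ? g : b);
        out.r.push_back(r);
    }
    return out;
}

inline bool same_planes(const PlanarRGB& x, const PlanarRGB& y) {
    return x.g == y.g && x.b == y.b && x.r == y.r;
}
```

Grey pixels (R = G = B) come out identical on both paths, so only coloured content such as ColorBars exposes the swap. Comparing an optimized path against a plain reference in this way is essentially what toggling SetMaxCPU("none") does at script level.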
#3635 | Link |
|
Registered User
Join Date: Jan 2014
Posts: 2,535
|
New build.
Avisynth r4523 https://github.com/pinterf/AviSynthP...3.7.6pre-r4523
For online documentation check https://avisynthplus.readthedocs.io/en/latest/
Current change log: https://avisynthplus.readthedocs.io/...gelist376.html
Code:
20260216 3.7.5.r4523 (pre 3.7.6)
--------------------------------
- Fix r4504 regression YUV->RGBP bit-depth changing full-scale SSE2/AVX2 bug (exchanged G,B storage)
- "Layer" YUV mul/add/subtract/lighten/darken: refactor chroma placement calculation, allowing SIMD optimization in the main frame processing
- "Layer" YUV/RGBP mul/add/subtract/lighten/darken: refactor function dispatchers, add AVX2 path (LLVM/clangcl recommended)
- Fix C-only vertical resampling code which added more rounding than needed (regression since pre-3.7.5 20250427)
- Invert: per-plane processing for planar formats, use C even in AVX2, proper chroma inversion
- New: AddAlphaPlane opacity parameter
- New: ResetMask opacity parameter
- rstdoc: document "opacity" in AddAlphaPlane and ResetMask
- rstdoc: detail Layer "use_chroma" and opacity
- Overlay "Blend": more speed, but keep accuracy, use float only where really needed
- Layer: use YV16 internally for YUY2 (lessen source bloat)
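The r4504 notes quoted in this thread say that ColorBarsHD values for 10+ bits are now derived from the float definitions instead of upscaling a precalculated 8-bit value. A rough sketch of why the two differ follows, assuming limited-range luma computed as 16 + 219 * y and upscaling done by a left shift. It only looks at a single normalised luma value, not at the real RGB-to-YUV matrix, and the function names are invented.

```cpp
#include <cassert>
#include <cmath>

// Limited-range luma code for a normalised luma y in [0, 1], computed
// directly at the target bit depth and rounded once.
inline int luma_code_direct(double y, int bits) {
    const int scale = 1 << (bits - 8);
    return static_cast<int>(std::lround((16.0 + 219.0 * y) * scale));
}

// Same value obtained by rounding at 8 bits first and then shifting up,
// which carries the 8-bit rounding error into the wider format.
inline int luma_code_from_8bit(double y, int bits) {
    return luma_code_direct(y, 8) << (bits - 8);
}
```

For y = 0.4 the direct 10-bit code is 414, while the shifted 8-bit value is 104 << 2 = 416. The two routes agree at the range ends (y = 0 and y = 1), so the discrepancy only appears on intermediate levels.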
#3636 | Link |
|
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,384
|
Thanks for the new build and for fixing the issue, Master Ferenc.
Avisynth_3.7.6_20260216_tst_r4523
x64-win7-19.44.35221-17.14
Windows Server 2008 R2 x64
Now 4:2:2 10bit/12bit/14bit/16bit work correctly with assembly optimizations up to SSE4.2 as well.
x86-xp
Windows XP Professional x86 works normally, as usual.
|