Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#3321 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,434
|
Big
Now I can build the .lib, and build the avs plugin with llvm, because of course building with llvm but using the .lib created with msvc didn't work. What surprised me, in a good way, is that the .lib provided by the DevIL SDK worked with both msvc and llvm.
__________________
My github. |
#3323 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,434
|
I think the "best" may be to be able to build everything with llvm (or with the same toolchain for another compiler), but this is good to know. And this probably explains why I can use the DevIL lib with llvm.
__________________
My github. Last edited by jpsdr; 13th April 2025 at 10:36. |
#3324 | Link |
...?
Join Date: Nov 2005
Location: Florida
Posts: 1,471
|
AviSynth+ 3.7.5 has been released. There were some things that required a hotfix (particularly on non-x86 platforms), so the list of changes from 3.7.4->3.7.5 is pretty small.
|
#3327 | Link |
...?
Join Date: Nov 2005
Location: Florida
Posts: 1,471
|
AviSynth 2.5/2.6 were UPX-packed, and simply out of inertia some of my early builds of AviSynth+ were as well. But I stopped doing that literally 10 years ago, and the UPX-packed DevIL.dll that was stored in the source tree was changed to an unpacked one in 2020 because it was a hassle to remember to unpack it before building the installer (and of course, the vendored binaries/libraries don't exist in the source tree now anyway).
|
#3329 | Link |
Registered User
Join Date: Jan 2014
Posts: 2,473
|
Thanks, qyot27.
Recently, I spent about ten times more time on a topic than I originally intended, but it's an interesting one.
Since qyot27 released an aarch64 Windows version of Avisynth+ (Arm64, specifically Arm64-v8 or v8.4a?), I thought I'd add Neon intrinsic support. (Think of Arm Neon as similar to SSE2 or AVX2 on Intel.) I read the Arm documentation, and the intrinsics are logical and easy to understand, so it wouldn't take long to convert existing Avisynth code to Neon. Since this seemed straightforward, I decided to do it using real Neon intrinsics instead of the "header" method, which translates existing Intel SIMD to more or less usable Arm SIMD code.
To test on Arm, I need a cross compiler, hardware, and an OS. I have an i7-11700 with plenty of processing power and RAM, so I tried installing Win11 under QEMU. Unfortunately, the how-to guides on the Internet were made before Win11 24H2, which closed the infamous TPM2.0 registry cheat method of installation. I went back to the previous Win11 23H2, which I successfully installed under QEMU. However, it was ridiculously slow - remember, this is real emulation - so slow that I couldn't even start a command prompt. By the time the start menu or search box appeared, a timeout occurred, and the menu collapsed back.
Fortunately, my sons bought a Raspberry Pi 5. One is using it for growing grapefruit seeds and monitoring soil humidity.
Soon, some configuration issues were discovered in the build process, which were fixed and clarified in the 3.7.5 build environment (see qyot27's changes, fixing the default Release optimization when e.g. Ninja was used).
First, I decided to make our resampler code in Avisynth - the C version - more compiler-friendly (and finish some optimization tasks that were postponed in the 3.7.4 release). Vectorization-friendly C code means that, even from well-written plain C, a good compiler can generate very fast SIMD instructions.
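As a rough illustration of what "vectorization-friendly" C code can look like, here is a minimal sketch of a horizontal resampling loop for 8-bit pixels. It is not the actual AviSynth+ resampler: the function name `resample_row_u8`, the 14-bit fixed-point coefficient scale and the parameter layout are assumptions made for this example. The points it tries to show are no aliasing between the buffers (`__restrict`), plain integer arithmetic, and simple counted loops with no early exits.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not AviSynth+ code): filters one row of 8-bit pixels.
// Each output pixel x is a weighted sum of `taps` consecutive source pixels
// starting at src[offsets[x]]. Coefficients are 14-bit fixed point
// (16384 == 1.0), an assumption chosen for this sketch.
// The caller must guarantee offsets[x] + taps stays inside the source row.
void resample_row_u8(const uint8_t* __restrict src,
                     uint8_t* __restrict dst,
                     const int16_t* __restrict coeffs,  // width * taps entries
                     const int* __restrict offsets,     // first source pixel per output pixel
                     int width, int taps)
{
    constexpr int kShift = 14;
    constexpr int kRound = 1 << (kShift - 1);

    for (int x = 0; x < width; ++x) {
        const uint8_t* s = src + offsets[x];
        const int16_t* c = coeffs + x * taps;

        // Simple counted reduction: no branches, no function calls inside.
        int sum = kRound;
        for (int t = 0; t < taps; ++t)
            sum += s[t] * c[t];

        // Scale back and clamp to the valid 8-bit range.
        sum >>= kShift;
        dst[x] = static_cast<uint8_t>(std::min(std::max(sum, 0), 255));
    }
}
```

Whether a given compiler actually turns such loops into SIMD code is best checked by looking at the generated assembly, for example in Compiler Explorer.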
This is what I've been doing in the past weeks: modify, test, benchmark. First on Windows, then on Raspberry Pi 5 Arm64.
Now I can say that MSVC is not capable of any optimization on integer values (e.g., 8-16 bit pixel values) from C code. I tried to help MSVC by writing code that would assist it, but in vain. After a week, I reached the point where I don't want to fight anymore; I cannot win the battle against the compiler's limitations and ignorance.
What works in MSVC's own C++ compiler:
- What is already written in direct Intel SIMD code.
- 32-bit float arrays. Maybe. Such operations would be recognized and compiled into SIMD code.
- Use MSVC when you need to support ancient OSs.
Conclusion: I'd ban Windows MSVC Microsoft compiler builds for today's processors in use cases where we process integer data and no handcrafted SIMD code is available.
What to use instead? MSVC + clang-cl or LLVM is perfectly fine. Also Intel ICX, which is LLVM-based as well. gcc on non-Windows is fine as well.
For Intel/AMD64 processor code, LLVM wins over GCC (as of 2025), judging by the code it generates (Godbolt Compiler Explorer is a great tool!). The GCC compiler is good, but it's not on LLVM's level. LLVM detects and utilizes many more tricks than any of the others, which surprised me.
And now, some benchmarks.
Horizontal resampling code, by compiler (Win64; SSE2 etc. means the maximum processor level the compiler can utilize), at 8, 10 and 16 bits.
ClangCl: what VS2022 supports out of the box
Intel: Intel C++ Compiler 2025 (LLVM 19, I think)
Figures are in FPS(bit depth)
Code:
 417(8)   463(10)   360(16)     from C, MSVC SSE2 3.7.4
 597(8)   631(10)   651(16)     from C, Intel SSE4.2 3.7.4
1183(8)  1193(10)   889(16)     from C, Intel AVX2 3.7.4
  91(8)    83(10)    56(16)     from C, MSVC SSE2 3.7.6
 477(8)   468(10)   292(16)     from C, ClangCl SSE2 3.7.6
 429(8)   468(10)  1327(16)(!)  from C, Intel SSE2 3.7.6
1734(8)  2545(10)  1580(16)     from C, Intel SSE4.2 3.7.6
2173(8)  2620(10)  1804(16)     from C, Intel AVX2 3.7.6
3800(8)                         from SIMD, MSVC AVX2
2870(8)                         from SIMD, Intel C++ 2025, SSSE3
4370(8)                         from SIMD, Intel C++ 2025, AVX2
- The C code can be written so that a good compiler can easily turn it into SIMD code.
- MSVC's own compiler: no comment.
- When the compiler can utilize a more modern processor instruction set (like when we build for the "native" arch), speed benefits a lot. |
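Since the figures above depend heavily on which instruction set the compiler was allowed to target, it can help to record that level next to each benchmark result. The following is only a sketch: `compiled_simd_level` is a hypothetical helper, and it relies on the usual predefined macros (`__AVX2__`, `__SSE4_2__`, `__SSE2__` on GCC/Clang, `_M_X64`/`_M_IX86_FP` on MSVC), which may not cover every compiler or flag combination.

```cpp
#include <string>

// Hypothetical helper: reports the widest SIMD level the compiler was
// allowed to assume for this translation unit (a compile-time property,
// not a runtime CPU check).
std::string compiled_simd_level()
{
#if defined(__AVX2__)
    return "AVX2";      // e.g. -mavx2, -march=native on a capable CPU, /arch:AVX2
#elif defined(__SSE4_2__)
    return "SSE4.2";    // GCC/Clang style macro, e.g. -msse4.2
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";      // baseline on x86-64
#elif defined(__ARM_NEON) || defined(_M_ARM64)
    return "NEON";      // Arm64 targets
#else
    return "scalar";    // nothing recognized by this sketch
#endif
}
```

Printing this string at the start of a benchmark run makes it easier to tell an SSE2 build from a "native" one when comparing numbers later.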
#3330 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,320
|
It is also good to test with different frame sizes (relative to a typical CPU cache size of about 2..20 MBytes?), multiplied by the number of threads, as we have frame-based multithreading in current AVS+. For example, with 3 different frame sizes:
1. Very small - buffer size much less than 2 MBytes.
2. Medium - buffer size of about 2..8 MBytes.
3. Big - buffer size significantly more than 2..8 MBytes.
Also, as the current resizer is dual-1D, it is good to test changing the frame size in both H and V (like 2x width and 2x height). A large number of tests may be required.
The most benefit from a good 'from-C' compiler is expected on poorly SIMD-optimized parts of the software, as we have fewer and fewer human resources for SIMD optimization, and the expected AI compilers may still help somehow. |
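To pick test frame sizes for the three classes above, the working set can be estimated roughly as frame bytes multiplied by the thread count. The sketch below assumes a planar 4:2:0 frame and uses hard 2 MiB and 8 MiB cutoffs as a simplification of the ranges in the post; `frame_bytes_420`, `classify_working_set` and the enum are hypothetical names, not part of AviSynth+.

```cpp
#include <cstddef>

enum class WorkingSet { Small, Medium, Big };

// Bytes of one planar 4:2:0 frame (assumption: no row padding/alignment).
std::size_t frame_bytes_420(int width, int height, int bytes_per_sample)
{
    const std::size_t luma =
        static_cast<std::size_t>(width) * height * bytes_per_sample;
    const std::size_t chroma =
        2 * static_cast<std::size_t>(width / 2) * (height / 2) * bytes_per_sample;
    return luma + chroma;
}

// Classifies the total buffer size touched by `threads` frame-based workers.
// The 2 MiB / 8 MiB thresholds are illustrative cutoffs only.
WorkingSet classify_working_set(int width, int height,
                                int bytes_per_sample, int threads)
{
    constexpr std::size_t kMiB = 1024 * 1024;
    const std::size_t total =
        frame_bytes_420(width, height, bytes_per_sample) * threads;

    if (total < 2 * kMiB) return WorkingSet::Small;
    if (total <= 8 * kMiB) return WorkingSet::Medium;
    return WorkingSet::Big;
}
```

For example, a single 8-bit 1920x1080 frame is about 3 MB and would fall into the medium class under these assumptions, while the same clip with 8 threads would already count as big.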
#3331 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,655
|
What (if anything) is considered the correct way to ask a user for a colourspace as a filter parameter? Expect a string and convert with ColorSpaceNameToPixelType (or its C++ equivalent)?
Could/should pixel type numbers be exposed as script function constants?
PS Has something happened to AVSMap recently? I was trying to compile my program after installing 3.7.5 and it was giving me unresolved external errors (I think; it might have been something different, but it was definitely linking-related), but a rebuild seemed to fix it. Last edited by wonkey_monkey; Yesterday at 15:12. |
#3332 | Link |
...?
Join Date: Nov 2005
Location: Florida
Posts: 1,471
|
Getting the macOS builds ready for 3.7.4 stalled so long (as a side effect of my access to an Intel version of macOS not working too well this time around, and waffling about where to draw the cutoff for the minimum OS version) that 3.7.5 ended up getting released first. It's probably just as well, since that ARM64 resizer issue that got fixed in 3.7.5 probably also affected macOS.
But I just fixed that issue by uploading the macOS build for 3.7.5.
Unlike previous installers for macOS, this time, we have a Universal Binary, so there aren't separate installers for Intel vs. Apple Silicon. The minimum OS version is also set to macOS 13 Ventura, as that is the oldest version still supported by Apple. While it does limit the range of Intel Macs that it can be installed on, there is always OpenCore-Legacy-Patcher or you could still build the binaries yourself (and given the differences introduced in 10.15 Catalina, if you're using anything older than that, you'd probably be better served by building it yourself). Unless something drastic has regressed lately, it's still possible to build AviSynth+ for Tiger (10.4) and Leopard (10.5) on PowerPC if you have the patience to let Tigerbrew compile much newer versions of the build tools (CMake, GCC, etc.), so...
Also, between having to disable Gatekeeper to run stuff that isn't code-signed on Apple Silicon (unless, like I mentioned above, you built it yourself) and the shift toward using RPATH shenanigans, the optimal long-term solution for native macOS support for AviSynth+ as well as client applications that use it will be for a package to get added to MacPorts and Homebrew, as they can deal with that weirdness without the user necessarily having to worry about it. |