Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. Domains: forum.doom9.org / forum.doom9.net / forum.doom9.se |
|
|
#41 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
Testing ffmpeg vulkan
Everyone probably has it and knows how to create it, but let's say I'll do it my way. https://github.com/KhronosGroup/Vulk...522ff966f95b1e https://github.com/KhronosGroup/glsl...7936536249d9be https://www.sendspace.com/file/qflerl |
#43 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
Latest additions:
https://github.com/libass/libass/com...b18663831a8d77 https://github.com/libffi/libffi/com...1f2b304802bd00 https://github.com/PCRE2Project/pcre...2f4d7357f3c431 https://code.videolan.org/videolan/d...b12dc8be6bee2f https://github.com/ultravideo/kvazaa...b311678419d659 https://ftp.gnu.org/gnu/libiconv/libiconv-1.18.tar.gz https://ftp.gnu.org/gnu/gettext/gettext-0.25.tar.gz https://sourceware.org/pub/valgrind/...3.25.1.tar.bz2 // modification glib https://gitlab.gnome.org/GNOME/libxm...ca9dcf7654ac5e https://gitlab.gnome.org/GNOME/pango...ed9ae0ea19168f https://gitlab.gnome.org/GNOME/glib/...3ceb472034a9b9 https://gitlab.gnome.org/GNOME/gdk-p...3b7c028ee052a4 https://gitlab.freedesktop.org/cairo...476e0e93b92e65 https://gitlab.freedesktop.org/freet...2204e5c91c5226 https://gitlab.freedesktop.org/pixma...c3e8ef6cc94d43 https://github.com/harfbuzz/harfbuzz...d995e57ffc1d24 https://github.com/libjxl/libjxl/com...971cdfa51cec12 https://github.com/webmproject/libvp...f0ad76560e3a0d https://github.com/fraunhoferhhi/vve...a33a7dbaf60b70 https://github.com/KhronosGroup/Open...083db3d651d55d https://github.com/KhronosGroup/Open...f05bf723ecb166 https://github.com/AcademySoftwareFo...2b975be0748b9c https://github.com/xiph/opus/commit/...120dda4e47c591 https://github.com/xiph/ogg/commit/5...83b20260c8f03f https://github.com/xiph/speex/commit...ff3585ccbb86c1 https://bitbucket.org/multicoreware/...ec9d5f21d14afe https://github.com/KhronosGroup/Vulk...6ea9ce2d2c336b https://github.com/KhronosGroup/glsl...6d66324f7cd383 https://github.com/dlbeer/quirc/comm...eb374b438ff8f2 https://github.com/HomeOfAviSynthPlu...aa197b9fb82fda https://github.com/opencv/opencv/com...baf028b66eaad7 https://github.com/v-novaltd/LCEVCde...4addc40d65bd9b https://github.com/webmproject/libwe...535baa09055d30 Testing opencv. This is software with limited features in ffmpeg and is generally of little use. Are opencl options enabled? 
ffmpeg_avx2.exe -v verbose -hwaccel opencl -init_hw_device opencl=gpu:0.0 -filter_hw_device gpu -i "input.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=dilate:3x3+0x0/rect|1,format=yuv420p" -frames:v 1000 output_x264.mkv
ffmpeg_avx2.exe -v verbose -hwaccel opencl -init_hw_device opencl=gpu:0.0 -filter_hw_device gpu -i "input.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=erode:3x3+0x0/rect|1,format=yuv420p" -frames:v 1000 output_x264.mkv
ffmpeg_avx2.exe -v verbose -hwaccel opencl -init_hw_device opencl=gpu:0.0 -filter_hw_device gpu -i "input.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=smooth:gaussian|3|0|0|0,format=yuv420p" -frames:v 1000 output_x264.mkv
ffmpeg_avx2.exe -v verbose -hwaccel opencl -init_hw_device opencl=gpu:0.0 -filter_hw_device gpu -i "input.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=dilate:5x5+2x2/cross|2,format=yuv420p" -frames:v 1000 output_x264.mkv
ffmpeg_avx2.exe -v verbose -hwaccel opencl -init_hw_device opencl=gpu:0.0 -filter_hw_device gpu -i "input.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=dilate:0x0+2x2/custom=diamond.shape|2,format=yuv420p" -frames:v 1000 output_x264.mkv
[Parsed_ocv_1 @ 00000220b480ee00] [FILE @ 000000dec17fd900] Cannot read file 'diamond.shape': No such file or directory
Problem with mingw 13.0.0 & avisynth. I use mingw 12.0.0
https://github.com/msys2/MINGW-packages/issues/24532
open source
https://www.sendspace.com/file/rhw0z4
https://www.sendspace.com/file/t2x02t
Last edited by Jamaika; 26th June 2025 at 16:01. |
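The "Cannot read file 'diamond.shape'" message only says that the file was not found in the working directory. As far as I understand the FFmpeg ocv filter documentation, a custom structuring element is a plain text file in which each printable character marks a bright pixel, and the rows and columns of the file replace the NxM size given on the command line. Below is a minimal sketch in C that builds such a file. The names diamond_shape and write_shape_file are my own hypothetical helpers, not part of FFmpeg or OpenCV, and the use of '*' and ' ' is an assumption based on the documentation example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Render a diamond of the given radius as text: '*' marks a bright pixel,
 * ' ' a dark one, and every row ends with '\n'.
 * Writes into out (capacity cap) and returns the text length,
 * or -1 if the radius is negative or the buffer is too small. */
int diamond_shape(int radius, char *out, size_t cap)
{
    if (radius < 0)
        return -1;

    int side = 2 * radius + 1;
    size_t need = (size_t)side * (size_t)(side + 1) + 1; /* rows * (cols + '\n') + NUL */
    size_t n = 0;

    if (need > cap)
        return -1;
    for (int y = 0; y < side; y++) {
        for (int x = 0; x < side; x++) {
            /* Manhattan distance from the centre decides membership */
            int inside = abs(x - radius) + abs(y - radius) <= radius;
            out[n++] = inside ? '*' : ' ';
        }
        out[n++] = '\n';
    }
    out[n] = '\0';
    return (int)n;
}

/* Save the text where ffmpeg will look for it; returns 0 on success.
 * Binary mode keeps Windows from turning '\n' into "\r\n". */
int write_shape_file(const char *path, const char *text)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    int ok = fputs(text, f) >= 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}
```

With radius 2 this gives a 5x5 diamond. Writing it with write_shape_file("diamond.shape", buf) in the directory where ffmpeg_avx2.exe is started should let the last command above find the file.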
#45 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
GCC 15.1.0-Rev7 & MINGW64 13.0.0-r72
https://github.com/KhronosGroup/Vulk...1921aa9244808b https://github.com/libass/libass/com...0df82f07d71a34 https://github.com/libjxl/libjxl/com...a293b72e32260c https://github.com/harfbuzz/harfbuzz...18c5c62039d655 https://gitlab.freedesktop.org/cairo...cf6295d1ff947d https://github.com/xiph/opus/commit/...2180c25385c569 https://github.com/AcademySoftwareFo...2396a9531afbab https://gitlab.gnome.org/GNOME/libxm...1a54bade9e5e72 https://gitlab.gnome.org/GNOME/glib/...e4f68f41630791 https://github.com/fraunhoferhhi/vve...5beaff090a5c58 https://www.sendspace.com/file/0nbr03 Last edited by Jamaika; 22nd July 2025 at 07:53. |
#47 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
Latest additions:
https://gitlab.freedesktop.org/freet...a6811db06bd48e https://gitlab.gnome.org/GNOME/pango...2a40cd4961d4a4 https://gitlab.gnome.org/GNOME/glib/...d804dbea3003da https://gitlab.gnome.org/GNOME/libxm...b15471667cf446 https://gitlab.gnome.org/GNOME/gdk-p...ca3010f57c7be0 https://github.com/harfbuzz/harfbuzz...074698d778e6c8 https://github.com/PCRE2Project/pcre...ac730d551a0fce https://github.com/libffi/libffi/com...562677214084fd https://github.com/AcademySoftwareFo...95aeda1c147bfa https://github.com/fraunhoferhhi/vve...bf57359dc8db19 https://github.com/libjxl/libjxl/com...78db935ea9d43f https://github.com/google/brotli/com...803547a66cdfc2 https://github.com/webmproject/libwe...0deaef3064c9dd https://github.com/KhronosGroup/glsl...3abf6c1103fc59 https://github.com/KhronosGroup/Vulk...b0d380c0378639 https://github.com/KhronosGroup/Vulk...6443c6c24a6962 https://github.com/KhronosGroup/Open...8aca94e938fd5b https://github.com/ggml-org/whisper....02be716b3e5861 https://github.com/mingw-w64/mingw-w...17d4fc55714c5d Code:
whisper.cpp: In function 'whisper_state* whisper_init_state(whisper_context*)':
whisper.cpp:3462:26: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'long long unsigned int' [-Wformat=]
3462 | WHISPER_LOG_INFO("%s: alignment heads masks size = %ld B\n", __func__, memory_size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~
| |
| long long unsigned int
whisper.cpp:121:75: note: in definition of macro 'WHISPER_LOG_INFO'
121 | #define WHISPER_LOG_INFO(...) whisper_log_internal(GGML_LOG_LEVEL_INFO , __VA_ARGS__)
| ^~~~~~~~~~~
whisper.cpp:3462:62: note: format string is defined here
3462 | WHISPER_LOG_INFO("%s: alignment heads masks size = %ld B\n", __func__, memory_size);
| ~~^
| |
| long int
| %lld
Code:
af_whisper.c: In function 'init':
af_whisper.c:153:69: warning: format '%ld' expects argument of type 'long int', but argument 6 has type 'int64_t' {aka 'long long int'} [-Wformat=]
153 | "Whisper filter initialized: model: %s lang: %s queue: %ld ms\n",
| ~~^
| |
| long int
| %lld
154 | wctx->model_path, wctx->language, wctx->queue / 1000);
| ~~~~~~~~~~~~~~~~~~
| |
| int64_t {aka long long int}
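Both warnings have the same cause: on 64-bit Windows with mingw, long is only 32 bits, so '%ld' does not match int64_t or a 64-bit size. A small sketch of the portable way to print such values, using PRId64 and PRIu64 from <inttypes.h>. The function names format_queue_ms and format_size_bytes are hypothetical and are not taken from whisper.cpp or FFmpeg; they only mirror the two log messages above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a queue length given in microseconds as milliseconds.
 * PRId64 expands to the right conversion for int64_t on every platform. */
int format_queue_ms(char *dst, size_t cap, int64_t queue_us)
{
    return snprintf(dst, cap, "queue: %" PRId64 " ms", queue_us / 1000);
}

/* Format a byte count. The cast to uint64_t makes the argument type
 * match PRIu64 whether size_t is 32 or 64 bits wide. */
int format_size_bytes(char *dst, size_t cap, size_t bytes)
{
    return snprintf(dst, cap, "size = %" PRIu64 " B", (uint64_t)bytes);
}
```

For size_t values, '%zu' is the other standard option, provided the C runtime in use supports it.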
https://www.sendspace.com/file/6f9oep ffmpeg_avx2 20250809 https://www.sendspace.com/file/4y3dju GCC 15.2.0-Rev8 & MINGW64 13.0.0-r107 & gettext 0.26 & binutils 2.45 r2 ffmpeg_avx2 20250810 <-added GGML_USE_CPU, BROTLI_STATIC_INIT=2 https://www.sendspace.com/file/edl11g GCC 15.2.0-Rev8 & MINGW64 13.0.0-r107 & gettext 0.26 & binutils 2.45 r2 & nasm 2.17rc0 https://www.sendspace.com/file/2bfdqx Next time: https://github.com/netwide-assembler...33b38508bdc9d3 https://gitlab.freedesktop.org/freet...c563650687bd43 https://github.com/harfbuzz/harfbuzz...13c0d0ade48e47 https://gitlab.gnome.org/GNOME/libxm...5d23bcf28a45b7 https://gitlab.gnome.org/GNOME/gdk-p...2e4e285374ec1d https://github.com/KhronosGroup/Vulk...0664d0b25d3a24 https://github.com/KhronosGroup/glsl...bf752c57b35b30 https://github.com/webmproject/libwe...c6d8abe7c791a8 https://github.com/google/brotli/com...afe168bd231161 https://github.com/m-ab-s/aom/commit...3a38422265b5ec https://github.com/opencv/opencv/com...042a378c658634 https://github.com/xiph/opus/commit/...2f8f024444b05f https://github.com/pytorch/cpuinfo/c...3945502a30a4ae https://github.com/simd-everywhere/s...b1f691e46209c3 https://github.com/google/highway/co...3af01911654c88 https://github.com/nlohmann/json/com...b644a38ee273ff ffmpeg_avx2 8.0.0 20250815 <-added GGML_USE_OPENCL https://www.sendspace.com/file/py2hsp ffmpeg_avx2 8.0.0 20250816 <-added GGML_USE_OPENCL testing GGML_USE_OPENCL [Parsed_whisper_0 @ 0000023a29282a00] Unsupported GPU: NVIDIA GeForce RTX 3050 ffmpeg_avx2 -i "input.mp4" -vn -af "whisper=model=for-tests-ggml-base.en.bin:language=en:queue=3:use_gpu=true:gpu_device=0:destination=output.srt:format=srt" -f null - https://github.com/ggml-org/whisper.cpp/issues/3077 testing ffv1_vulkan ffmpeg_avx2 -init_hw_device vulkan -i "input.mp4" -vf "hwupload=derive_device=vulkan,format=vulkan" -c:v ffv1_vulkan -level 4 -strict experimental -slicecrc 0 -c:a copy "sample.mkv" ffmpeg currently doesn't include GGML_USE_VULKAN, 
OPENCV5_VULKAN. OPENCV5_OPENCL isn't integrated.
testing opencv5
ffmpeg_avx2.exe -v verbose -i "inpus.mp4" -y -c:v libx264 -vb 3000k -c:a aac -ac 2 -ar 48000 -ab 128k -x264-params opencl=true:threads=4 -vf "scale=1920:1080,ocv=filter_name=dilate:filter_params=5x5+2x2/cross|2,format=yuv420p" -frames:v 1000 output_x264.mkv
https://www.sendspace.com/file/w14fad
Starting with CUDA 13.0, the Windows display driver is no longer bundled with the CUDA Toolkit package. Users must download and install the appropriate NVIDIA driver separately from the official driver download page.
link no longer has any meaning: https://gitlab.com/nvidia/headers/cu...ividual/cudart
Original git for ffmpeg: https://code.ffmpeg.org/FFmpeg/FFmpeg
https://www.sendspace.com/file/fawjor
Last edited by Jamaika; 17th August 2025 at 15:31. |
#49 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
Holiday topic. Is it possible to build CUDA 13.0.0 using gcc 15.2.0 and mingw 13.0.0?
I know that everyone prefers a ready-made, perfectly working product to wasted time. There are many challenges, but success is not far off. The whisper_cuda_x64.a library I created even looks good. Maybe in 5 or 10 years it will be a reality.
https://www.sendspace.com/file/pesbcy
Strange messages:
GCC 15.2.0-Rev8 & MINGW64 13.0.0-r124 & binutils 2.45 r2
g++.exe -std=gnu++17 -static -g0 -O3 -march=x86-64-v3 -mtune=generic -mthreads -mavx2 -mbmi -mbmi2 -mlzcnt -mfma -mmovbe -mhle -D__CUDA_ARCH__=720 -x c++ -D__CUDACC__ -D__NVCC__ -DN__CUDACC_RTC__ -D__CUDACC_VER_MAJOR__=13 -D__CUDACC_VER_MINOR__=0 -D__CUDACC_VER_BUILD__=48 -DWHISPER_VERSION="1.7.6-4245c77" -DGGML_USE_CPU=1 -DGGML_USE_OPENCL=1 -DGGML_USE_CUDA=1 -UGGML_OPENCL_SOA_Q -DGGML_OPENCL_PROFILING=1 -DGGML_OPENCL_TARGET_VERSION=300 -D__CUDA_API_VERSION_INTERNAL=1 -D__CUDA_INTERNAL_SKIP_CPP_HEADERS__=1 -D_CRTBLD=1 -D__CORRECT_ISO_CPP_MATH_H_PROTO=1 -D_GLIBCXX_MATH_H=1 -D_GLIBCXX_USE_C99_DYNAMIC=1 -D_CCCL_NO_DEDUCTION_GUIDES=1 -D_CCCL_BUILTIN_SIGNBIT=1 -include "cudart/cuda_runtime.h" -c "binbcast.cu" -o "binbcast.o"
Code:
C:/gcc1520/x86_64-w64-mingw32/include/cudart/crt/math_functions.h:5335:21: error: '__promote_2' in namespace '__gnu_cxx' does not name a template type
5335 | typename __gnu_cxx::__promote_2<_Tp, _Up>::__type pow(_Tp, _Up);
| ^~~~~~~~~~~
C:/gcc1520/x86_64-w64-mingw32/include/cudart/crt/math_functions.h:5335:32: error: expected unqualified-id before '<' token
5335 | typename __gnu_cxx::__promote_2<_Tp, _Up>::__type pow(_Tp, _Up);
| ^
Code:
fattn-common.cuh:529:38: error: there are no arguments to 'isinf' that depend on a template parameter, so a declaration of 'isinf' must be available [-Wtemplate-body]
529 | all_inf = all_inf && int(isinf(tmp.x)) && int(isinf(tmp.y));
| ^~~~~
fattn-common.cuh:529:38: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
Code:
fattn-common.cuh:678:5: error: there are no arguments to '__builtin_assume' that depend on a template parameter, so a declaration of '__builtin_assume' must be available [-Wtemplate-body]
678 | __builtin_assume(tid < D);
| ^~~~~~~~~~~~~~~~
Code:
fattn-mma-f16.cuh:1250:25: error: expected ')' before '*' token
1250 | __launch_bounds__(nwarps*WARP_SIZE, 1)
| ^
fattn-mma-f16.cuh:1250:1: note: to match this '('
1250 | __launch_bounds__(nwarps*WARP_SIZE, 1)
| ^~~~~~~~~~~~~~~~~
The GNU assembler is a big problem. GCC doesn't accept `const float` here but, interestingly, it accepts `const double`. Code:
__CUDA_HOSTDEVICE_FP16_DECL__ __half __float2half(const float a)
{
__half val;
NV_IF_ELSE_TARGET(NV_IS_DEVICE,
asm("{ cvt.rn.f16.f32 %0, %1;}\n" : "=h"(__HALF_TO_US(val)) : "f"(a));
,
__half_raw r;
...
}
Code:
In function '__half __double2half(double)',
inlined from '__half::__half(float)' at C:/gcc1520/x86_64-w64-mingw32/include/cudart/cuda_fp16.h:4697:69,
inlined from 'void __static_initialization_and_destruction_0()' at ggml-cuda.cu:1796:38,
inlined from '(static initializers for ggml-cuda.cu)' at ggml-cuda.cu:3792:1:
C:/gcc1520/x86_64-w64-mingw32/include/cudart/cuda_fp16.hpp:551:1: error: impossible constraint in 'asm'
511 | NV_IF_ELSE_TARGET(NV_IS_DEVICE,
| ^~~~~~~~~~~~~~~~~~~
https://www.sendspace.com/file/uykxkc Last edited by Jamaika; 24th August 2025 at 13:54. |
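My reading of the "impossible constraint in 'asm'" error: cvt.rn.f16.f32 is a PTX instruction and "=h" is a PTX register constraint, so GCC for x86-64 cannot satisfy them once it ends up compiling the device branch. On the host the conversion has to be done in software. Below is a sketch of a float to IEEE binary16 conversion with round-to-nearest-even in plain C. The name float_to_half_bits is hypothetical, and this is not the code from cuda_fp16.hpp, only an illustration of what the host branch has to compute.

```c
#include <stdint.h>
#include <string.h>

/* Convert a float to the bit pattern of an IEEE 754 binary16 value,
 * rounding to nearest, ties to even. */
uint16_t float_to_half_bits(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);                 /* type-pun without aliasing problems */

    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t exp  = (x >> 23) & 0xffu;
    uint32_t man  = x & 0x007fffffu;

    if (exp == 0xffu)                         /* Inf stays Inf, NaN stays a quiet NaN */
        return (uint16_t)(sign | 0x7c00u | (man ? 0x0200u : 0u));

    int32_t e = (int32_t)exp - 127 + 15;      /* re-bias the exponent for binary16 */
    if (e >= 31)
        return (uint16_t)(sign | 0x7c00u);    /* too large: overflow to Inf */

    if (e <= 0) {                             /* result is a subnormal half or zero */
        if (e < -10)
            return sign;                      /* below half of the smallest subnormal */
        man |= 0x00800000u;                   /* make the implicit leading 1 explicit */
        uint32_t shift = (uint32_t)(14 - e);  /* 14..24 */
        uint32_t half  = man >> shift;
        uint32_t rem   = man & ((1u << shift) - 1u);
        uint32_t mid   = 1u << (shift - 1);
        if (rem > mid || (rem == mid && (half & 1u)))
            half++;                           /* may round up to the smallest normal */
        return (uint16_t)(sign | half);
    }

    uint32_t half = ((uint32_t)e << 10) | (man >> 13);
    uint32_t rem  = man & 0x1fffu;            /* the 13 bits that are dropped */
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;                               /* a carry into the exponent is still correct */
    return (uint16_t)(sign | half);
}
```

For example, 1.0f maps to 0x3c00 and 65504.0f, the largest finite half, maps to 0x7bff.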
#50 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
New ffmpeg 8.0.0 whatever that means
GCC 15.2.0-Rev8 & MINGW64 13.0.0-r124 & gettext 0.26 & binutils 2.45 r2 https://gitlab.gnome.org/GNOME/pango...82aac01b160115 https://gitlab.gnome.org/GNOME/libxm...7dcec1acbf62f6 https://gitlab.gnome.org/GNOME/glib/...91ccc69e04d2b6 https://github.com/harfbuzz/harfbuzz...65ad7c05f95907 https://gitlab.freedesktop.org/freet...9e139fd489cb9e https://github.com/KhronosGroup/Vulk...a2fab189ed76b6 https://github.com/KhronosGroup/glsl...9853ae9be52bf1 https://github.com/libjxl/libjxl/com...dbebadb878a58f https://github.com/google/brotli/com...5d7c30638ae132 https://github.com/mm2/Little-CMS/co...f0b03f0ad5552b https://github.com/webmproject/libwe...20de0b5a962eb4 https://www.sendspace.com/file/2weyc5 |
#51 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
https://github.com/webmproject/libwe...4b514ad6dcda56
https://github.com/KhronosGroup/glsl...d3ab2a6df031e5 https://github.com/KhronosGroup/Vulk...40dbdea47d8989 https://github.com/fraunhoferhhi/vve...e02655ca27bf4f https://github.com/mm2/Little-CMS/co...6ec96c209769bd https://github.com/PCRE2Project/pcre...75bb84e7f79370 https://github.com/freetype/freetype...03f209026b2f05 https://github.com/harfbuzz/harfbuzz...a0d8f1ce1b8423 https://gitlab.gnome.org/GNOME/gdk-p...a467a09034b54e https://gitlab.gnome.org/GNOME/glib/...bd41b6b1d4758e https://github.com/google/brotli/com...e82c5081fde0bb https://codeberg.org/StvG/avsresize/...6f1b7b82f6f993 https://www.sendspace.com/file/ivic37 next additions: https://github.com/AcademySoftwareFo...f0551d2d8cf911 https://gitlab.gnome.org/GNOME/glib/...f45a7f01666800 https://gitlab.freedesktop.org/freet...21f0f07d22ae68 https://github.com/harfbuzz/harfbuzz...87f44477f4a220 https://github.com/PCRE2Project/pcre...dc1f3b5fc752bb https://github.com/google/brotli/com...f3d1930fb8983d https://github.com/libjxl/libjxl/com...4a6e4d16ac0953 https://github.com/google/brotli/com...f3d1930fb8983d https://github.com/AviSynth/AviSynth...93a1a09d501f8e https://github.com/mingw-w64/mingw-w...3a4f5ae74fdc89 [ffplay_buffersink @ 000001ad18f06460] The "alpha_modes" option is deprecated: set the supported alpha modes https://www.sendspace.com/file/x5dudz next additions: https://github.com/xiph/opus/commit/...a46cc417bfb217 https://github.com/m-ab-s/aom/commit...f7b2f94d9b5e6b https://github.com/libjxl/libjxl/com...f90db4f3a84fe9 https://github.com/google/brotli/com...c59c6970a9065c https://github.com/KhronosGroup/glsl...401e57d8df56b3 https://github.com/PCRE2Project/pcre...f2eb011037c4af https://gitlab.gnome.org/GNOME/glib/...108c50ad6d785c I don't know where to download the latest nasm? https://github.com/netwide-assembler...e6a47831b8c398 My problems with creating ffmpeg. It turns out these are just my problems with GCC/mingw64. Generally I don't recommend logging into ffmpeg. 
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20408/files
https://www.sendspace.com/file/bq6pn9
next additions:
https://gitlab.gnome.org/GNOME/gdk-p...e9e4533a9ee55a https://gitlab.gnome.org/GNOME/glib/...2ca8963e9d0a9b https://gitlab.gnome.org/GNOME/libxm...aa770a3b8d8438 https://gitlab.freedesktop.org/freet...b2f839f6f5680b https://github.com/harfbuzz/harfbuzz...eb76b22b1800c8 https://github.com/google/brotli/com...6231d2c21933df https://github.com/libjxl/libjxl/com...794ecb73f518eb https://github.com/AcademySoftwareFo...f7d1e3398c6419
https://www.sendspace.com/file/envmgd
From an amateur's perspective: a comparison of the C++ GNU/Clang/CUDA systems. Initially, I thought that the GNU assembler didn't work because the code was written differently. The notation is correct, but it still doesn't work, because the instructions are for NVIDIA processors. The code seems to be heavily tied to the assembler and doesn't like '>>' characters. CUDA doesn't support ucrt functions. We can convert the assembler to C++ for GCC using AI, but the result probably won't work with CUDA C++.
https://www.codeconvert.ai/assembly-to-c-converter
for <math.h> Code:
__SM_61_INTRINSICS_DECL__ int __dp4a(int srcA, int srcB, int c) {
int ret;
asm volatile ("dp4a.s32.s32 %0, %1, %2, %3;" : "=r"(ret) : "r"(srcA), "r"(srcB), "r"(c));
return ret;
}
Code:
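__SM_61_INTRINSICS_DECL__ int __dp4a(int srcA, int srcB, int c) {
int ret = c;
/* dp4a.s32.s32: multiply the four signed bytes of srcA and srcB pairwise and add the products to c */
for (int i = 0; i < 4; i++)
ret += (signed char)((unsigned)srcA >> (8 * i)) * (signed char)((unsigned)srcB >> (8 * i));
return ret;
}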
https://github.com/NVIDIA/cccl
https://developer.nvidia.com/cuda-do...type=exe_local
C++23 is recommended for CUDA, and that is the problem with ffmpeg and GCC: it is too modern. CCCL uses <cmath>, which is incompatible with GCC 15.2.0, so you have to use tricks such as #undef fpclassify. What else might surprise the user? Even after adding the entire CCCL tree, there is still no support for the func<<<...>>>() launch syntax. CCCL also doesn't add the extra function attributes.
https://gcc.gnu.org/onlinedocs/gcc-1...ion-Attributes
Code:
#elif _CCCL_COMPILER(GCC, >=, 13) && (__cplusplus >= 202302L)
# define _CCCL_BUILTIN_ASSUME(...) \
NV_IF_ELSE_TARGET(NV_IS_DEVICE, (__builtin_assume(__VA_ARGS__);), (__attribute__((assume(__VA_ARGS__)));))
Code:
#if (__GNUC__ >= 15)
#define __grid_constant__
#define __host__
#define __device__
#define __global__
#define __tile_global__
#define __tile__
#define __tile_builtin__
#define __shared__ __attribute__((shared))
#define __constant__
#define __managed__
#define __nv_pure__
#define __launch_bounds__(...)
#endif
#endif
Code:
#ifdef __CUDA__
#include <features.h> /* for __THROW */
#elif __cplusplus
#define __THROW throw()
#elif __GNUC__
#define __THROW __attribute__((nothrow))
#endif
Open source CUDA 13.0.1 Header: https://www.sendspace.com/file/t07oo0
Last edited by Jamaika; 11th January 2026 at 12:28. |
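On the missing func<<<...>>>() support: as far as I know the triple angle brackets are not C++ at all. nvcc rewrites them into runtime launch calls before the host compiler sees the file, so GCC on its own has nothing to parse them with. For host-only experiments a launch can be emulated with two loops over the block and thread indices. The sketch below assumes a 1-D grid and kernels that never call __syncthreads(). All names in it (launch_1d, emu_blockIdx_x, scale_kernel and so on) are hypothetical and are not part of CUDA or CCCL.

```c
#include <stddef.h>

/* Emulated built-in index variables (x dimension only). */
static unsigned emu_gridDim_x, emu_blockDim_x, emu_blockIdx_x, emu_threadIdx_x;

typedef void (*kernel_fn)(void *args);

/* Stand-in for kernel<<<grid, block>>>(args): runs every "thread" in turn
 * on the calling CPU thread. There is no parallelism and no barrier. */
void launch_1d(unsigned grid, unsigned block, kernel_fn kernel, void *args)
{
    emu_gridDim_x = grid;
    emu_blockDim_x = block;
    for (emu_blockIdx_x = 0; emu_blockIdx_x < grid; emu_blockIdx_x++)
        for (emu_threadIdx_x = 0; emu_threadIdx_x < block; emu_threadIdx_x++)
            kernel(args);
}

/* Arguments for the example kernel, packed into one struct. */
struct scale_args {
    float *data;
    size_t n;
    float factor;
};

/* Example kernel body: each emulated thread scales one element. */
void scale_kernel(void *p)
{
    struct scale_args *a = p;
    size_t i = (size_t)emu_blockIdx_x * emu_blockDim_x + emu_threadIdx_x;
    if (i < a->n)                 /* guard: the grid may be larger than the data */
        a->data[i] *= a->factor;
}
```

Calling launch_1d(2, 4, scale_kernel, &args) on a 5-element array visits indices 0 to 7 and the guard skips the last three. Kernels that need __syncthreads() require real context switching between the emulated threads, which this sketch does not attempt.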
#54 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
Testing the latest jpeg libraries. Compatibility with the ffmpeg converter.
There are many variations of the JPEG lossless codec's functions and design. Here, I'm simply listing a free one with no features.
jpegLS_avx2.exe -encodepnm image_21447_24bit.ppm image_21448_24bit(l)_YCbCr.jls
Stream #0:0: Video: jpegls, rgb24(bt470bg/unknown/unknown), 4000x3000, 25 fps, 25 tbr, 25 tbn
The jpegxl codec has many functions, some of which are hidden. It's currently better suited to ffmpeg. Strange thing is the latest JPEGXL codec doesn't support AVX2.
cjxl_avx2.exe image_21447_24bit.ppm image_21447_24bit_ppm.jxl -v -j 1 -m 1 -q 100 -e 5 -C 0 --num_threads=4 -x color_space=RGB_D65_202_Rel_PeQ
Stream #0:0: Video: jpegxl (libjxl), rgb24(pc, gbr/bt2020/smpte2084), 4000x3000, 25 fps, 25 tbr, 25 tbn
HTJPEG2000 codec family. There are no free HTMJPEG2000 codecs here. The multi-channel feature is underdeveloped. OpenHTJ2K only supports yuv444p and has no other additional features. Free features are presented here. OpenHTJ2K codec has fewer features and doesn't support RAW. FFmpeg doesn't like to display RGB24. I don't know how it displays YCoCg.
cjph_avx2.exe -i image_21447_24bit_yuv444p10le.yuv -o image_21447_24bit_jph.j2c -dims {4000,3000} -num_comps 3 -signed false -bit_depth 10 -downsamp {1,1},{1,1},{1,1} -block_size {64,64} -precincts {128,128},{256,256} -prog_order CPRL -reversible true
Stream #0:0: Video: jpeg2000, rgb48le(10 bpc), 4000x3000, 25 fps, 25 tbr, 25 tbn
cjhc_avx2.exe -i image_21447_24bit.ppm -o image_21447_24bit_jhc.j2c Stiles={4000,3000} Clevels=3 Cblk={64,64} Cprecincts={128,128},{256,256} Corder=CPRL Creversible=yes -num_threads 4
Stream #0:0: Video: jpeg2000, rgb24, 4000x3000, 25 fps, 25 tbr, 25 tbn
JPEGXT has turned out to be perhaps the most underwhelming of the JPEG family. It's been relegated to the dustbin of history since I started creating codecs in 2017. It has lossless features and other features but arithmetic coding is incompatible with FFMPEG. What is mjpeg? It isn't Motion JPEG.
Stream #0:0: Video: mjpeg, none(bt470bg/unknown/unknown), 25 fps, 25 tbr, 25 tbn
[mjpeg @ 000002687DFD0C40] mjpeg: unsupported coding type (c9)
[mjpeg @ 000002687DFD0C40] Can not process SOS before SOF, skipping
[mjpeg @ 000002687DFD0C40] Found EOI before any SOF, ignoring
[mjpeg @ 000002687DFD0C40] No JPEG data found in image
jpegXT_avx2.exe -q 100 -l -c -a -r -qt 3 -s 1x1,1x1,1x1 image_21447_24bit.ppm image_21448_24bit(l)_RGB.jxt
Stream #0:0: Video: mjpeg (Sequential), gbrp(bt470bg/unknown/unknown), 4000x3000 [SAR 96:96 DAR 4:3], 25 fps, 25 tbr, 25 tbn
jpegXT_avx2.exe -q 100 -v -c -a -r -qt 3 -s 1x1,1x1,1x1 image_21447_24bit.ppm image_21448_24bit(p)_RGB.jxt
Stream #0:0: Video: mjpeg (Progressive), gbrp(bt470bg/unknown/unknown), 4000x3000, 25 fps, 25 tbr, 25 tbn
jpegXT_avx2.exe -q 100 -v -a -r -qt 3 -s 1x1,1x1,1x1 image_21447_24bit.ppm image_21448_24bit(p)_YCbCr.jxt
Stream #0:0: Video: mjpeg (Progressive), yuvj420p(pc, bt470bg/unknown/unknown), 4000x3000 [SAR 96:96 DAR 4:3], 25 fps, 25 tbr, 25 tbn
jpegXT_avx2.exe -q 100 -c -a -r -qt 3 -s 1x1,1x1,1x1 image_21447_24bit.ppm image_21448_24bit(s)_RGB.jxt
Stream #0:0: Video: mjpeg (Sequential), yuvj444p(pc, bt470bg/unknown/unknown), 4000x3000 [SAR 96:96 DAR 4:3], 25 fps, 25 tbr, 25 tbn
jpegXT_avx2.exe -q 100 -a -r12 -qt 3 -s 1x1,2x2,2x2 image_21447_24bit.ppm image_21448_24bit(s)_YCbCr.jxt
Stream #0:0: Video: mjpeg (Sequential), yuvj420p(pc, bt470bg/unknown/unknown), 4000x3000 [SAR 96:96 DAR 4:3], 25 fps, 25 tbr, 25 tbn
There are various options for the 12-bit JPEG XT container. Here for some unknown reason the 12-bit JPEG is in 8-bit container.
https://github.com/osamu620/OpenHTJ2...9a6dbdefb536bd https://github.com/team-charls/charl...e04e0e7417b710 https://github.com/aous72/OpenJPH/co...60386dda66db28 https://github.com/thorfdbg/libjpeg/...f885bb3e873242 https://github.com/libjxl/libjxl/com...fad7630964c38d
From the news. I couldn't create the latest multi JPEGXS vfw codec for AVI. I don't know how it works.
https://github.com/vasilich-tregub/VfWcodecs
https://www.sendspace.com/file/6t92rm
Edit: JPEG compatibility with ffmpeg looks much better but four years have passed since my last test.
Last edited by Jamaika; 18th October 2025 at 08:27. |
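About "unsupported coding type (c9)": the value is the second byte of the JPEG start-of-frame marker. In ITU-T T.81, FF C9 announces extended sequential DCT with arithmetic coding, which fits the remark above that arithmetic coding is what ffmpeg rejects. A small sketch in C that finds the first start-of-frame marker in a file buffer and names it. The functions jpeg_first_sof and jpeg_sof_name are hypothetical helpers written for this post, and the scanner assumes a well-formed marker sequence after SOI.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the second byte of the first start-of-frame marker
 * (0xC0..0xCF except C4 = DHT, C8 = JPG, CC = DAC), or 0 if none is
 * found before the scan data or the end of the buffer. */
uint8_t jpeg_first_sof(const uint8_t *buf, size_t len)
{
    size_t i = 2;                              /* position after SOI (FF D8) */

    if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
        return 0;
    while (i + 4 <= len) {
        if (buf[i] != 0xFF)
            return 0;                          /* not positioned on a marker */
        uint8_t m = buf[i + 1];
        if (m == 0xFF) {                       /* fill byte before a marker */
            i++;
            continue;
        }
        if (m >= 0xC0 && m <= 0xCF && m != 0xC4 && m != 0xC8 && m != 0xCC)
            return m;
        if (m == 0xDA || m == 0xD9)
            return 0;                          /* SOS or EOI before any SOF */
        size_t seg = ((size_t)buf[i + 2] << 8) | buf[i + 3];
        if (seg < 2)
            return 0;                          /* corrupt segment length */
        i += 2 + seg;                          /* skip marker and segment */
    }
    return 0;
}

/* Human-readable name for the most common start-of-frame markers. */
const char *jpeg_sof_name(uint8_t m)
{
    switch (m) {
    case 0xC0: return "baseline DCT, Huffman";
    case 0xC1: return "extended sequential DCT, Huffman";
    case 0xC2: return "progressive DCT, Huffman";
    case 0xC3: return "lossless, Huffman";
    case 0xC9: return "extended sequential DCT, arithmetic";
    case 0xCA: return "progressive DCT, arithmetic";
    case 0xCB: return "lossless, arithmetic";
    default:   return "other or not a start-of-frame marker";
    }
}
```

Running this over the first bytes of each .jxt file would show which of the encoder settings above produce an arithmetic-coded frame type.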
#56 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
A few words about using the _CRTBLD definition in mingw with the ucrt. This is generally not allowed, but CUDA, for example, only has a CRT. Is it possible to create anything? It is possible, but... Mingw versions differ, and not all of them work. GCC 15.2.0 doesn't tolerate adding the __xxx functions from .lib files (e.g. nvrtc_static.lib), even though the file has "static" in its name. There is no definition for the Nvidia1302 thread function <<<...>>>.
Is it possible to integrate such a project? It is possible, but I don't recommend using it at this time. It needs further refinement. https://www.sendspace.com/file/ncpa9n Missing features under GCC 15.2.0. The following should be in assembler, preferably in AT&T syntax under GCC. https://github.com/mingw-w64/mingw-w...10da931a927009 Code:
#include <math.h>
#include <setjmp.h>
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>
#define NOGDI
#define NOUSER
#define NOMINMAX
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#define __host__
#define __device__
enum cudaRoundMode
{
cudaRoundNearest,
cudaRoundZero,
cudaRoundPosInf,
cudaRoundMinInf
};
struct uint3
{
unsigned int x, y, z;
};
struct dim3
{
unsigned int x, y, z;
#if defined(__cplusplus)
#if __cplusplus >= 201103L || ( defined(_MSC_VER) && _MSC_VER >= 1900 )
__host__ __device__ constexpr dim3(unsigned int vx = 1, unsigned int vy = 1, unsigned int vz = 1) : x(vx), y(vy), z(vz) {}
__host__ __device__ constexpr dim3(uint3 v) : x(v.x), y(v.y), z(v.z) {}
__host__ __device__ constexpr operator uint3(void) const { return uint3{x, y, z}; }
#else
__host__ __device__ dim3(unsigned int vx = 1, unsigned int vy = 1, unsigned int vz = 1) : x(vx), y(vy), z(vz) {}
__host__ __device__ dim3(uint3 v) : x(v.x), y(v.y), z(v.z) {}
__host__ __device__ operator uint3(void) const { uint3 t; t.x = x; t.y = y; t.z = z; return t; }
#endif
#endif
};
typedef struct dim3 dim3;
typedef struct uint3 uint3;
typedef struct Thread_st{
struct Thread_st *cdr;
dim3 idx;
jmp_buf ctx;
}Thread;
Thread *_thread0,*_thread1;
dim3 gridDim;
dim3 blockDim;
dim3 threadIdx;
dim3 blockIdx;
void __syncthreads(void){
Thread *th = _thread1;
if (!setjmp(th->ctx)){
th=th->cdr;
if (!th){
th = _thread0;
}
_thread1 = th;
threadIdx = th->idx;
longjmp(th->ctx, 1); // Dispatch
}
}
int __syncthreads_or(int predicate) {return predicate;}
__attribute__((noreturn)) void __trap(void)
{
__builtin_trap();
}
static uint8_t rand8()
{
return (rand() & 0xff);
}
static uint16_t rand16()
{
return rand8() << 8 | rand8();
}
static uint32_t rand24()
{
return rand16() << 8 | rand8();
}
static uint32_t rand32()
{
return rand24() << 8 | rand8();
}
static uint64_t rand64()
{
return (uint64_t)rand32() << 32 | rand32();
}
uint64_t __security_cookie;
void __cdecl __security_init_cookie() {
// maybe use a cooler random number generator
__security_cookie = rand64();
}
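void __cdecl __security_check_cookie(uint64_t retrieved) {
if(__security_cookie != retrieved) {
static const char buf[] = "Buffer overrun detected!\n";
DWORD written = 0;
// named "out" because stdout becomes a macro once <stdio.h> is included
HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
// sizeof buf - 1 skips the terminating NUL; the byte counter may only be NULL for overlapped I/O
WriteFile(out, buf, sizeof buf - 1, &written, NULL);
// TODO: abort-like behaviour here
ExitProcess(1);
}
}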
void __GSHandlerCheck() {}
int min(const int a, const int b)
{
return a < b ? a : b;
}
int max(const int a, const int b)
{
return a > b ? a : b;
}
unsigned int umin(const unsigned int a, const unsigned int b)
{
return a < b ? a : b;
}
/*unsigned int umax(const unsigned int a, const unsigned int b)
{
return a > b ? a : b;
}*/
long long int llmin(const long long int a, const long long int b)
{
return a < b ? a : b;
}
long long int llmax(const long long int a, const long long int b)
{
return a > b ? a : b;
}
/*unsigned long long int ullmax(const unsigned long long int a, const unsigned long long int b)
{
return a > b ? a : b;
}*/
float rsqrtf(float x)
{
return 1.0f / sqrtf(x);
}
float __int_as_float(int a)
{
union {int a; float b;} u;
u.a = a;
return u.b;
}
int __float_as_int(float a)
{
union {float a; int b;} u;
u.a = a;
return u.b;
}
unsigned __float_as_uint(float a)
{
union {float a; unsigned b;} u;
u.a = a;
return u.b;
}
static long long int __internal_float2ll_kernel(float a, long long int max, long long int min, long long int nan, enum cudaRoundMode rndMode)
{
unsigned long long int res, t = 0ULL;
int shift;
unsigned int ia;
if (isnan(a)) return nan; /* a is always a float here, so one isnan() check is enough */
if (a >= max) return max;
if (a <= min) return min;
ia = __float_as_int(a);
shift = 189 - ((ia >> 23) & 0xff);
res = (unsigned long long int)(((ia << 8) | 0x80000000) >> 1) << 32;
if (shift >= 64) {
t = res;
res = 0;
} else if (shift) {
t = res << (64 - shift);
res = res >> shift;
}
if (rndMode == cudaRoundNearest && (long long int)t < 0LL) {
res += t == 0x8000000000000000ULL ? res & 1ULL : 1ULL;
}
else if (rndMode == cudaRoundMinInf && t != 0ULL && ia > 0x80000000) {
res++;
}
else if (rndMode == cudaRoundPosInf && t != 0ULL && (int)ia > 0) {
res++;
}
if ((int)ia < 0) res = (unsigned long long int)-(long long int)res;
return (long long int)res;
}
static int __internal_float2int(float a, enum cudaRoundMode rndMode)
{
return (int)__internal_float2ll_kernel(a, 2147483647LL, -2147483648LL, 0LL, rndMode);
}
int __float2int_rn(float a)
{
return __internal_float2int(a, cudaRoundNearest);
}
#define native_recip(x) (1.0f / (x)) /* host fallback: IEEE single-precision reciprocal */
float __frcp_rn(float x) { return native_recip(x); }
int __double2loint(double x)
{
union {double x; struct {int lo; int hi;};} u;
u.x = x;
return u.lo;
}
int __double2hiint(double x)
{
union {double x; struct {int lo; int hi;};} u;
u.x=x;
return u.hi;
}
double __hiloint2double(int hi, int lo)
{
union {double x; struct {int lo; int hi;};} u;
u.hi=hi;
u.lo=lo;
return u.x;
}
unsigned int __umulhi(unsigned int __a, unsigned int __b)
{
/* upper 32 bits of the full 64-bit product of the two values */
return (unsigned int)(((unsigned long long)__a * __b) >> 32);
}
unsigned int __vsubss4(unsigned int __a, unsigned int __b) {
/* per-byte signed subtraction, each lane saturated to [-128, 127] */
unsigned int result = 0;
for (int i = 0; i < 4; i++) {
int diff = (int8_t)(__a >> (8 * i)) - (int8_t)(__b >> (8 * i));
if (diff > INT8_MAX) {
diff = INT8_MAX;
} else if (diff < INT8_MIN) {
diff = INT8_MIN;
}
result |= (unsigned int)(diff & 0xff) << (8 * i);
}
return result;
}
unsigned int __vsub4(unsigned int __a, unsigned int __b) {
/* per-byte unsigned subtraction, each lane wraps around modulo 256 */
unsigned int result = 0;
for (int i = 0; i < 4; i++) {
unsigned int diff = ((__a >> (8 * i)) - (__b >> (8 * i))) & 0xff;
result |= diff << (8 * i);
}
return result;
}
unsigned int __bool2mask(unsigned int __a, int shift) {
return (__a << shift) - __a;
}
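unsigned int __vsetne4(unsigned int __a, unsigned int __b) {
/* per byte: 1 if the two bytes differ, 0 if they are equal */
unsigned int result = 0;
for (int i = 0; i < 4; i++) {
if (((__a >> (8 * i)) & 0xff) != ((__b >> (8 * i)) & 0xff))
result |= 1u << (8 * i);
}
return result;
}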
unsigned int __vcmpne4(unsigned int __a, unsigned int __b) {
return __bool2mask(__vsetne4(__a, __b), 8);
}
unsigned int __byte_perm(unsigned int __a, unsigned int __b,
unsigned int __s) {
unsigned int res;
res =
((((uint64_t)__b << 32 | __a) >> (__s & 0x7) * 8) & 0xff) |
(((((uint64_t)__b << 32 | __a) >> ((__s >> 4) & 0x7) * 8) & 0xff) << 8) |
(((((uint64_t)__b << 32 | __a) >> ((__s >> 8) & 0x7) * 8) & 0xff) << 16) |
(((((uint64_t)__b << 32 | __a) >> ((__s >> 12) & 0x7) * 8) & 0xff) << 24);
return res;
}
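int __iAtomicAdd(int *__p, int __v)
{
/* GCC builtin: atomically adds __v to *__p and returns the previous value */
return __atomic_fetch_add(__p, __v, __ATOMIC_SEQ_CST);
}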
https://github.com/harfbuzz/harfbuzz...88a32327058cea https://github.com/JimmyLefevre/kb/c...649a035737359c https://gitlab.gnome.org/GNOME/glib/...970689164b293a https://gitlab.freedesktop.org/freet...16d58f75c0b06e https://github.com/AviSynth/AviSynth...2682c138f84e13 https://github.com/webmproject/libwe...78655a7905620c https://github.com/PCRE2Project/pcre...df534b96377f35 https://www.sendspace.com/file/coas09 next time: https://github.com/pinterf/TIVTC/com...7e399553fa1261 https://github.com/m-ab-s/aom/commit...a5f3ded8c68540 https://gitlab.freedesktop.org/freet...c293400096868c https://gitlab.gnome.org/GNOME/glib/...48e77b667e392d https://github.com/KhronosGroup/Vulk...31f1ff9e9454bf https://github.com/KhronosGroup/Vulk...f72b0b5bcb8350 https://github.com/KhronosGroup/glsl...ea351aababeb7c https://github.com/mm2/Little-CMS/co...8da94dfba0a43f https://github.com/AcademySoftwareFo...2dfd075348b83e https://github.com/xiph/opus/commit/...f93eca5ae2038a https://www.sendspace.com/file/w2ufzl next time: https://github.com/JimmyLefevre/kb/c...7bdbf5dfdb2fbe https://github.com/fraunhoferhhi/vve...29d66bdcb113cf https://github.com/mm2/Little-CMS/co...b08011a2ec0bac https://github.com/ggml-org/whisper....be3fec6c2efc91 https://github.com/PCRE2Project/pcre...d1d9142a40e4f9 https://github.com/AviSynth/AviSynth...47e150c50d7b1c https://github.com/KhronosGroup/glsl...384f1374e0dda2 https://github.com/KhronosGroup/Vulk...39f57c3772a641 https://github.com/KhronosGroup/Vulk...a273043e12b20c https://gitlab.gnome.org/GNOME/libxm...ffd8368a9568cf https://gitlab.gnome.org/GNOME/glib/...21639c05875806 https://gitlab.freedesktop.org/freet...2cf0647d5284ff https://github.com/harfbuzz/harfbuzz...50535bca0174ec https://github.com/xiph/opus/commit/...58e5c7475ccdc0 https://github.com/webmproject/libwe...ed72823fe852a7 https://github.com/m-ab-s/aom/commit...31be91c7a4c132 https://github.com/ggml-org/whisper....c3f1a4e2ad2cef https://github.com/JimmyLefevre/kb/c...e0df9ef6166aaf 
https://gitlab.gnome.org/GNOME/glib/...a99ebc9f41fea6 https://github.com/madler/zlib/commi...165c1f568b96be https://github.com/tukaani-project/x...79db7de012caf5
As a holiday extra, I added the svt_jpegxs add-on to ffmpeg. It might be of interest to someone. MTS files are not supported yet.
https://www.sendspace.com/file/41kyo2
Last edited by Jamaika; 11th January 2026 at 12:31. |
|
|
|
|
|
#57 | Link |
|
Registered User
Join Date: Jul 2015
Posts: 954
|
The latest jpegXS codecs:
https://www.sendspace.com/file/bqz9l0
next time:
https://github.com/xiph/opus/commit/...d5b79ebd1b9473 https://github.com/KhronosGroup/glsl...6981c5de73a365 https://github.com/webmproject/libvp...b25b58e2f903e8 https://gitlab.gnome.org/GNOME/libxm...de719b7bc46578 https://github.com/mingw-w64/mingw-w...b1c28c1d292e97 https://github.com/KhronosGroup/Vulk...5ac2e37cef6475 https://github.com/KhronosGroup/Vulk...20177f60ed6627 https://github.com/m-ab-s/aom/commit...754d883d7b2ab4 https://gitlab.gnome.org/GNOME/glib/...b0afdc40e22657 https://github.com/JimmyLefevre/kb/c...9955beecfc2549 https://github.com/harfbuzz/harfbuzz...1a6ca8d80211ec https://github.com/opencv/opencv/com...034627e7b11183 https://github.com/ggml-org/whisper....8410c3aeb06949 https://gitlab.freedesktop.org/cairo...15b852ccf25d71 https://github.com/AviSynth/AviSynth...b990fc77dad9e9
Merry Christmas. I put this build together quickly. I had a big problem with the latest OpenCV2 5.0 patches, and I don't know whether the latest Avisynth works. I've added the latest d3d12 add-ons for ffmpeg. I've reported the problems I found with jpegxs and mpeghdec as best I could. Note that a separate ffmpeg add-on has been created for mpeghdec:
https://github.com/Fraunhofer-IIS/MPEG-H-Audio
Who cares anyway? If something doesn't work, buy a paid product.
https://www.sendspace.com/filegroup/...ATcNd%2FAe05rA
Code:
{ "PointResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_PointResize },
{ "BilinearResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_BilinearResize },
{ "BicubicResize", BUILTIN_FUNC_PREFIX, "cii[b]f[c]f[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_BicubicResize },
{ "LanczosResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[taps]i[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_LanczosResize},
{ "Lanczos4Resize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_Lanczos4Resize},
{ "BlackmanResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[taps]i[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_BlackmanResize},
{ "Spline16Resize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_Spline16Resize},
{ "Spline36Resize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_Spline36Resize},
{ "Spline64Resize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_Spline64Resize},
{ "GaussResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[p]f[b]f[s]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_GaussianResize},
{ "SincResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[taps]i[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_SincResize},
{ "SinPowerResize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[p]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_SinPowerResize},
{ "SincLin2Resize", BUILTIN_FUNC_PREFIX, "cii[src_left]f[src_top]f[src_width]f[src_height]f[taps]i[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_SincLin2Resize},
{ "UserDefined2Resize", BUILTIN_FUNC_PREFIX, "cii[b]f[c]f[s]f[src_left]f[src_top]f[src_width]f[src_height]f[border_handling]i[force]i[keep_center]b[placement]s", FilteredResize::Create_UserDefined2Resize},
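The third field of each entry above is an Avisynth parameter string: each letter is a type (`c` clip, `i` int, `f` float, `b` bool, `s` string) and a `[name]` prefix marks an optional named argument. The following is only a minimal sketch of how such a string can be split into parameters. `Param` and `parse_params` are hypothetical names made up for illustration, not Avisynth API, and the sketch ignores modifiers such as `*` and `+`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// One entry of an Avisynth-style parameter string (illustrative type).
struct Param {
    std::string name;  // empty for positional (unnamed) parameters
    char type;         // 'c' clip, 'i' int, 'f' float, 'b' bool, 's' string
};

// Hypothetical helper, not part of Avisynth: splits a parameter string such
// as "cii[src_left]f[force]i" into its individual parameters.
std::vector<Param> parse_params(std::string_view spec) {
    std::vector<Param> out;
    std::size_t i = 0;
    while (i < spec.size()) {
        Param p{};
        if (spec[i] == '[') {
            // Optional named parameter: read the name up to the closing ']'.
            const std::size_t close = spec.find(']', i);
            if (close == std::string_view::npos || close + 1 >= spec.size())
                break;  // malformed: unterminated name or missing type letter
            p.name = std::string(spec.substr(i + 1, close - i - 1));
            i = close + 1;
        }
        p.type = spec[i++];  // the type letter always follows the name
        out.push_back(std::move(p));
    }
    return out;
}
```

Applied to the start of the `PointResize` string, `parse_params("cii[src_left]f")` yields a clip, two positional ints, and one optional float named `src_left`.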
https://gitlab.gnome.org/GNOME/libxm...bf94e4237ca944 https://github.com/harfbuzz/harfbuzz...8a20c64a825d50 https://github.com/AviSynth/AviSynth...9aaa9aa0530b0d https://github.com/webmproject/libwe...2dd0c729334553 https://github.com/m-ab-s/aom/commit...d1bbc3b5e6fe0a https://github.com/xiph/opus/commit/...ef3b2144e058fd
The (v)(s)(n)printf functions that triggered warnings have been changed to __mingw_(v)(s)(n)printf. I've uploaded my entire open-source project, showing what I tried to merge and how.
https://www.sendspace.com/filegroup/...q7gXHpwv5mVbwg
Welcome to the new year 2026
https://github.com/harfbuzz/harfbuzz...2c19e1b39b3203 https://gitlab.freedesktop.org/freet...4866346f3a7669 https://github.com/JimmyLefevre/kb/c...f3d8d998d6f62a https://github.com/AviSynth/AviSynth...9bbc027ff0eefc https://gitlab.gnome.org/GNOME/glib/...b9269b93a64e40 https://github.com/videolan/dav1d/co...da38b91a06f737 https://github.com/pytorch/cpuinfo/c...e572c5f9e5e3c4 https://github.com/ggml-org/whisper....a023889a5a7202 + fix ggml CUDA https://github.com/ggml-org/llama.cp...2eaf82c1567b7e https://github.com/libass/libass/com...fd3adee50bca88 https://github.com/ultravideo/kvazaa...0bc9503185b1fa https://github.com/madler/zlib/commi...7af91a54fe9cf6
Testing CUDA 13.1 with GCC (not working yet), built as C++17. For __CUDA_ARCH__ values above 730, gcc can't handle the assembler.
for %%f in ("%~dp1*.cu") do g++.exe -std=gnu++17 -static -g0 -O3 -m64 -march=x86-64-v3 -mtune=generic -mthreads -mavx2 -mbmi -mbmi2 -mlzcnt -mfma -mmovbe -mhle -x c++ -Wcomment -Wformat -Wshift-negative-value -Wsign-compare -Wtype-limits -Warray-bounds -Wparentheses -Wlogical-not-parentheses -D__CUDA_ARCH__=730 -D__CUDACC__ -D__NVCC__ -D__CUDACC_VER_MAJOR__=13 -D__CUDACC_VER_MINOR__=1 -D__CUDACC_VER_BUILD__=80 -DWHISPER_VERSION="1.8.2-e443fbc" -DGGML_USE_CPU=1 -DGGML_USE_OPENCL=1 -DGGML_USE_CUDA=1 -DGGML_CUDA_USE_GRAPHS=1 -UGGML_OPENCL_SOA_Q -DGGML_OPENCL_PROFILING=1 -DGGML_OPENCL_TARGET_VERSION=300 -D__NV_NO_HOST_COMPILER_CHECK=1 -D__CUDA_API_VERSION_INTERNAL=1 -D__CUDA_INTERNAL_SKIP_CPP_HEADERS__=1 -DCUDA_API_PER_THREAD_DEFAULT_STREAM=1 -D_CRTBLD=1 -D__CORRECT_ISO_CPP_MATH_H_PROTO=1 -D_GLIBCXX_MATH_H=1 -D_GLIBCXX_USE_C99_DYNAMIC=1 -include "cudart/cuda_runtime.h" -c %%f -o %%~nf.o
https://developer.download.nvidia.co....0_windows.exe
https://www.sendspace.com/file/1kqj8m
https://github.com/GNOME/libxml2/com...318d87a81f80fb https://gitlab.gnome.org/GNOME/glib/...04989179f73d71 https://github.com/KhronosGroup/glsl...eaa434239806d2 https://github.com/KhronosGroup/Vulk...6a0bdca24821a6 https://github.com/madler/zlib/commi...6e16ef54218f26 https://github.com/pytorch/cpuinfo/c...b5f27d13fc9551 https://github.com/AviSynth/AviSynth...c911e433e578b7 https://github.com/harfbuzz/harfbuzz...ed2cbb8dd45220 https://github.com/JimmyLefevre/kb/c...54f0a9bf9c1257 https://github.com/mm2/Little-CMS/co...f4efb7857ff511 https://github.com/xiph/opus/commit/...959b1e951df87e https://github.com/fraunhoferhhi/vve...cde534a0196991 https://gitlab.freedesktop.org/freet...37b7acec27f3dd https://github.com/m-ab-s/aom/commit...812e05dc7bbf08 https://github.com/KhronosGroup/Vulk...58bbc0ddadc773
https://www.sendspace.com/file/7dey3a
https://github.com/madler/zlib/commi...7490925f086317 https://github.com/m-ab-s/aom/commit...a3e34d469740cf https://github.com/xiph/opus/commit/...bf1954b56235be
https://gitlab.gnome.org/GNOME/glib/...8bd258c5e764d0 https://gitlab.freedesktop.org/freet...195b227fe19ad9
testing
https://github.com/autotools-mirror/...7a396fff10781d <-- c23
https://github.com/curbengh/gnulib-m...81b72d5622e58a <-- c23
https://github.com/ggml-org/whisper....57f74ad5b691a6 --> https://github.com/ggml-org/llama.cp...0ad6e81e0dd484
mingw-w64-ucrt-x86_64-headers-13.0.0.r453.gfd36ef357-1-any.pkg.tar.zst
https://github.com/simd-everywhere/s...303ee3dc94a112 https://github.com/google/highway/co...503a957404a246 https://github.com/nlohmann/json/com...5ed37d4fbfccc2 https://github.com/KhronosGroup/glsl...12944c279baf48 https://github.com/KhronosGroup/Vulk...1624583c88563f https://github.com/KhronosGroup/Vulk...cf378e66f6cc17 https://github.com/AviSynth/AviSynth...0033f2c262fb6a https://gitlab.freedesktop.org/freet...7272ee10bdbb69 https://github.com/webmproject/libvp...8a1a328143f576 https://github.com/xiph/opus/commit/...070b790edf6f5e https://github.com/m-ab-s/aom/commit...d920948ca2618d https://gitlab.gnome.org/GNOME/glib/...0653326f7e32f4 https://gitlab.gnome.org/GNOME/libxm...f2132896f415f2 https://github.com/madler/zlib/commi...f89d84955ddd24 https://github.com/google/shaderc/co...ab2bbcaac9b4d5 https://github.com/JimmyLefevre/kb/c...8f2a381aa820c7 https://github.com/harfbuzz/harfbuzz...4987536f40e9e2
ffmpeg with vulkan, built as C++20. I had a big problem converting const char* to const unsigned char* in C17/C++17, and it proved impractical. With older versions of GCC, everything should be built against one language standard, e.g. C17/C++17. Do programs that mix C17 with selected parts of C++20 work on MSYS2?
ffmpeg8:
const unsigned char ff_dpx_copy_comp_spv_data[] = u8"...files.vulkan.comp...";
size_t ff_dpx_copy_comp_spv_len = std::strlen(reinterpret_cast<const char*>(&ff_dpx_copy_comp_spv_data)) + 1;
opencv5:
std::string arithm_str = "...files.opencl...";
std::string hash1 = cv::format("%08jx", (uintmax_t)crc64(reinterpret_cast<const uchar*>(std::u8string(arithm_str.begin(), arithm_str.end()).data()), std::ssize(arithm_str)));
const ProgramSource arithm_oclsrc("", "arithm", arithm_str, hash1);
lcevc:
std::string upscale_vertical_def = "...file.vulkan.comp...";
const unsigned char* upscale_vertical_spv = reinterpret_cast<const unsigned char*>(std::u8string(upscale_vertical_def.begin(), upscale_vertical_def.end()).data());
This is where my knowledge ends. I also don't know what the C23 equivalents would look like. What is the std::format equivalent of cv::format? There is also a problem with DllMain in vulkan/opencl.
https://www.sendspace.com/file/g95gzh
Last edited by Jamaika; 22nd January 2026 at 18:34. |
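On the const char* to const unsigned char* conversion: one possible approach, sketched below in plain C++17, is to view the bytes of the existing std::string directly instead of copying them into a temporary std::u8string. In the lcevc line the temporary is destroyed at the end of the statement, so the stored pointer is left dangling. `ByteView`, `as_bytes` and `hex8` are hypothetical names invented for this sketch; they are not part of ffmpeg, OpenCV or LCEVC.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Non-owning view of a string's bytes (illustrative type). The pointer is
// valid only while the source string is alive and unmodified.
struct ByteView {
    const unsigned char* data;
    std::size_t size;
};

// Reading any object through unsigned char* is allowed, so no copy into a
// temporary std::u8string is needed.
ByteView as_bytes(const std::string& s) {
    return { reinterpret_cast<const unsigned char*>(s.data()), s.size() };
}

// Refuse temporaries at compile time: a view of one would dangle.
ByteView as_bytes(std::string&&) = delete;

// printf-style "%08jx" formatting without cv::format: zero-padded,
// at least 8 lowercase hex digits.
std::string hex8(std::uintmax_t value) {
    char buf[2 * sizeof(std::uintmax_t) + 1];  // all hex digits plus '\0'
    std::snprintf(buf, sizeof buf, "%08jx", value);
    return buf;
}
```

With a named string such as `std::string src = "...";`, `as_bytes(src)` gives a pointer and length that can be passed on as long as `src` stays alive. If `<format>` is available in the toolchain (C++20), `std::format("{:08x}", value)` should produce the same zero-padded hex text as `cv::format("%08jx", value)`. If an embedded array holds binary SPIR-V rather than shader text, `sizeof` on the array is a safer length than `std::strlen`, because binary data can contain zero bytes.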