Doom9's Forum > Capturing and Editing Video > Avisynth Development
Old 2nd April 2019, 16:41   #1  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 1,173
FluxSmooth - pfmod

As the title says: an almost complete rewrite, with x86/x64 builds, clang support and AVX2 optimizations.

https://github.com/pinterf/FluxSmooth/releases

The package contains DLLs compiled with clang (LLVM), and (probably) XP compatible DLLs compiled with Microsoft Visual C++.

Enjoy.

Code:
- (20190402) v1.3, rewrite by pinterf
  - project moved to github: https://github.com/pinterf/FluxSmooth
  - Built using Visual Studio 2017, additional LLVM 8.0 clang support
  - Changed to AVS 2.6 plugin interface
  - x64 build for Avisynth+
  - Added version resource to DLL
  - Removed MMX support, requires SSE2. (Though pure C is still available in the source)
  - Dropped all inline assembly; the code is now C with SIMD intrinsics, with SSE2, SSE4.1 and AVX2 optimizations
  - Single DLL, optimizations for different CPU instruction sets are chosen automatically.
  - Reports MT Modes for Avisynth+: MT_NICE_FILTER
  - Added Y, YV411, YV16 and YV24, 10-16 bits 4:2:0, 4:2:2, 4:4:4, planar RGB(A) 8-16 bits support besides existing YV12
  - (YUY2 support with workaround: internally converted to YV16, processed, then converted back;
    the conversion is lossless, but slower than using native YV16)
  - New parameters: bool "luma", bool "chroma" (default true) to disable processing of luma/chroma planes
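To illustrate why the YUY2 workaround can be lossless: YUY2 and YV16 both carry 4:2:2 samples, so converting between them only moves bytes between a packed and a planar layout. Below is a minimal C++ sketch of that idea for a single row. It is not the plugin's code; `PlanarRow422`, `unpack_yuy2_row` and `pack_yuy2_row` are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One row of planar 4:2:2 video: full-width luma, half-width chroma.
// (Hypothetical type for illustration only.)
struct PlanarRow422 {
    std::vector<std::uint8_t> y, u, v;
};

// Unpack one packed YUY2 row (byte order Y0 U0 Y1 V0 Y2 U1 Y3 V1 ...)
// into separate planes. width is in pixels and must be even.
PlanarRow422 unpack_yuy2_row(const std::vector<std::uint8_t>& yuy2, std::size_t width) {
    assert(width % 2 == 0 && yuy2.size() == width * 2);
    PlanarRow422 out;
    out.y.resize(width);
    out.u.resize(width / 2);
    out.v.resize(width / 2);
    for (std::size_t i = 0; i < width / 2; ++i) {
        out.y[2 * i]     = yuy2[4 * i];
        out.u[i]         = yuy2[4 * i + 1];
        out.y[2 * i + 1] = yuy2[4 * i + 2];
        out.v[i]         = yuy2[4 * i + 3];
    }
    return out;
}

// Repack the planes into YUY2 byte order; the exact inverse of the function above.
std::vector<std::uint8_t> pack_yuy2_row(const PlanarRow422& p) {
    const std::size_t width = p.y.size();
    assert(width % 2 == 0 && p.u.size() == width / 2 && p.v.size() == width / 2);
    std::vector<std::uint8_t> yuy2(width * 2);
    for (std::size_t i = 0; i < width / 2; ++i) {
        yuy2[4 * i]     = p.y[2 * i];
        yuy2[4 * i + 1] = p.u[i];
        yuy2[4 * i + 2] = p.y[2 * i + 1];
        yuy2[4 * i + 3] = p.v[i];
    }
    return yuy2;
}
```

Unpacking a row and packing it again returns the original bytes, which is the sense in which the round trip is lossless. The two extra passes over each frame would plausibly account for it being slower than feeding YV16 directly.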
And an interesting benchmark:
Code:
ColorBars(pixel_type = "YV12")
FluxSmoothST(12,10,opt=0) # 0: C, #1: SSE2, #2: SSE4.1, #3: AVX2
#or:
#FluxSmoothT(12,opt=0) # 0: C, #1: SSE2, #2: SSE4.1, #3: AVX2

# Benchmarks i7-7770 Win10 Avisynth+ r2837 (fps)

# 32 bit LLVM 8.0              32 bit Visual C++ 15.9
#    opt=0    1    2     3     opt=0    1    2     3
# ST  1765 1882 2509  4776      1380 1540 1840  3730
#  T  1834 3539 6116 12658      1530 3390 5210 10860

# 64 bit LLVM 8.0              64 bit Visual C++ 15.9
#    opt=0    1    2     3     opt=0    1    2     3
# ST  2020 1970 2520  5100      1520 1700 2048  4303
#  T  2220 3600 6570 12700      2110 3630 5306 13100
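
For readers wondering how the opt values in the script could map to code paths inside a single DLL, here is a minimal C++ sketch of runtime dispatch. It is only an illustration, not the plugin's source: `Level`, `CpuFeatures`, `best_supported` and `choose_level` are hypothetical names, and both the "negative opt means automatic" convention and the clamping of a too-high request are assumptions of this sketch (the thread only shows opt values 0 to 3).

```cpp
#include <cassert>

// Instruction-set levels, numbered like the opt values in the benchmark script.
enum Level { LEVEL_C = 0, LEVEL_SSE2 = 1, LEVEL_SSE41 = 2, LEVEL_AVX2 = 3 };

// Hypothetical feature flags; a real plugin would fill these in from CPUID
// or from the host application's CPU-flag query.
struct CpuFeatures {
    bool sse2;
    bool sse41;
    bool avx2;
};

// Highest level the CPU can run.
inline Level best_supported(const CpuFeatures& f) {
    if (f.avx2)  return LEVEL_AVX2;
    if (f.sse41) return LEVEL_SSE41;
    if (f.sse2)  return LEVEL_SSE2;
    return LEVEL_C;
}

// requested < 0 means "pick automatically" (ASSUMPTION, see the text above).
// A request above what the CPU supports is clamped down (also an assumption).
inline Level choose_level(int requested, const CpuFeatures& f) {
    assert(requested >= -1 && requested <= LEVEL_AVX2);
    const Level best = best_supported(f);
    if (requested < 0 || requested > best) return best;
    return static_cast<Level>(requested);
}
```

With this shape, forcing opt=0 on an AVX2 machine selects the plain C path, which is what makes a side-by-side benchmark of the code paths possible on one CPU.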

Last edited by pinterf; 3rd April 2019 at 05:18. Reason: Benchmark typo in title
Old 3rd April 2019, 01:42   #2  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Germany
Posts: 556
Quote:
and (probably) XP compatible
It is Windows XP compatible (up to SSE4.1, of course, since XP has no AVX support at the OS level).
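
For context, the OS-level limitation comes from how AVX is enabled: the operating system has to save and restore the YMM registers and announce this through XCR0, which Windows XP never does. A minimal C++ sketch of the usual check follows. `avx_usable` is a hypothetical helper, and the raw CPUID/XGETBV values are passed in as plain integers so the logic can be read without compiler-specific intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// cpuid1_ecx: ECX returned by CPUID leaf 1.
//   bit 27 = OSXSAVE (the OS has enabled XSAVE/XGETBV)
//   bit 28 = AVX     (the CPU implements AVX)
// xcr0: value of XCR0 as read with XGETBV; only meaningful when OSXSAVE is set,
//       so pass 0 otherwise.
//   bit 1 = SSE (XMM) state enabled, bit 2 = AVX (upper YMM) state enabled
inline bool avx_usable(std::uint32_t cpuid1_ecx, std::uint64_t xcr0) {
    const bool osxsave = ((cpuid1_ecx >> 27) & 1u) != 0;
    const bool avx     = ((cpuid1_ecx >> 28) & 1u) != 0;
    const bool os_saves_ymm = (xcr0 & 0x6u) == 0x6u;
    return osxsave && avx && os_saves_ymm;
}
```

On an AVX-capable CPU running an OS without YMM state support, the AVX bit is set but OSXSAVE is not, so the check fails and a dispatcher would fall back to SSE4.1 at most.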
Many thanks!!
Old 3rd April 2019, 09:35   #3  |  Link
StvG
Registered User
 
Join Date: Jul 2018
Posts: 55
Thanks!

Btw does AVX-512 optimization require a lot of work/time?
Old 3rd April 2019, 09:50   #4  |  Link
pinterf
"Can you play piano?" "Dunno, never tried"
Possibly not, but I'd need a tester.
Is it such a bottleneck?
And a question/feedback request: did this version improve your fps in a real-life scenario, and if so, by how much?

EDIT:
Test 1.4 w/ AVX512 (AVX512F and AVX512BW required)
https://drive.google.com/open?id=1Uz...NKiEQ-kGq9wcWP
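
A note on what "AVX512F and AVX512BW required" means in practice: both feature bits have to be reported by the CPU, and the OS has to have enabled the opmask and ZMM register state. Below is a minimal C++ sketch of such a test. `avx512_f_bw_usable` is a hypothetical helper and not taken from the plugin; it takes the raw register values as arguments instead of calling CPUID itself.

```cpp
#include <cassert>
#include <cstdint>

// osxsave:    CPUID leaf 1, ECX bit 27 (the OS has enabled XGETBV)
// cpuid7_ebx: EBX returned by CPUID leaf 7, subleaf 0
//   bit 16 = AVX512F (foundation), bit 30 = AVX512BW (byte/word instructions)
// xcr0:       value of XCR0 read with XGETBV (pass 0 when osxsave is false)
//   bits 1,2 = XMM/YMM state; bits 5,6,7 = opmask, upper ZMM halves, ZMM16-31
inline bool avx512_f_bw_usable(bool osxsave, std::uint32_t cpuid7_ebx, std::uint64_t xcr0) {
    const bool has_f  = ((cpuid7_ebx >> 16) & 1u) != 0;
    const bool has_bw = ((cpuid7_ebx >> 30) & 1u) != 0;
    const bool os_saves_zmm = osxsave && (xcr0 & 0xE6u) == 0xE6u;
    return has_f && has_bw && os_saves_zmm;
}
```

A CPU that reports AVX512F without AVX512BW fails this test, so such a machine would stay on the AVX2 path.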

Last edited by pinterf; 3rd April 2019 at 17:33. Reason: version test2
Old 6th April 2019, 00:47   #5  |  Link
StvG
No, it's not a bottleneck.
In a real-life scenario it is usually combined with other filters, and there the final fps shows no difference (< 1%) between this version and the old one (SSSE3). This version is ~3x faster when FluxSmoothST() is used on its own.
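
Those two numbers fit together via Amdahl's law: if the filter takes a fraction p of the per-frame time and becomes s times faster, the whole script speeds up by 1 / ((1 - p) + p / s). A small C++ sketch of the arithmetic, assuming the filters run one after another so their times simply add up (`overall_speedup` and `max_fraction` are hypothetical helper names):

```cpp
#include <cassert>
#include <cmath>

// Overall speedup when a fraction p of the per-frame time gets s times faster.
inline double overall_speedup(double p, double s) {
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}

// Largest fraction p that still keeps the overall speedup at or below S
// when the accelerated part is s times faster (Amdahl's law solved for p).
inline double max_fraction(double S, double s) {
    assert(S >= 1.0 && s > 1.0);
    return (1.0 - 1.0 / S) / (1.0 - 1.0 / s);
}
```

With s = 3 and an overall gain under 1% (S = 1.01), `max_fraction` gives p of roughly 0.015, i.e. under this assumption the old filter accounted for no more than about 1.5% of the frame time in that script.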

Actually, the AVX512 question was not specific to this plugin; it was meant as a general one.

AVX2.clang
Code:
FFVideoSource("source.mkv")
fluxsmoothst()


AVX512.vs
Code:
FFVideoSource("source.mkv")
LoadPlugin(".\FluxSmooth-pfmod-v1.4-test2\x64_xp\FluxSmooth.dll")
fluxsmoothst()


AVX512.clang
Code:
FFVideoSource("source.mkv")
LoadPlugin(".\FluxSmooth-pfmod-v1.4-test2\x64\FluxSmooth.dll")
fluxsmoothst()



A real life scenario - AVX2 vs AVX512:

AVX2
Code:
FFVideoSource("1080p.source.mkv")
z_ConvertFormat(pixel_type="RGBPS",colorspace_op="709:709=>rgb:linear", resample_filter_uv="spline36", cpu_type="avx2")
z_ConvertFormat(854,480,resample_filter="spline36",pixel_type="YV12",colorspace_op="rgb:linear=>709:709",dither_type="error_diffusion", cpu_type="avx2")


AVX512
Code:
FFVideoSource("1080p.source.mkv")
z_ConvertFormat(pixel_type="RGBPS",colorspace_op="709:709=>rgb:linear", resample_filter_uv="spline36", cpu_type="skylake-x")
z_ConvertFormat(854,480,resample_filter="spline36",pixel_type="YV12",colorspace_op="rgb:linear=>709:709",dither_type="error_diffusion", cpu_type="skylake-x")
Old 6th April 2019, 07:24   #6  |  Link
pinterf
StvG, thanks for the tests (and giving me motivation for experimenting with this new topic).
I ran my tests with the Intel SDE emulator, which is only good for checking whether the results differ. I can't use it for benchmarks, though: emulation is way too slow.
Actually, the Microsoft build (in the xp folder) cannot do AVX512, specifically AVX512BW, so the AVX512 parts are missing from that build.
The issue was reported to MS.
Old 8th April 2019, 19:50   #7  |  Link
StvG
Thanks for your time.
If you need more benchmarks, I could do them.
All times are GMT +1.