Old 31st January 2018, 15:31   #501  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by Aktan View Post
I think it is still related to Intel compiler.
What do you base that on? Most of the NNEDI3 performance-critical code is in hand-written assembler in this version. In the Vapoursynth version (which uses mostly compiler intrinsics in C++ rather than pure asm) you could argue that the compiler could have a big impact, but not in this version. In fact it's been repeatedly pointed out in this very thread that having a special DLL with compiler optimizations for AVX2 enabled is completely pointless - it has essentially no performance impact at all, since there's nothing performance critical that the compiler can actually optimize.
Old 31st January 2018, 15:34   #502  |  Link
Aktan
Registered User
 
Join Date: Feb 2002
Posts: 303
Quote:
Originally Posted by TheFluff View Post
What do you base that on? Most of the NNEDI3 performance-critical code is in hand-written assembler in this version. In the Vapoursynth version (which uses mostly compiler intrinsics in C++ rather than pure asm) you could argue that the compiler could have a big impact, but not in this version. In fact it's been repeatedly pointed out in this very thread that having a special DLL with compiler optimizations for AVX2 enabled is completely pointless - it has essentially no performance impact at all, since there's nothing performance critical that the compiler can actually optimize.
I will admit it was just a bad guess, as I'm not that familiar with compilers. I guess I won't be able to compile it then, since there is hand-written ASM. I based it on the fact that in the past Intel's compiler selectively optimized only for Intel chips.

Edit: Just curious, if I use the Release_Intel_W7_Core2_AVX DLL or even the Release_Intel_W7_Core2_SSE4.2 DLL and set opt to 5, would AVX2 still be used? I tried that and I still get the same slow speed. I was expecting it to throw an error like invalid opt mode, but I was just guessing.

Last edited by Aktan; 31st January 2018 at 15:45.
Old 31st January 2018, 15:50   #503  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
What happens if you use threads=1?

BTW, if you're willing to try Vapoursynth, you can try znedi3, an attempt at making NNEDI3 on the CPU competitive with the OpenCL version running on the GPU. IIRC it was something like 50-100% faster than the original VS NNEDI3 on 8-bit input, but I don't know if anyone ever tested it on Ryzen.

Last edited by TheFluff; 31st January 2018 at 15:53.
Old 31st January 2018, 15:53   #504  |  Link
Aktan
Registered User
 
Join Date: Feb 2002
Posts: 303
Quote:
Originally Posted by TheFluff View Post
What happens if you use threads=1?

BTW, if you're willing to try Vapoursynth, you can try znedi3, an attempt at making NNEDI3 on the CPU competitive with the OpenCL version running on the GPU. IIRC it was something like 50-100% faster than the original VS NNEDI3 on 8-bit input.
I'll try both in a bit.

Edit: threads=1:

Code:
opt	FPS
4	~4
5	0.11 (~1 frame every 9 seconds)

Last edited by Aktan; 31st January 2018 at 16:02.
Old 31st January 2018, 16:59   #505  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
It's not Ryzen. This one:
Code:
nnedi3_rpow2(rfactor=4, cshift="lanczosresize", fwidth=2880, fheight=1920, ep0=2, nsize=0, nns=4, qual=2, pscrn=4, opt=5)
is giving me 0.03 fps for the first 3-4 frames with opt=0 or opt=5
and >3 fps with opt=4

x64 version freshly recompiled with VS2017 15.5
Intel i7-7700 (AVX2), Win10 x64, Avisynth+ 2591 (dev)
Old 31st January 2018, 17:02   #506  |  Link
Aktan
Registered User
 
Join Date: Feb 2002
Posts: 303
That's interesting. I guess the next step is to figure out which of the params is actually causing it.
Old 31st January 2018, 17:05   #507  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Quote:
Originally Posted by Aktan View Post
Edit: Just curious, if I use the Release_Intel_W7_Core2_AVX DLL or even the Release_Intel_W7_Core2_SSE4.2 DLL and set opt to 5, would AVX2 still be used?
No differences in the code, just compiler options, all versions will behave the same.
Old 31st January 2018, 17:10   #508  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Odd... I don't remember such large differences when testing... But in my tests, I never changed ep0 and always tested pscrn with 2 or 0.
Btw, with 0 I used very small frames (480p), to get some frames without having to wait hours...
I don't have time right now, but I will redo the tests later.
Old 31st January 2018, 17:11   #509  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Missing else

https://github.com/jpsdr/NNEDI3/blob...edi3.cpp#L1725

https://github.com/jpsdr/NNEDI3/blob...edi3.cpp#L1734

fps is OK again for AVX2

(And a request: the AVX2 asm files were missing from the sln file, so I had to add them manually.)
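
For anyone wondering how one missing else can cost that much speed, here is a minimal sketch of the general pattern. It is not the code from nnedi3.cpp: the names `Kernel`, `select_kernel_buggy` and `select_kernel_fixed` are hypothetical, and the opt-to-path mapping (4 = AVX, 5 = AVX2) is only what the posts above suggest. The point is that an unchained `if` lets a later `else` overwrite the fast path that was just selected, so opt=5 silently ends up on the slow path while opt=4 stays fast. opt=0 presumably resolves to the AVX2 path on an AVX2 CPU and would be hit the same way.

```cpp
// Hypothetical names throughout; this is NOT the code from nnedi3.cpp.
// constexpr (C++14) is used only so the behaviour can be checked at compile time.
enum class Kernel { C, AVX, AVX2 };

// Buggy selection: the second `if` is not chained with `else`, so for
// opt == 5 the AVX2 choice is made first and then overwritten by the
// trailing `else`, which belongs to the second `if` only.
constexpr Kernel select_kernel_buggy(int opt) {
    Kernel k = Kernel::C;
    if (opt == 5) k = Kernel::AVX2;
    if (opt == 4) k = Kernel::AVX;   // <- `else` is missing in front of this `if`
    else          k = Kernel::C;     // runs for every opt != 4, including 5
    return k;
}

// Fixed selection: one chain, so exactly one branch is taken.
constexpr Kernel select_kernel_fixed(int opt) {
    Kernel k = Kernel::C;
    if (opt == 5)      k = Kernel::AVX2;
    else if (opt == 4) k = Kernel::AVX;
    else               k = Kernel::C;
    return k;
}
```

With the buggy variant, `select_kernel_buggy(5)` yields `Kernel::C` while `select_kernel_buggy(4)` still yields `Kernel::AVX`, which matches the symptom pattern reported above. The fixed variant returns `Kernel::AVX2` for 5.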
Old 31st January 2018, 17:16   #510  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
I was about to say that I'd check whether I had messed something up, but you beat me to it...

About the ASM files:
Yes, it's normal that they are missing, because I want to keep the project on GitHub at VS2010 (if you want to use something newer than VS2010, you can just upgrade the project), but VS2010 is not able to compile the AVX2 asm files, so including them would result in a broken project. This is why the AVX2 asm is in separate files.

Last edited by jpsdr; 31st January 2018 at 17:37.
Old 31st January 2018, 17:31   #511  |  Link
Aktan
Registered User
 
Join Date: Feb 2002
Posts: 303
Awesome catch! Yep, changing prescreen to original (1) where the bug doesn't exist gave me results comparable to AVX.
Old 31st March 2018, 10:11   #512  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
New version, see first post. I've also added a section there about multi-threading.
__________________
My github.
Old 3rd April 2018, 12:10   #513  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
There is an issue with the Intel versions.

I'll update the release files on GitHub, removing the Intel versions, keeping only the VS version, and adding a VS AVX2 version. Wait at least 24h before checking/re-downloading the files.
Old 3rd April 2018, 20:30   #514  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Intel versions trashed and file updated; please re-download it.
Old 7th April 2018, 12:47   #515  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
New version, see first post. I've also updated the multi-threading section of the text.
Old 25th May 2018, 15:32   #516  |  Link
mp3dom
Registered User
 
Join Date: Jul 2003
Location: Italy
Posts: 1,135
It seems there's a bug in the YUY2 colorspace that creates a green vertical bar on the right side of the image under some circumstances (both x86 and x64):

Code:
colorbars(width=1416,height=1080,pixel_type="yuy2")
nnedi3(0)
Old 25th May 2018, 20:13   #517  |  Link
Taurus
Registered User
 
Join Date: Mar 2002
Location: Krautland
Posts: 903
Quote:
Originally Posted by mp3dom View Post
It seems there's a bug in the YUY2 colorspace that creates a green vertical bar on the right side of the image under some circumstances (both x86 and x64):

Code:
colorbars(width=1416,height=1080,pixel_type="yuy2")
nnedi3(0)
But only if the width is one of the "odd" resolutions... not divisible by 16 (not mod16).
Older versions show the same behaviour.
Just a short test...
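
To pick widths for further testing, a tiny helper like the one below can tell mod16 widths from non-mod16 ones. This is just a sketch, assuming the mod16 observation above holds. The helper names `is_mod16` and `round_up_mod16` are made up for illustration and are not part of NNEDI3 or AviSynth.

```cpp
// Hypothetical helpers for choosing test widths; not part of NNEDI3.

// True when the width is an exact multiple of 16.
constexpr bool is_mod16(int width) { return width % 16 == 0; }

// Smallest multiple of 16 that is >= width (intended for width >= 0).
constexpr int round_up_mod16(int width) { return (width + 15) / 16 * 16; }
```

The width from the report, 1416, is not mod16 (1416 = 88 * 16 + 8), and the next mod16 width above it is 1424, so comparing 1416 against 1408 or 1424 should show whether the bar really follows the mod16 rule.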
Old 25th May 2018, 20:58   #518  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Thanks for reporting, I've found what was wrong.
I want to finish something else I'm working on before making new releases of some filters, so it will be several days before I make new builds, but it's fixed on GitHub.
Old 26th May 2018, 12:35   #519  |  Link
mp3dom
Registered User
 
Join Date: Jul 2003
Location: Italy
Posts: 1,135
Quote:
Originally Posted by jpsdr View Post
So it will be several days before I make new builds, but it's fixed on GitHub.
Thanks
Old 1st June 2018, 10:01   #520  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
New version, see first post.