Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
31st January 2018, 15:31 | #501 | Link |
Excessively jovial fellow
Join Date: Jun 2004
Location: rude
Posts: 1,100
|
What do you base that on? Most of the NNEDI3 performance-critical code is in hand-written assembler in this version. In the Vapoursynth version (which uses mostly compiler intrinsics in C++ rather than pure asm) you could argue that the compiler could have a big impact, but not in this version. In fact it's been repeatedly pointed out in this very thread that having a special DLL with compiler optimizations for AVX2 enabled is completely pointless - it has essentially no performance impact at all, since there's nothing performance critical that the compiler can actually optimize.
|
31st January 2018, 15:34 | #502 | Link | |
Registered User
Join Date: Feb 2002
Posts: 303
|
Quote:
Edit: Just curious, if I use the Release_Intel_W7_Core2_AVX DLL or even the Release_Intel_W7_Core2_SSE4.2 DLL and set opt to 5, would AVX2 still be used? I tried that and I still get the same slow speed. I was expecting it to throw an error like invalid opt mode, but I was just guessing. Last edited by Aktan; 31st January 2018 at 15:45. |
|
31st January 2018, 15:50 | #503 | Link |
Excessively jovial fellow
Join Date: Jun 2004
Location: rude
Posts: 1,100
|
What happens if you use threads=1?
BTW, if you're willing to try Vapoursynth, you can try znedi3, an attempt at making NNEDI3 on the CPU competitive with the OpenCL version running on the GPU. IIRC it was something like 50-100% faster than the original VS NNEDI3 on 8-bit input, but I don't know if anyone ever tested it on Ryzen. Last edited by TheFluff; 31st January 2018 at 15:53. |
31st January 2018, 15:53 | #504 | Link | |
Registered User
Join Date: Feb 2002
Posts: 303
|
Quote:
Edit: threads=1:
Code:
opt  FPS
4    ~4
5    0.11 (~1 frame every 9 seconds)
Last edited by Aktan; 31st January 2018 at 16:02. |
|
31st January 2018, 16:59 | #505 | Link |
Registered User
Join Date: Jan 2014
Posts: 2,314
|
It's not Ryzen. This one:
Code:
nnedi3_rpow2(rfactor=4, cshift="lanczosresize", fwidth=2880, fheight=1920, ep0=2, nsize=0, nns=4, qual=2, pscrn=4, opt=5)
and >3 fps with opt=4
x64 version freshly recompiled with VS2017 15.5
Intel i7-7700 (AVX2), Win10 x64, Avisynth+ 2591 (dev) |
31st January 2018, 17:10 | #508 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
Odd... I don't remember such big differences when testing... But in my tests, I never changed ep0 and always tested pscrn with 2 or 0.
Btw, with 0 I used very small frames (480p), to get some frames without having to wait hours... I don't have time for now, but I will redo the tests later. |
31st January 2018, 17:11 | #509 | Link |
Registered User
Join Date: Jan 2014
Posts: 2,314
|
Missing else
https://github.com/jpsdr/NNEDI3/blob...edi3.cpp#L1725
https://github.com/jpsdr/NNEDI3/blob...edi3.cpp#L1734
fps is OK again for AVX2
(And a request: the AVX2 asm files were missing from the sln file; I had to add them manually.) |
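To illustrate how a single missing `else` can produce this kind of slowdown, here is a minimal sketch. It is not the code from the linked lines (which are not reproduced in this thread); the enum, the function names and the meaning given to opt=4 are hypothetical. Only "opt=5 selects the AVX2 path" is taken from the posts above.

```cpp
// Hypothetical code paths; names are illustrative only.
enum class Kernel { Fallback, Level4, Avx2 };

// Buggy chain: the second `if` is not joined to the first one by `else`,
// so for opt == 5 its trailing `else` branch runs as well and overwrites
// the AVX2 choice with the slow fallback.
Kernel pick_kernel_buggy(int opt) {
    Kernel k = Kernel::Fallback;
    if (opt >= 5) {
        k = Kernel::Avx2;
    }
    if (opt == 4) {            // <-- `else` missing in front of this `if`
        k = Kernel::Level4;
    } else {
        k = Kernel::Fallback;
    }
    return k;
}

// Fixed chain: exactly one branch is taken.
Kernel pick_kernel_fixed(int opt) {
    if (opt >= 5) {
        return Kernel::Avx2;
    } else if (opt == 4) {
        return Kernel::Level4;
    } else {
        return Kernel::Fallback;
    }
}
```

With the buggy chain, opt=5 ends up on the fallback path while opt=4 is unaffected, which would match a symptom where only opt=5 is slow.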
31st January 2018, 17:16 | #510 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
I was about to say that I'd check whether I had messed something up, but you beat me to it...
About the ASM files: Yes, it's normal that they are missing, because I want to keep the project on GitHub on VS2010 (if you want to use something newer than VS2010, you can just upgrade the project), but VS2010 is not able to compile the AVX2 asm files, so including them would result in a broken project. This is why the AVX2 asm is in separate files. Last edited by jpsdr; 31st January 2018 at 17:37. |
3rd April 2018, 12:10 | #513 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
There is an issue with the Intel versions.
I'll update the release files on GitHub, removing the Intel versions, keeping only the VS version, and adding a VS AVX2 version. Wait at least 24h before checking/re-downloading the files.
__________________
My github. |
25th May 2018, 15:32 | #516 | Link |
Registered User
Join Date: Jul 2003
Location: Italy
Posts: 1,135
|
It seems there's a bug in the YUY2 colorspace that creates a green vertical bar on the right side of the image under some circumstances (both x86 and x64):
Code:
colorbars(width=1416,height=1080,pixel_type="yuy2")
nnedi3(0) |
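As background on the layout involved, here is a small sketch of the row-size arithmetic for the width in the script above. The thread does not say what caused the bar, so this is not a diagnosis; the helper names are hypothetical, and the 16- and 32-byte vector sizes are only picked as typical SSE and AVX2 register widths.

```cpp
#include <cstddef>

// YUY2 is packed 4:2:2: each pair of pixels is stored as Y0 U Y1 V (4 bytes),
// so a row of `width` pixels holds width * 2 bytes of picture data.
constexpr std::size_t yuy2_row_bytes(std::size_t width) {
    return width * 2;
}

// Bytes left over at the end of a row after whole vectors of `vector_bytes`.
constexpr std::size_t tail_bytes(std::size_t row_bytes, std::size_t vector_bytes) {
    return row_bytes % vector_bytes;
}

// Width taken from the report above.
constexpr std::size_t kRowBytes = yuy2_row_bytes(1416);

static_assert(kRowBytes == 2832, "1416 pixels * 2 bytes per pixel");
static_assert(tail_bytes(kRowBytes, 16) == 0, "whole number of 16-byte vectors");
static_assert(tail_bytes(kRowBytes, 32) == 16, "16 bytes left after 32-byte vectors");
```

So a 1416-pixel YUY2 row divides evenly into 16-byte chunks but leaves a 16-byte tail with 32-byte chunks; code that handles the right edge of a row has to cope with such tails.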
25th May 2018, 20:13 | #517 | Link | |
Registered User
Join Date: Mar 2002
Location: Krautland
Posts: 903
|
Quote:
Older versions show the same behaviour. Just a short test.. |
|
25th May 2018, 20:58 | #518 | Link |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,316
|
Thanks for reporting, I've found what was wrong.
I want to finish something else I'm working on before making new releases of some filters. So it will be several days before I make new builds, but it's fixed on GitHub.
|
|