Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
10th October 2015, 05:17 | #41 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
I further increased performance by using DirectXMath's DirectX::PackedVector::XMConvertFloatToHalfStream instead of D3DXFloat32To16Array.
It went from 18.5 fps to 20 fps, with CPU usage at only 40%. ConvertToFloat is faster when calculating in INT, but ConvertFromFloat is faster with FLOAT than with INT. |
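For readers unfamiliar with what a float-to-half "stream" conversion does, here is a minimal sketch in Python. It only illustrates the data format (32-bit floats packed into 16-bit IEEE 754 half floats, a whole buffer per call). It is not the DirectXMath implementation, and the helper names `floats_to_half_bits` and `half_bits_to_floats` are made up for this example.

```python
import struct

def floats_to_half_bits(values):
    """Convert a whole buffer of floats to 16-bit half-float bit patterns in one call."""
    count = len(values)
    packed = struct.pack("<%de" % count, *values)        # 'e' = IEEE 754 binary16
    return list(struct.unpack("<%dH" % count, packed))   # reinterpret as uint16

def half_bits_to_floats(bits):
    """Inverse direction: 16-bit half-float bit patterns back to Python floats."""
    count = len(bits)
    packed = struct.pack("<%dH" % count, *bits)
    return list(struct.unpack("<%de" % count, packed))

if __name__ == "__main__":
    bits = floats_to_half_bits([0.0, 1.0, -2.0, 0.5, 0.1])
    print([hex(b) for b in bits])
    # 0.1 is not exactly representable in half precision, so expect a small error.
    print(half_bits_to_floats(bits))
```

Half floats halve the memory per sample compared with 32-bit floats, at the cost of precision, which is why values such as 0.1 do not round-trip exactly.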
5th November 2015, 04:19 | #42 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
SuperRes doesn't work like typical resize algorithms. It runs around other resizers. So let's say you want to use NNEDI3: it takes the larger image and the original image, does a Bicubic resize on the enlarged image to size it back down to the original, and then creates a difference map between the two, showing the details that were lost during the upsizing. From that diff map, it restores details and edges that were lost. Brilliant idea. It makes even basic resizers like Bilinear look decent.
It's not that I think your work is bad, but to me it looks more like detail restoration than detail "addition", which is the case with super resolution. On a single image, SR works by searching for similar patterns and using the slightly different details in each one of them.
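The loop described above (upscale, size back down, diff against the original, feed the diff back) can be sketched on a 1-D signal. This is only a toy illustration of the description in this thread, not Shiandow's actual SuperRes shaders: linear interpolation stands in for NNEDI3, a box filter stands in for the Bicubic downscale, and the diff is upscaled with nearest neighbour. All function names are hypothetical; `passes` and `strength` mirror the parameter names used in the thread.

```python
def upscale_linear_2x(lo):
    """Stand-in upscaler: doubles the sample count with linear interpolation."""
    hi = []
    for i, v in enumerate(lo):
        nxt = lo[min(i + 1, len(lo) - 1)]
        hi.append(v)
        hi.append((v + nxt) / 2.0)
    return hi

def downscale_box_2x(hi):
    """Stand-in downscaler: averages each pair of samples."""
    return [(hi[2 * i] + hi[2 * i + 1]) / 2.0 for i in range(len(hi) // 2)]

def upscale_nearest_2x(lo):
    """Brings the diff map up to the enlarged size by repeating each sample."""
    return [v for v in lo for _ in range(2)]

def superres_sketch(lo, passes=2, strength=0.42):
    hi = upscale_linear_2x(lo)
    for _ in range(passes):
        # Diff map: what the original has that the re-downscaled enlargement lacks.
        diff = [a - b for a, b in zip(lo, downscale_box_2x(hi))]
        correction = upscale_nearest_2x(diff)
        hi = [h + strength * c for h, c in zip(hi, correction)]
    return hi

def residual(lo, hi):
    """Largest remaining difference between the original and the re-downscaled image."""
    return max(abs(a - b) for a, b in zip(lo, downscale_box_2x(hi)))

if __name__ == "__main__":
    original = [0.0, 0.0, 1.0, 1.0, 0.0]
    print(residual(original, upscale_linear_2x(original)))
    print(residual(original, superres_sketch(original)))
```

In this toy setup each pass shrinks the residual by a factor of (1 - strength), so the enlarged image becomes progressively more consistent with the original. Whether that counts as "restoring" or "adding" detail is exactly the question raised in the post.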
__________________
Searching for great solutions |
6th November 2015, 00:11 | #43 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
New version v0.9.1 is released. It greatly reduces memory usage! This version allows running several shaders in a row by creating command chains and calling ExecuteShader() at the end.
https://github.com/mysteryx93/AviSynthShader
As for SuperRes, it cannot yet fully benefit from this as I'm still missing a Bicubic downscaling shader that needs to be run in the middle. I can only combine 2 of the shader calls (twice if doing 2 passes), yet that's enough to considerably reduce memory usage. ConvertToFloat and ConvertFromFloat have also been modified to reduce memory usage. With this version, you'll be able to run 8 threads without any issue.
If I can get a Bicubic downscaler, then we could remove the unnecessary ConvertFromFloat and ConvertToFloat calls, as well as chain all of the commands to run at once, which would greatly improve memory usage and performance. This Cubic code would work for Bicubic upscaling, but Bicubic downscaling requires a few tweaks. I can't do this as I know nothing about HLSL programming.
https://github.com/zachsaw/MPDN_Exte...er/Chroma.hlsl
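On the remark that Bicubic downscaling needs a few tweaks compared with upscaling: one commonly used tweak (an assumption here, not necessarily what is needed for the linked shader) is to stretch the cubic kernel by the downscale factor so it also acts as a low-pass filter, which means more than 4 taps per output sample. A rough sketch in Python rather than HLSL, with hypothetical function names and the Catmull-Rom coefficient a=-0.5 chosen arbitrarily:

```python
import math

def cubic(x, a=-0.5):
    """Keys cubic convolution kernel; a=-0.5 is the Catmull-Rom variant."""
    x = abs(x)
    if x < 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

def tap_weights(center, scale):
    """Source taps and normalized weights for one output sample.

    center: position of the output sample in source coordinates.
    scale:  destination size / source size (below 1 means downscaling).
    """
    stretch = max(1.0, 1.0 / scale)   # widen the kernel only when downscaling
    radius = 2.0 * stretch
    first = math.floor(center - radius) + 1
    last = math.floor(center + radius)
    taps = list(range(first, last + 1))
    raw = [cubic((t - center) / stretch) for t in taps]
    total = sum(raw)
    return [(t, w / total) for t, w in zip(taps, raw)]

if __name__ == "__main__":
    print(len(tap_weights(10.3, 2.0)))   # upscaling 2x: 4 taps
    print(len(tap_weights(10.3, 0.5)))   # downscaling 2x: 8 taps
```

With a fixed 4-tap kernel, a 2x downscale would skip over source pixels and alias, which is presumably part of why the upscaling code cannot be reused unchanged.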
__________________
FrameRateConverter | AvisynthShader | AvsFilterNet | Natural Grounding Player with Yin Media Encoder, 432hz Player, Powerliminals Player and Audio Video Muxer Last edited by MysteryX; 6th November 2015 at 00:15. |
6th November 2015, 00:42 | #44 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
Here's a comparison of the image quality.
Original
Spline16
NNEDI3(nns=4)
NNEDI3(nns=4)+SuperRes(passes=2, strength=.42)
The results speak for themselves. It makes the image sharper without creating any artificial details. If I eventually get a Bicubic HLSL downscaler, there might be a 'slight' further quality improvement.
Last edited by MysteryX; 6th November 2015 at 02:02. |
6th November 2015, 01:10 | #46 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
MysteryX, I didn't say that any SR algorithm creates artificial details. What I meant is that common SR algorithms analyze a set of neighbouring frames or, in a single image, a lot of similar patterns, in order to, let's say, replicate details from one frame to another.
Your algorithm first upscales the image using some algorithm, then downscales it using Bicubic, compares it to the original image, creates a difference map, upscales the missing parts and pastes them into the upscaled image. That is what I understood. But, generally, both upscaling and downscaling lose detail, so the difference map will probably show a lot of difference. What I really want to know is how the details that are in the original image, but not in the downscaled upscaled image, are pasted into the upscaled image, i.e., how these details are upscaled.
David Horman, I don't see much difference either. The most I saw was some ringing disappearing in the SuperRes image.
|
6th November 2015, 01:46 | #48 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
Here are some tests with the lighthouse and clown.
Original
Spline16
NNEDI3(nns=4, cshift="Spline16Resize")
NNEDI3+SuperRes(passes=2, strength=.42)
SuperRes+nnedi3_rpow2(rfactor=2, nsize=0, nns=4, qual=2, etype=0, cshift="SincResize", ep0=4, threads=0, opt=0, fapprox=0)
NNEDI3(nns=4, cshift="Spline16Resize")+SuperRes(passes=3, Strength=1, Softness=.85)
I see more difference between NNEDI3 and NNEDI3+SuperRes than between Spline16 and NNEDI3.
There's also something funny happening with the reds... the reds are different but actually look better with SuperRes. I've seen a video with a chair where half of it was plain red (color cropping), and the texture of the chair somehow came back after passing it through SuperRes; it looked more like a chair afterwards. It must have to do with the way it's doing color conversion, but it's accidental. It seems to 'sometimes' recover cropped colors. Somehow. Another time I've seen it turn overflow colors into some other color.
Last edited by MysteryX; 6th November 2015 at 02:43. |
6th November 2015, 02:00 | #49 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
luquinhas0021, I can't answer the technical question of how it does its job internally, but in SuperRes.avsi you can see the diff map by returning the output of SuperResDiff.cso instead of processing the image with it.
Bloax, here's the result with your image.
Last edited by MysteryX; 6th November 2015 at 02:40. |
6th November 2015, 02:06 | #50 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
Nice, MysteryX. The differences between nnedi3 upscaling and SuperRes nnedi3 upscaling, at least the ones I notice, are less haloing and more sharpness. What about using Sinc4 and Lanczos4 and applying super resolution to each one?! I believe you applied SR to nnedi3 nns=4 with default parameters. What would happen if SR was applied to this script...?
nnedi3_rpow2(rfactor=2, nsize=0, nns=4, qual=2, etype=0, cshift="SincResize", ep0=4, threads=0, opt=0, fapprox=0)
|
6th November 2015, 02:31 | #52 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
It stayed sharper than all the other algorithms, as I expected! Later I will test your SR algorithm with Lanczos4, because Sinc produces a lot of ringing.
Only one question: have you already tried etype=1 (minimize squared error)? Compared with etype=0, which do you like more? I just saw that you used Sinc with ep0=4.
Last edited by luquinhas0021; 6th November 2015 at 02:39. |
6th November 2015, 02:39 | #53 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
I copy/pasted your command. I personally prefer Spline16 over Sinc, which looks more artificially processed.
I'm adding another test: NNEDI3(spline16)+SuperRes(Passes=3, Strength=1, Softness=.85)
Shiandow added Softness for the purpose of being able to use higher strength and passes and then softening it down. So far I wasn't convinced... I'll see better with this test. In the previous tests I made, these settings looked good too, but not better than Passes=2, Strength=.42. Let's see what we get! HD pictures might make it better.
OK. With Softness, Clown and Eclipse look a LOT better, but it makes Lighthouse look like a painting. It works for some content but not all. Passes=2 with Strength=.42 gives more consistent results.
Last edited by MysteryX; 6th November 2015 at 02:46. |
6th November 2015, 02:49 | #54 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
The algorithm you just posted generates more ringing in the letters of the Eclipse image, but it increases sharpness in the Clown image. Maybe using 3 passes of SuperRes isn't a great thing to do on all images. I suggest you use my script but, instead of (...cshift="SincResize", ep0=4...), use (...cshift="Spline144Resize"...) with passes=2, strength=1 and softness=0.4.
In the Lighthouse image, your last script generated aliasing in some parts.
Last edited by luquinhas0021; 6th November 2015 at 02:54. |
6th November 2015, 07:07 | #55 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
Isn't Spline144 a broken algorithm? You're free to play with the algorithms and post your results.
Here are some tests with the lighthouse using NNEDI3(nns=4, cshift="Spline16Resize").
SuperRes(passes=2, strength=XXX, softness=0)
Strength=30, 40, 50, 60, 100
SuperRes(passes=3, strength=1, softness=XXX)
Softness=30, 40, 50, 60, 70, 80, 90, 100
Which one is your favorite?
When playing with madVR, I found that NEDI+SuperRes were doing a good job together; it would be worth a try to see how it compares to NNEDI3. Jinc+SuperRes, however, don't go well together.
SuperRes(passes=2, strength=.4, softness=0)
NNEDI3(nns=4, cshift="Spline16Resize")
NNEDI2(cshift="Spline16Resize")
NNEDI3(nns=3, cshift="Spline16Resize")
NNEDI3(nns=1, cshift="Spline16Resize")
Honestly... NNEDI2 is doing ALMOST as well as NNEDI3(nns=4), but NNEDI3 with lower NNS gets blurrier. NNEDI2 gives a sharp output. The only downside is distortion on the white bars.
Last edited by MysteryX; 6th November 2015 at 07:40. |
6th November 2015, 22:24 | #56 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
I have added a variant of SuperRes that does the YUV conversion via shaders. Performance is slightly lower, but the quality of colors is better. When doing YUV conversion on the CPU, I get 21fps. However, it is doing Rec.601 color conversion on Rec.709 content! When doing YUV conversion via shaders, I get 17fps (including processing NNEDI3). The first implementation might cause a very slight color distortion, and the 2nd implementation makes the colors more vivid.
To use this variant, use the file SuperResYUV.avsi.
Here's the comparison, using SuperRes(passes=2, strength=.42, softness=0):
CPU conversion / GPU conversion
EDIT: Now this is embarrassing and strange... when I put both versions side by side on my computer, I can clearly see a difference. But once converted to PNG, I honestly can't see any difference at all! Perhaps whatever color difference this makes is being discarded by PNG compression? But PNG is supposed to be lossless. Not sure on this one.
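To give a feel for what "Rec.601 color conversion on Rec.709 content" does to colors, here is a small sketch in Python. It uses the standard luma coefficients of the two standards (Kr=0.299, Kb=0.114 for Rec.601; Kr=0.2126, Kb=0.0722 for Rec.709) on normalized full-range values. It is not the conversion code used by the plugin or by AviSynth, and the function names are made up for the example.

```python
REC601 = (0.299, 0.114)     # (Kr, Kb)
REC709 = (0.2126, 0.0722)   # (Kr, Kb)

def rgb_to_ycbcr(rgb, coeffs):
    """Normalized RGB in [0, 1] to (Y, Cb, Cr), with Cb and Cr centered on 0."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    r, g, b = rgb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))
    cr = (r - y) / (2.0 * (1.0 - kr))
    return (y, cb, cr)

def ycbcr_to_rgb(ycbcr, coeffs):
    """Inverse of rgb_to_ycbcr for the same coefficient pair."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    y, cb, cr = ycbcr
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return (r, g, b)

if __name__ == "__main__":
    red = (0.8, 0.2, 0.2)
    encoded = rgb_to_ycbcr(red, REC709)      # content stored as Rec.709
    print(ycbcr_to_rgb(encoded, REC709))     # matching matrix: original color
    print(ycbcr_to_rgb(encoded, REC601))     # mismatched matrix: shifted color
```

Neutral grays survive the mismatch because their chroma is zero. Saturated colors shift, which fits the "very slight color distortion" described above.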
Last edited by MysteryX; 6th November 2015 at 22:35. |
6th November 2015, 22:40 | #57 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
I don't know if you saw the same things as me, but there's a strange effect in the lighthouse when using softness with passes=3: the greater the amount of softness, the very slightly sharper the photo gets. I don't know why!
You asked me which parameters I like. To avoid taking risks, I prefer passes=2, strength=1, softness=0. From passes=3 on, it looks like the algorithm blurs the photo and, with passes=3, softness sharpens it. I may be wrong, but this is what I've seen.
|
6th November 2015, 23:48 | #58 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
Yeah, I don't like the effect of Softness; it's not working well. Shiandow might improve it in the future. For now, Passes=2 and Strength=.42 works best for noisy material, and for clear images, you can get away with 2 passes of strength up to 1.
I figured out what I did wrong in the last test: I was upscaling with nns=1 and comparing with the previous results that had nns=4! Here's a new test of frame quadrupling.
Spline16
edi_rpow2(2, nns=4, cshift="Spline16Resize") (uses NNEDI3 but fixes a few details)
SuperRes(passes=2, strength=.42)
SuperRes (YUV conversion done on GPU)
SuperRes (YUV conversion done on GPU) + NNEDI2
Interestingly enough, the 3rd one (which is the one I had been testing before) has some distortion with frame quadrupling! Not sure where that's coming from... The newer implementation doesn't have that distortion... or is that the distortion I saw when using NNEDI2? Here's also the result with NNEDI2 to compare.
EDIT: NNEDI2 is *slower* than NNEDI3(nns=4), so we can discard it, although it gives *almost* identical results. The new implementation with YUV conversion on the GPU does give considerably better image quality. There's a lot less color distortion on the Lighthouse.
Last edited by MysteryX; 7th November 2015 at 05:35. |
7th November 2015, 05:48 | #60 | Link |
Soul Architect
Join Date: Apr 2014
Posts: 2,559
|
I have improved the performance of the YUV conversion via shaders.
You'll need Shader.dll and the AVSI and CSO files within "Shaders\SuperRes". You might have to specify the "folder" argument to tell SuperRes.avsi where to find all the CSO files. Ideally I'll want to automate that parameter.
SuperRes converts colors on the CPU, while SuperResYUV converts via shaders. I'll probably remove the CPU-conversion implementation as the other one gives better quality, and now performance is similar. It "was" slower, but then the CPU usage was also lower, so you could just increase the number of threads to make up for it. Now it runs even better. |