Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Old 19th January 2018, 17:09   #21  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Looks promising. It definitely has a more "high-res" look to it than any other non-GAN algorithm I've seen so far. Of course, a key question will be how much the artifacts go down once it's fully trained.

As much as I like comparisons to NGU, they're probably not very fair, because NGU was created to run in real time. @feisty2, how long does it take to upscale one 1080p video frame (using the Tesla), just to get a first impression of the speed?

(And if I may suggest, try using L1 loss instead of L2 loss.)
Old 20th January 2018, 04:57   #22  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
@cwk
thx, but my school is now funding 10 Tesla P100 GPUs for me for NTIRE 2018

@madshi
I tried L1 loss before and it somehow failed to converge for my neural net. I kept L1 and experimented with tons of weird tricks like "residual scaling", "warming up with a very low learning rate", "gradient clipping", ..., and all of them failed to make L1 converge. Things went back to normal as soon as I switched to L2 loss: it just converged, no extra hacks needed.
it takes a few seconds to upscale 512x512 to 1024x1024 on the Tesla P100
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 20th January 2018, 10:33   #23  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Hmmm... So about a minute for upscaling one 1080p frame?

Strange, I've never had any trouble with L1 yet. Of course L2 helps achieve better PSNR, which might be beneficial for your paper; L1 usually looks better to my eyes, though. You could try Huber or Charbonnier loss as alternatives, or something like 10% L2 and 90% L1. Maybe those 10% L2 could help "guide" L1 with your net?
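For concreteness, here's a per-element sketch of the loss options mentioned above, in plain Python; the delta/eps values and the 10%/90% weighting are just illustrative defaults, not anything either implementation actually uses:

```python
import math

def l1(err):
    # absolute error: robust to outliers, tends to look sharper
    return abs(err)

def l2(err):
    # squared error: optimizes PSNR directly, tends to average plausible solutions
    return err * err

def huber(err, delta=1.0):
    # quadratic inside [-delta, delta], linear outside
    if abs(err) <= delta:
        return 0.5 * err * err
    return delta * (abs(err) - 0.5 * delta)

def charbonnier(err, eps=1e-3):
    # smooth, everywhere-differentiable approximation of L1 (used in LapSRN)
    return math.sqrt(err * err + eps * eps)

def mixed(err, w2=0.1):
    # the "10% L2 + 90% L1" blend suggested above
    return w2 * l2(err) + (1.0 - w2) * l1(err)
```

In a real training loop these would be applied element-wise to the residual between the network output and the ground-truth patch and then averaged.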
Old 20th January 2018, 15:16   #24  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
results after 150 epochs of backprop



@madshi
I've known about Charbonnier loss from LapSRN (CVPR 2017) but never got around to trying it; maybe I'll give it a shot
Old 21st January 2018, 10:24   #25  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Looks even "finer structured" than before now; it looks increasingly similar to GAN-upscaled images. Are you still training with only 3 images? I wonder if training with such a small dataset might make your NN behave somewhat more GAN-like. My understanding of GAN is that (very simplified) the NN learns a tendency to pick exactly one most likely solution, instead of averaging a number of probable solutions. So I wonder: with such a large capacity and such a small dataset, maybe your NN ends up having exactly one solution for each problem that is far more likely than the others, resulting in something which looks more like GAN output? Maybe if you train with a larger dataset, the typical look of averaging NNs comes back to some extent, because the NN then has more possibly matching solutions, leading to more averaging?

Just wildly speculating, of course.

Would also like to see how your algo does on this image:

http://madshi.net/clown.png

I like the clown and castle images because they have a good mixture of high contrast geometric features and nature (trees, bushes etc). A good algo should do well on both. E.g. adding weird dot-crawl artifacts etc can make random textures (like trees, bushes etc) look even more realistic, but can harm elsewhere...
Old 21st January 2018, 11:59   #26  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133

well, the training set contains 3 actual images, augmented to 18 in total. The uncompressed binary of the training set is around 1GB (after augmentation and overlapped slicing) and the weights file is around 20MB, so there's no way it could have memorized all the training samples. But yeah, it has been overfitting because the training set is still way too small. I'll expand it to a much larger one for NTIRE 2018 and see how it goes; maybe you're right.
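One plausible way 3 source images become 18 is a 6-way geometric augmentation (flips plus 90-degree rotations). The exact scheme used isn't stated, so the following is only a guess, sketched on nested lists standing in for image arrays:

```python
def augment(img):
    # hypothetical 6-way augmentation: identity, horizontal flip, vertical
    # flip, and the three non-trivial 90-degree rotations; 3 images x 6 = 18
    def hflip(m): return [row[::-1] for row in m]
    def vflip(m): return m[::-1]
    def rot90(m): return [list(r) for r in zip(*m[::-1])]  # clockwise
    r1 = rot90(img)
    r2 = rot90(r1)
    r3 = rot90(r2)
    return [img, hflip(img), vflip(img), r1, r2, r3]
```

Overlapped slicing would then cut each augmented image into many overlapping patches, which is how 18 images can blow up to ~1GB of training data.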
Old 21st January 2018, 12:44   #27  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
One thing the clown and castle images are also good for is to check how consistent the upscaling algo is in keeping line thinning roughly constant. E.g. if you look at the white van in the clown image, it has a stripe in the middle. Ideally an image upscaling algo should thin this stripe the same way from left to right. You can see in your upscaling result that the stripe is partially thinned very much and partially not thinned at all. For comparison, here's NGU:

NGU Sharp Low Quality:
NGU Sharp Very High Quality:

In Low Quality, the stripe is not thinned equally, either, and not thinned much at all. In Very High Quality, the result is not perfect, but quite ok.

Do you have good training images? If not, I can send you some nice ones.
Old 21st January 2018, 12:53   #28  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
Quote:
Do you have good training images? If not, I can send you some nice ones.
yeah, I've got the DIV2K training set from NTIRE 2018; it has 900 extremely high quality 2K images and 100 low-res test samples. The reason I used only 3 of them was that I didn't have enough GPUs to handle a larger training set

Quote:
One thing the clown and castle images are also good for is to check how consistent the upscaling algo is in keeping line thinning roughly constant. E.g. if you look at the white van in the clown image, it has a stripe in the middle. Ideally an image upscaling algo should thin this stripe the same way from left to right. You can see in your upscaling result that the stripe is partially thinned very much and partially not thinned at all. For comparison, here's NGU:
well, that's because I only trained it on 3 actual images...
Old 21st January 2018, 13:53   #29  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Yes, just saying it's a good test image to check things like that...
Old 22nd January 2018, 11:58   #30  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
Quote:
Originally Posted by madshi View Post
Strange thing, never had any trouble with L1 yet. Of course L2 helps achieve better PSNR, which might be beneficial for your paper. L1 usually looks better to my eyes, though. You could try Huber or Charbonnier loss as alternatives. Or you could try something like 10% L2 and 90% L1. Maybe those 10% L2 could help "guiding" L1 with your net?
After a few experiments I found that log-cosh loss (defined as ln(cosh(error))) generally works best for my neural net. It has the following mathematical properties:


so it behaves similarly to Huber loss, but it's a bit nicer because it's analytic, and so is every order of its derivatives; it has no non-differentiable points, while Huber loss does (the Huber loss itself is differentiable, but its derivative is not, so it fails to be differentiable at higher orders)
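A minimal sketch of log-cosh and its gradient in plain Python (the overflow-guard threshold of 20 is just a safe choice, since cosh overflows for large arguments while ln(cosh(x)) approaches |x| - ln(2)):

```python
import math

def logcosh(err):
    # ln(cosh(x)); for large |x|, guard against cosh overflow using the
    # asymptote ln(cosh(x)) -> |x| - ln(2)
    a = abs(err)
    if a > 20.0:
        return a - math.log(2.0)
    return math.log(math.cosh(err))

def logcosh_grad(err):
    # d/dx ln(cosh(x)) = tanh(x): smooth everywhere, bounded in (-1, 1),
    # ~x near 0 (like L2's gradient) and ~sign(x) far out (like L1's)
    return math.tanh(err)
```

So near zero it acts like L2 (quadratic, gentle gradients), and for large errors it acts like L1 (linear, bounded gradients), which is exactly the Huber-like behavior described above.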
Old 22nd January 2018, 12:05   #31  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Interesting! Might give that a try, too.
Old 24th January 2018, 18:54   #32  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
the ground truth HD version of the castle image, extracted from DIV2K, if anyone is interested
Old 24th January 2018, 19:19   #33  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
It's not exactly the same image (e.g. the clouds are in a different position), but I suppose that doesn't matter much. Generally, I think it's a really good test image. So we could create a low-res version of this ground truth and then use that instead of the other image. Of course, a good question is which downscaling algo should be used for that.
Old 24th January 2018, 19:28   #34  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,133
Quote:
Originally Posted by madshi View Post
It's not exactly the same image (e.g. the clouds are in a different position), but I suppose it doesn't matter much. Generally, I think it's a really good test image. So we could create a low-res version of this groundtruth and then use that instead of the other image. Of course a good question would be which downscaling algo should be used for that.
I'd personally very much prefer Spline64, which seems like a balanced choice to me, but currently my neural net is optimized for bicubic (b=c=1/3) because that's what NTIRE 2018 asked for..
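Both choices discussed here live in the Mitchell-Netravali cubic family: b=c=1/3 is the Mitchell filter (what NTIRE's bicubic refers to above) and b=0, c=0.5 is Catrom. A sketch of the 1D kernel:

```python
def cubic_kernel(x, b=1/3, c=1/3):
    # Mitchell-Netravali piecewise cubic; defaults to b=c=1/3 (Mitchell),
    # while b=0, c=0.5 gives Catmull-Rom ("Catrom")
    x = abs(x)
    if x < 1.0:
        return ((12 - 9*b - 6*c) * x**3
                + (-18 + 12*b + 6*c) * x**2
                + (6 - 2*b)) / 6.0
    if x < 2.0:
        return ((-b - 6*c) * x**3
                + (6*b + 30*c) * x**2
                + (-12*b - 48*c) * x
                + (8*b + 24*c)) / 6.0
    return 0.0
```

Catrom interpolates exactly (weight 1 at x=0, weight 0 at the other integers), while Mitchell trades a little blur for less ringing; a downscaler built on this kernel (with the kernel width scaled by the shrink factor) produces the degradation the net is trained to invert.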
Old 24th January 2018, 22:36   #35  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Yeah, almost all papers test with Catrom, which makes sense because it makes PSNR/SSIM results more comparable. But for real-life use it would be interesting to test the final NN with different downscaling methods, too.
Old 30th January 2018, 07:30   #36  |  Link
edcrfv94
Registered User
 
Join Date: Apr 2015
Posts: 77
Can NGU do restoration or line thinning without upscaling?
Old 30th January 2018, 09:05   #37  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Wrong thread, edcrfv94. Anyway, short answer: That's not what it was made for. However, there are special cases where it might still do what you want. E.g. if the video you're playing was upscaled by the studio with a soft/bad upscaler (e.g. many UHD Blu-Rays were upscaled with Catrom), then you could try downscaling it to half resolution first, then upscale it again with NGU. That might then remove some aliasing and produce thinner lines. Follow-up please in the madVR thread.
Old 24th February 2018, 11:15   #38  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
One month later. Good progress on your network training, feisty2?
Old 26th February 2018, 09:40   #39  |  Link
Socia
Registered User
 
Join Date: Aug 2017
Posts: 1
Quote:
Originally Posted by cwek View Post
How about a kickstarter campaign to rent a GPU-heavy instance in AWS for a couple of days? I'd contribute if it got you closer to a fully-trained instance.
That's an awesome idea imo. How much would something like that cost?

Old 5th March 2018, 23:29   #40  |  Link
cwk
Registered User
 
Join Date: Jan 2004
Location: earth, barely
Posts: 95
Quote:
Originally Posted by Socia View Post
That's an awesome idea imo. How much would something like that cost?
Last I checked, Google has the best GPU offerings, including the Nvidia P100 chips. The P100s are roughly $750/month as an add-on to any other Google Compute Engine instance, so roughly $900-$1200/month total for a single GPU, depending on your instance type. You could add additional GPUs to an instance depending on your needs.

Pricing for the GPUs is here: https://cloud.google.com/compute/pricing#gpus