22nd November 2017, 16:41   #65
feisty2
Quote:
Originally Posted by madshi
Very good, thank you! I've de-rotated/de-mirrored your images and here they are for easy comparison:

no modification - | - rotated left - | - rotated right - | - mirrored horizontally

If you compare these images, you'll see that the texture changes a lot in all 4 images; it's completely different in each frame. It still has an overall similar look, but the changes are much bigger than any dithering, so in motion this will look extremely noisy/unstable.

For still images it might not matter too much, but for video this type of "texture hallucination" is IMHO currently not feasible in motion, because it is not stable when the image content changes slightly. I'm not sure if the algorithm could be changed to fix this problem. I kind of doubt it because the algo by design doesn't even try to restore the original texture (which is technically impossible, anyway), it just tries to hallucinate a texture which hopefully has a similar look to the texture the original hi-res image had before downscaling. So the algo is by design not able to maintain a stable "position" of the texture in motion.

Even worse, if you look at the very bottom of the image, the asphalt texture changes its brightness very strongly from frame to frame, which will produce visible flickering in motion. That said, these brightness fluctuations should be fixable with better neural network training.
The loss function defined in SRGAN has two parts: a content loss and an adversarial loss. The content loss is defined as a perceptual loss, i.e. a high-level VGG feature loss rather than a pixel loss (SAD/MSE). The adversarial loss tries to make the reconstructed image look as close to a native high-res image as possible in general, and probably 80% of the magic comes from this part, so it stays put.

A different content loss would not affect the "hi res" magic much, but it would determine how "rich" the details in the generated image are. Here you could remove the perceptual loss and replace it with MSE: the perceptual loss gives rich but unstable details, while MSE gives blurry but stable details. The adversarial loss paired with MSE would give you a slightly blurry but stable result that still looks like native high resolution in general. I guess you could try this and see if it works out alright.
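To make the two-part structure concrete, here is a minimal NumPy sketch of the modified generator loss being suggested: a pixel-wise MSE content term (standing in for the VGG perceptual loss) plus a weighted adversarial term. Function names and the 1e-3 adversarial weight are illustrative assumptions, not code from any SRGAN implementation:

```python
import numpy as np

def mse_content_loss(sr, hr):
    # Pixel-wise MSE content loss; this is the stable-but-blurry
    # replacement for the VGG feature (perceptual) loss.
    return float(np.mean((sr - hr) ** 2))

def adversarial_loss(disc_prob_sr):
    # Generator-side adversarial loss: -log D(G(lr)), where
    # disc_prob_sr is the discriminator's probability that the
    # super-resolved image is a native high-res image.
    return float(-np.log(disc_prob_sr + 1e-12))

def generator_loss(sr, hr, disc_prob_sr, adv_weight=1e-3):
    # Total loss = content term + weighted adversarial term.
    # adv_weight=1e-3 is an assumed weighting, chosen so the
    # adversarial term perturbs rather than dominates the content term.
    return mse_content_loss(sr, hr) + adv_weight * adversarial_loss(disc_prob_sr)
```

Swapping `mse_content_loss` back to a VGG feature loss would recover the original SRGAN behavior (rich but temporally unstable texture); keeping MSE trades that richness for stability while the adversarial term still pushes the output toward a "native high-res" look.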