Quote:
Originally Posted by madshi
Strange thing, never had any trouble with L1 yet. Of course L2 helps achieve better PSNR, which might be beneficial for your paper. L1 usually looks better to my eyes, though. You could try Huber or Charbonnier loss as alternatives. Or you could try something like 10% L2 and 90% L1. Maybe those 10% L2 could help "guiding" L1 with your net?
After a few experiments I found that logcosh loss (defined as "ln(cosh(error))") generally works best for my neural net. It has the following mathematical properties: it behaves roughly like L2 (error²/2) for small errors and roughly like L1 (|error| − ln 2) for large errors,
so it shares similar behavior with Huber loss, but it's a bit better because it's analytic: every order of its derivative exists and is smooth, with no non-differentiable points, whereas Huber loss has them (edit: the Huber loss itself is differentiable, but its derivative is not, so it's not differentiable at higher orders).
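In case anyone wants to try it, here's a minimal sketch of the loss in NumPy (the function name `logcosh_loss` is just my own; note that computing `cosh` directly overflows for large errors, so this uses the identity ln(cosh(x)) = |x| + ln(1 + e^(−2|x|)) − ln(2), which is safe for any magnitude):

```python
import numpy as np

def logcosh_loss(error):
    """Numerically stable ln(cosh(error)), elementwise.

    Direct np.log(np.cosh(x)) overflows for |x| > ~700; this form
    never exponentiates a positive number, so it stays finite.
    """
    a = np.abs(error)
    return a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)
```

For small errors this is ≈ error²/2 (L2-like) and for large errors ≈ |error| − ln 2 (L1-like), which is exactly the Huber-style behavior described above, just without the kink.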