23rd May 2018, 09:12   #67
jonatans
Registered User
 
Join Date: Oct 2017
Posts: 56
Quote:
Originally Posted by benwaggoner
SSIM and especially PSNR are not very good metrics, particularly with advanced new codecs. Or anything that is psychovisually tuned.

VMAF is probably the least-bad objective metric we have, but still has some pretty big limitations and blind spots. It wasn't tested with anything >1080p or below something like 300 Kbps. SDR 8-bit only. A quite weak temporal comparison module. And it was trained on just x264, and so doesn't know what to do with new types of artifacts AV1 and HEVC can have.

Plus there's the whole question of how you aggregate individual frame scores into a clip score. Just the mean of the metrics can't discriminate between content that is consistently mediocre versus oscillating between terrible and pristine.

With a new codec, actually looking at it is really the only thing that can give better than a rough ballpark. Certainly a difference of less than 5 VMAF, 4 PSNR dB, and 3 SSIM dB should be verified visually.
Thanks Ben, those are all very valid points.

I agree that PSNR is not a very good metric for determining visual quality. It is good for determining how close the compressed pictures are to the original pictures in a mathematical sense (after all, that is what PSNR measures, on a sample-by-sample basis). So if two different implementations are tuned to minimize MSE ("tune PSNR"), then PSNR gives a good indication of how good the implementations are at doing just that. And even though this generally doesn't correlate well with visual quality, it is good in the sense that it does not give any "false positives" (i.e. if a compressed picture A is tuned towards PSNR, you cannot easily create another compressed picture B with higher PSNR by making it look worse than A).
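To make the "sample by sample" point concrete, here is a minimal sketch of how PSNR follows from MSE. It is only an illustration, not the code used by xvc or by awcy: the function names, the flat lists of samples, and the default 8-bit peak value are assumptions made for the example.

```python
import math


def mse(original, compressed):
    """Mean squared error between two equally sized lists of samples."""
    if len(original) != len(compressed):
        raise ValueError("pictures must have the same number of samples")
    if not original:
        raise ValueError("pictures must contain at least one sample")
    total = sum((o - c) ** 2 for o, c in zip(original, compressed))
    return total / len(original)


def psnr(original, compressed, bit_depth=8):
    """PSNR in dB; bit_depth=8 (peak value 255) is an assumed default."""
    peak = (1 << bit_depth) - 1
    err = mse(original, compressed)
    if err == 0:
        return math.inf  # identical pictures
    return 10.0 * math.log10(peak * peak / err)


# Hypothetical 4-sample "pictures", for illustration only.
source = [100, 100, 100, 100]
close = [101, 99, 101, 99]   # every sample off by 1
far = [110, 90, 110, 90]     # every sample off by 10

print(psnr(source, close))
print(psnr(source, far))
```

Since PSNR is a monotonic function of MSE, an encoder that minimizes MSE at a given bitrate is also maximizing PSNR, which is why the metric is at least a fair yardstick between two PSNR-tuned encoders.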

We have put some results for the NETVC test conditions and test sequences at awcy.divideon.com, comparing xvc to AV1, both codecs with PSNR tuning (https://awcy.divideon.com/?job=1pass...3A34%3A38.388Z). The results cover PSNR (16.5% bitrate savings), MS-SSIM (23.3% savings) and VMAF (18.5% savings). The gains are quite sequence-dependent, but the overall trend is very clear and quite consistent across the metrics.

There is also a comparison between xvc and HM (https://awcy.divideon.com/?job=1pass...3A34%3A38.388Z), which shows slightly larger bitrate savings.

But, as you correctly point out, what matters in the end is the visual quality, which can only be determined by actually looking at compressed video (encoded under fair conditions when it comes to complexity etc.).

Please share your impressions if you have had a chance to look at any xvc-encoded sequences or have made any visual comparisons.
__________________
Jonatan Samuelsson
Co-founder and CEO at Divideon

www.divideon.com | xvc.io