Quote:
Originally Posted by WorBry
I've been testing the VapourSynth VMAF (r3) plugin with a high quality 1080/50p source (CrowdRun, lossless x264 8-bit 4:2:0 Intra) encoded to x264 over a range of CRF values.
Interesting results....having not tested VMAF before.
Here I encoded the CrowdRun 1080/50p 'master' to x264 over CRF 0 - 30. This was using the default vmaf_v0.6.1.pkl model (i.e. predicting quality on a 1080p HDTV screen at a viewing distance of 3x the screen height). The VMAF, SSIM and MS-SSIM scores are the aggregate values. The 'classic' SSIM tests were run on the Zeranoe ffmpeg win64-static nightly build (20190131).
There is a big difference between the libvmaf SSIM and ffmpeg SSIM scores. Apparently, the libvmaf SSIM implementation "includes an empirical downsampling process, as described at the Suggested Usage section of
https://ece.uwaterloo.ca/~z70wang/research/ssim/", whereas the ffmpeg implementation does not have this step:
https://github.com/Netflix/vmaf/issues/22
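To make that downsampling step concrete, here is a rough Python sketch of the rule given in the Suggested Usage section of the SSIM page: take F = max(1, round(N/256)), with N the smaller image dimension, average over F x F pixels, then downsample by F. The function names are my own, and I am assuming (not verifying) that libvmaf follows this rule exactly. The reference code applies a sliding average filter before decimating; plain block averaging is used here as a simplification.

```python
def ssim_downsample_factor(height, width):
    """Scale factor F = max(1, round(min(height, width) / 256))."""
    # int(x + 0.5) rounds halves up for positive x, avoiding Python's
    # round-half-to-even behaviour in the built-in round().
    return max(1, int(min(height, width) / 256 + 0.5))


def box_downsample(plane, factor):
    """Average non-overlapping factor x factor blocks of a 2-D plane.

    `plane` is a list of rows of numbers (e.g. luma samples). This is a
    simplification of "filter, then decimate"; edge rows/columns that do
    not fill a whole block are dropped.
    """
    if factor == 1:
        return [list(row) for row in plane]
    rows = len(plane) // factor
    cols = len(plane[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                plane[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / (factor * factor))
        out.append(out_row)
    return out


# For a 1080p frame the rule gives F = 4, so SSIM would be computed on
# roughly a 480x270 image instead of the full 1920x1080 one.
factor_1080p = ssim_downsample_factor(1080, 1920)
```

If that is what libvmaf does, fine detail is averaged away before SSIM is measured for a 1080p source, which would go some way to explaining why its scores sit higher than the full-resolution ffmpeg ones.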
As for the VMAF metric itself, well, I can appreciate its value in the context of 'perceptual quality'. In this example it effectively declares the x264 transcodes to be visually lossless from CRF 0 to around CRF 16, whereas the ffmpeg-SSIM scores show a progressive decline over the entire CRF/bitrate range.
And here I ran a parallel series encoded to x265 for comparison.
Clearly VMAF judges x265 to have significantly higher perceptual quality than x264 at the lower bitrate range and more so than revealed by SSIM.
That said, I think 'classic' (ffmpeg) SSIM is still a useful tool for analyzing fine differences at the pixel peeping level and beyond visual acuity, and (by virtue of the differential Y, U and V scores) for determining whether the luma and/or chroma are affected.
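For anyone wanting to pull the separate Y, U and V scores out of ffmpeg's ssim filter output, a minimal Python sketch is below. It parses the summary line the filter prints at the end of a run and converts a score to the dB figure shown in parentheses, which is -10*log10(1 - SSIM). The helper names are mine, and the sample line is an illustrative example of the output format, not a result from the tests above.

```python
import math
import re

# Matches the "Y:0.967481" style fields in ffmpeg's SSIM summary line.
SSIM_FIELD = re.compile(r"(Y|U|V|All):([0-9.]+)")


def parse_ssim_summary(line):
    """Return the per-plane and overall SSIM values from a summary line."""
    return {name: float(value) for name, value in SSIM_FIELD.findall(line)}


def ssim_to_db(ssim):
    """Convert an SSIM value to dB; identical planes map to infinity."""
    if ssim >= 1.0:
        return float("inf")
    return -10.0 * math.log10(1.0 - ssim)


# Illustrative line only (format example, not measured data from this post).
sample = (
    "[Parsed_ssim_0 @ 0x0] SSIM Y:0.967481 (14.878596) "
    "U:0.976744 (16.334840) V:0.979289 (16.838034) All:0.971159 (15.399808)"
)
scores = parse_ssim_summary(sample)
luma_db = ssim_to_db(scores["Y"])
```

Comparing the Y value against U and V across the CRF range is then a quick way to see whether the luma or the chroma is taking the hit.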
I also recorded the libvmaf and ffmpeg PSNR scores, but they are not as interesting.
@HolyWu, btw, thanks for the plugin.