Quote:
Originally Posted by dapperdan
Netflix already use VP9, which is lacking good rate control and psychovisual tuning, but still delivering higher quality (VMAF) at lower bitrates than their other streams (according to their blog posts) and they have said they'd roll out AV1 when it was within 4-10x slower than VP9. Presumably based on getting higher quality that makes that tradeoff worthwhile for them at that point.
VMAF has much better subjective correlation than SSIM, but it is still pretty far from perfectly correlated with double-blind subjective measurements. The risk with any new metric is over-optimizing an encoder for metric scores instead of actual subjective experience. I saw plenty of places where the previous version got things wrong; I've not worked extensively with the latest update, though, which has doubtless improved.
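For anyone who wants to check metric scores themselves: if your ffmpeg build includes libvmaf, you can score a distorted encode against its source with the libvmaf filter. The filenames here are placeholders; a build with `--enable-libvmaf` is assumed.

```shell
# Compare a distorted encode against the pristine reference.
# First input = distorted, second input = reference (libvmaf convention).
ffmpeg -i distorted.mp4 -i reference.mp4 \
  -lavfi libvmaf -f null -
# The aggregate VMAF score is printed to the log when the run finishes.
```

Just remember the point above: a higher VMAF number is evidence, not proof, that one encode actually looks better.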
Netflix is also going from H.264 to VP9/AV1, so they don't need to beat or even meet HEVC quality to get a worthwhile improvement, of course.
Netflix IS using HEVC for all UHD and HDR encoding, AFAIK.
I'm not sure where VP9 versus H.264 quality stands today. I'm running an encoding challenge, and would love to have someone provide best-effort VP9 and even AV1 samples for comparison.
https://forum.doom9.org/showthread.php?t=175776
Quote:
Bitmovin claims their latest encoder release is twice as fast as stock AV1 and 20x slower than VP9 so things don't seem to be that far off AV1 being delivered on a large scale.
That would suggest that stock AV1 is only 40x slower than VP9, which is not my understanding at all. Perhaps they mean the highest-speed mode of their encoder, which would involve some quality degradation. Quality @ Perf is the important thing.
Getting good multithreading into AV1 is going to be really important, since time to market matters for a lot of content. Being able to encode something 8x faster on 16 cores probably doesn't matter for Netflix or YouTube, given their chunked encoding and lack of day-after-broadcast content. But it's critical for other markets, and essential for live encoding. Some of the VP9 and AV1 comparisons with x264 and x265 were artificially limited to 1-2 cores, which makes sense if optimizing for absolute volume of minutes encoded, but understates the speed advantage of x26? for latency-critical tasks.
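To make the threading point concrete: libvpx-vp9 (and similarly libaom) largely sits on one core unless you explicitly enable tiles and row-based multithreading. A sketch via ffmpeg's libvpx-vp9 wrapper, with hypothetical filenames and bitrate:

```shell
# Default settings leave most cores idle; enabling row-mt and tiles
# lets the encoder actually use the machine.
ffmpeg -i input.mp4 -c:v libvpx-vp9 \
  -row-mt 1 \           # row-based multithreading within a tile
  -tile-columns 2 \     # 2^2 = 4 tile columns, more thread parallelism
  -threads 8 \
  -b:v 2M output.webm
```

The flip side of the comparisons mentioned above: tiles cost a little compression efficiency, so single-core benchmarks flatter VP9/AV1 quality while hiding exactly this wall-clock tradeoff.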
Quote:
They also have the extra bonus of being able to encode their in-house stuff without film grain and add it later, a feature they were keen to have added to AV1.
That is a pretty huge feature! It was optional for H.264 (only required in HD DVD decoders, but I don't know of anything authored with it). Random noise is mathematically incompressible, so this kind of noise synthesis is an extremely promising way to improve the quality and compressibility of the most challenging content.
...and will also reveal how much of the apparent detail in film comes from the grain. Older films, particularly, look really soft without the grain, and often just don't have much spatial detail.
But don't underestimate the challenge of the grain-removal part; parameterizing the grain and then reconstructing it on playback are the easy parts. It's way more feasible now than 12 years ago, but it isn't trivial, or something that can run 100% automated without sometimes messing up.
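For reference, libaom's standalone encoder already exposes a one-flag version of this denoise-then-resynthesize loop. A sketch with aomenc, assuming a y4m source; the noise level value here is an illustrative guess, not a recommendation:

```shell
# --denoise-noise-level asks the encoder to estimate and remove grain
# before coding, then signal film-grain-synthesis parameters so the
# decoder re-adds matching grain at playback.
aomenc --cpu-used=4 \
  --denoise-noise-level=15 \
  -o output.ivf input.y4m
```

This is the fully automated path, which is exactly where the "messing up sometimes" risk lives; studio pipelines would presumably want a human checking the denoised result on difficult content.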
Unfortunately production workflows put in film grain much earlier than the encoding stage, so it's already baked in way before it gets to an encoder.