Doom9's Forum > Video Encoding > New and alternative video codecs

Old 29th June 2019, 16:21   #1761  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 123
I noticed that several talks mentioned rav1e, but none directly covered it.

Did I miss a video, or did the Mozilla/Xiph rav1e guys not get a talk at BAV?
Old 30th June 2019, 12:23   #1762  |  Link
birdie
.
 
Join Date: Dec 2006
Posts: 144
Twitch's AV1 deployment roadmap (from Big Apple Video 2019)

Old 1st July 2019, 18:33   #1763  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 66
Quote:
Originally Posted by soresu View Post
I noticed that several talks mentioned rav1e, but none directly covered it.

Did I miss a video, or did the Mozilla/Xiph rav1e guys not get a talk at BAV?
I believe that because the conference was organized by Vimeo/Mozilla, who just announced a partnership around rav1e, they wanted to prevent a potential conflict of interest in talk selection and decided to not give a talk on it.
Old 1st July 2019, 18:41   #1764  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 66
Quote:
Originally Posted by soresu View Post
Having watched some (but not all) of the BAV presentations - I know that AV1 is currently not ideal for running a battery of tests at short notice, but did they really need to use such outdated versions of competing codecs?

I'm pretty sure that the x265 build was from January, and the libaom build from February in one of them.

Maybe I'm missing something and those builds were picked for stability?
The builds used were:
  • rav1e c68d68c6fa80dabf5e4ed9b379f090572eb43d96 (Mon Jun 3 2019)
  • libaom a385cc44e15833f56de45bbbc1cc6c474751ac9f (Wed Apr 24 2019)
  • x264 5493be84cdccecee613236c31b1e3227681ce428 (Thu Mar 14 2019)
  • x265 12522:10decf67c077 (Fri Jun 07 2019)
  • SVT-AV1 6fd564611bdb48a2a6d2c7b90a91b4b1bdbe74b9 (Mon Jun 10 2019)
  • libvpx f836d8ba87dcba437228580fe65afe151ccf7659 (Thu Apr 25 2019)

So basically - ignoring x264 for a second (which is pretty mature/stable) - some from late April and some from early June, none from January or February.
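
For anyone who wants to double-check the dates, here is a minimal Python sketch. The dictionary just restates the list above; the helper name `builds_before` is made up for this illustration.

```python
from datetime import date

# Build dates copied from the list above (encoder -> commit date).
BUILD_DATES = {
    "rav1e": date(2019, 6, 3),
    "libaom": date(2019, 4, 24),
    "x264": date(2019, 3, 14),
    "x265": date(2019, 6, 7),
    "SVT-AV1": date(2019, 6, 10),
    "libvpx": date(2019, 4, 25),
}

def builds_before(cutoff, builds=BUILD_DATES):
    """Return the encoders whose build predates the cutoff, sorted by name."""
    return sorted(name for name, d in builds.items() if d < cutoff)

# Nothing predates March 2019, i.e. no January or February builds.
print(builds_before(date(2019, 3, 1)))   # []

# Leaving x264 aside, the oldest build is libaom from late April.
oldest = min((d, n) for n, d in BUILD_DATES.items() if n != "x264")
print(oldest[1], oldest[0].isoformat())  # libaom 2019-04-24
```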
Old 1st July 2019, 20:30   #1765  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 123
Ah, I must have misread the presentation then; it's easier for me to read from slides than from video. Still, the position of rav1e seems odd: on the graphs it is still hovering around x264. I could have sworn it passed x264 months ago, and nothing has been said since, despite all the commits being merged into it.

Is it still suffering from that regression from a while ago?

"I believe that because the conference was organized by Vimeo/Mozilla, who just announced a partnership around rav1e, they wanted to prevent a potential conflict of interest in talk selection and decided to not give a talk on it."

Yes that makes sense, just seemed a little odd, like going to WWDC and getting nothing from Apple - still they got a lot of mentions from the presenters in any case.

Last edited by soresu; 1st July 2019 at 20:40.
Old 1st July 2019, 22:55   #1766  |  Link
TD-Linux
Registered User
 
Join Date: Aug 2015
Posts: 34
The rav1e stats are correct. We still fall behind on high bitrate VMAF, most likely because it is more sensitive to activity masking (AQ), which s_p is still working on.

The multithreading performance is limited by a serialization point of the loop filters between frames (by far the slowest part of rav1e right now). There are some outstanding PRs to make it better, e.g. https://github.com/xiph/rav1e/pull/1396
Old 1st July 2019, 23:02   #1767  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 190
The Visionular talk from Zoe Liu used builds from Jan and Feb.

I think the x265 release used was the last stable release, so that doesn't seem too crazy. You could easily cry foul if someone used a non-stable git commit and it performed worse than expected due to hitting a bug.

It's also worth bearing in mind that it was basically the same talk as the one given at the Agora.io thing. Some people tour these things around for a while, so it's not ridiculous for them to reuse slides and not have something fresh for every talk they give. Fairly certain I'd seen the NGCodec talk slides before too.
Old 2nd July 2019, 18:53   #1768  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by Beelzebubu View Post
The builds used:
  • rav1e c68d68c6fa80dabf5e4ed9b379f090572eb43d96 (Mon Jun 3 2019)
  • libaom a385cc44e15833f56de45bbbc1cc6c474751ac9f (Wed Apr 24 2019)
  • x264 5493be84cdccecee613236c31b1e3227681ce428 (Thu Mar 14 2019)
  • x265 12522:10decf67c077 (Fri Jun 07 2019)
  • SVT-AV1 6fd564611bdb48a2a6d2c7b90a91b4b1bdbe74b9 (Mon Jun 10 2019)
  • libvpx f836d8ba87dcba437228580fe65afe151ccf7659 (Thu Apr 25 2019)
Are the actual command line parameters used documented somewhere?
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 19:11   #1769  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by mandarinka View Post
I wonder how much VMAF really says about visual quality and compression efficiency while keeping detail (as opposed to the usual issue with metrics, the "blur more for maximum PSNR/SSIM" effect), seeing how in those slides, *everything* except rav1e and x264 is shown as matching or outdoing x265. Well, I guess there's already the usual assertion/claim that x265 = libvpx-vp9 that raises questions. I always stop wondering at that point in these presentations...
VMAF is the least-bad objective metric we've ever had, but it's still far from perfect.

Also, VMAF isn't static. Netflix comes out with new ML models that will give different (and more accurate) scores compared to older models. It can be estimated for mobile, 1080p, or UHD devices. The scores vary based on the resolution of comparison (720p tested at 720p will deliver higher scores than 720p tested at 1080p, compared to 1080p encoding). And it is a per-frame metric, and how to aggregate per-frame scores into an overall clip quality is an unanswered question. Using a harmonic mean helps, but even that is probably only useful for <20 second durations. A single VMAF score for a whole movie or episode could indicate highly variable quality or highly consistent quality.
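
To make the aggregation point concrete, here is a rough Python sketch. The per-frame numbers are invented for illustration and are not measurements, and the +1 offset in the harmonic mean is an assumption of this sketch to keep a score of 0 from dividing by zero. Two clips share the same arithmetic mean, yet one of them has a stretch of very poor frames.

```python
from statistics import mean, pstdev

def harmonic_mean(scores):
    """Harmonic mean with a +1 offset so a score of 0 cannot divide by zero."""
    return len(scores) / sum(1.0 / (s + 1.0) for s in scores) - 1.0

# Hypothetical per-frame scores: two clips with the same arithmetic mean.
steady = [80.0] * 10
uneven = [95.0] * 8 + [20.0] * 2

for name, scores in (("steady", steady), ("uneven", uneven)):
    print(
        name,
        "mean=%.1f" % mean(scores),
        "harmonic=%.1f" % harmonic_mean(scores),
        "stdev=%.1f" % pstdev(scores),
    )
```

Both clips average 80, but the harmonic mean of the uneven clip drops to about 55 because the bad frames weigh more heavily. Over a long title even that pooled number hides where the bad stretches are.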

None of this is a diss on VMAF. Netflix did what they set out to do well, put a huge amount of effort into it, and made reasonable design decisions. But like all metrics, it measures what it is designed to measure, not what we wish it measured.

I've seen VMAF do a poor job of detecting:
  • Banding in gradients
  • Detail in low luma
  • Adaptive quantization improvements
  • Artifacts in encoders/formats that weren't included in the VMAF training set
  • Differences between two pretty high quality encodes.

And it doesn't do HDR at all. It used to not do UHD, but does now.

Another problem with a popular metric is that developers start tuning for that metric instead of what the metric is supposed to measure (subjective quality in this case). When developers start tuning for metrics over eyes, the correlation of that metric to subjective quality actually gets WORSE. So, for an encoder like libaom that got tuning based on VMAF ratings, we'd expect that its VMAF scores would be higher relative to actual subjective quality than for encoders that weren't tuned that way. But it'll be better than ones tuned for PSNR, like the vp? series.

Not that tuning for VMAF is a bad strategy. But it does result in less meaningful VMAF scores.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 19:17   #1770  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by dapperdan View Post
My theory is that the people hired for the subjective tests that underlie the objective stats, or that vote VP9 as very slightly better than x265 in the MSU tests on subjectify.us, have a different notion of quality than the kind of person who is interested in codecs for their own sake.
These are double-blind tests; the people doing it just compare two encodes.

That said, I've not seen any study demonstrating better subjective quality from a well-tuned libvpx encode versus a well-tuned x265 encode, using the same bitrate @ time.

Quote:
Like, I read a paper recently where someone was applying their grain synthesis approach to HEVC and the subjective tests they did to prove it worked showed they could get basically all the subjective benefit by just doing the noise removal step and not bothering to add the grain back in, something that could be done by any encoder, for any codec (and I'm guessing this makes up part of the secret sauce of some encoders).
This opens up the interesting question of no-reference quality versus creative intent. Someone just looking at a clip without grain might think it looks great. But if the creators meant there to be grain, then the output isn't accurate. That's something that some studies might not rate. And if customers dislike grain, they might rate the encoded version higher than the source!

Quote:
But I guess someone who said they could get a massive increase in subjective quality via the Psy optimisation of basically blurring the input would get some pushback on that view in some quarters, even with subjective tests to back it up.
Well, that is what adaptive quantization is all about, really. Put the artifacts where they are less painful, and use the saved bits where they'll provide the most visible improvements.
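
As a toy illustration of that idea, here is a simplified variance-based sketch in Python. It is not the formula of any particular encoder; the function name `qp_offsets`, the `strength` parameter and the sample values are all hypothetical. Flat blocks, where artifacts such as banding show easily, get a negative QP offset (finer quantization), and busy blocks, where texture hides artifacts, get a positive one.

```python
import math
from statistics import mean, pvariance

def qp_offsets(blocks, strength=1.0):
    """Map each block of luma samples to a QP offset from its log-variance."""
    # Log of the variance (+1 to avoid log of zero) as a crude activity measure.
    energy = [math.log2(pvariance(b) + 1.0) for b in blocks]
    avg = mean(energy)
    # Offsets are centred on the average activity, so they sum to zero.
    return [strength * (e - avg) for e in energy]

# Hypothetical 4-sample "blocks": a flat gradient patch and a busy texture patch.
flat = [100, 101, 100, 101]
busy = [20, 200, 60, 240]

offsets = qp_offsets([flat, busy])
print([round(o, 2) for o in offsets])
```

The flat patch ends up with the negative offset and the busy patch with the positive one, so bits move from where errors are masked to where they would be visible.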

There are similar debates about TV's default "vivid" mode. Some people claim to like it, even though what's displayed is manifestly wrong on many axes.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 19:17   #1771  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by Blue_MiSfit View Post
Great talk from Ronald. If only I had the time to do an evaluation of Eve_AV1
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 19:19   #1772  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by dapperdan View Post
I think the x265 release used was the last stable release, so that doesn't seem too crazy. You could easily cry foul if someone used a non-stable git commit and it performed worse than expected due to hitting a bug.
And there weren't any substantial quality improvements in x265 between the Jan 2019 builds and the June 3.1 release.

How the encoders got tuned is what matters. And if quality is being compared at fixed encoding time, performance improvements become quality improvements.
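
A small Python sketch of why that is, using made-up numbers (the presets, timings and quality scores below are hypothetical, not results from any encoder). With a fixed time budget you pick the best-quality preset that still fits, so a build that simply runs faster lets you afford a slower preset.

```python
def best_within_budget(results, budget_s):
    """Return (preset, quality) for the highest-quality preset that fits the time budget."""
    fitting = [(q, p) for p, (t, q) in results.items() if t <= budget_s]
    if not fitting:
        return None
    q, p = max(fitting)
    return p, q

# Hypothetical numbers: preset -> (encode seconds, quality score).
old_build = {"fast": (50, 90.0), "medium": (100, 92.0), "slow": (200, 94.0)}
# Same quality per preset, but a build that runs twice as fast.
new_build = {p: (t / 2, q) for p, (t, q) in old_build.items()}

print(best_within_budget(old_build, 100))  # ('medium', 92.0)
print(best_within_budget(new_build, 100))  # ('slow', 94.0)
```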
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 20:31   #1773  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 190
Quote:
Originally Posted by benwaggoner View Post
These are double-blind tests; the people doing it just compare two encodes.
My point still stands for tests that intend to be double-blind since some people's abilities and/or preferences would effectively unblind the test.

Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.

On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.

I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.

I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.
Old 2nd July 2019, 20:54   #1774  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 190
Quote:
Originally Posted by benwaggoner View Post
That said, I've not seen any study demonstrating better subjective quality from a well-tuned libvpx encode versus a well-tuned x265 encode, using the same bitrate @ time.
Have you read the full version of the last MSU subjective comparison? I've only read the free snippet, which doesn't have enough info to say either way, but it's possible that fits the criteria or is at least in the right ballpark, potentially a statistical tie:

http://www.compression.ru/video/code...jective_report

On the other hand, similar to how complaints about electric cars are now "I don't like the minimalism of their touchscreen interfaces" when not too long ago you'd hear how they were physical impossibilities, I think the fact that we're now at this level of complaint for the previous generation of royalty-free codecs is a testament to how far we've come.
Old 2nd July 2019, 22:00   #1775  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 123
Quote:
Originally Posted by benwaggoner View Post
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.
Old 2nd July 2019, 22:39   #1776  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,575
Re: Eve evaluation, I've never tried, TBH. I've been wanting to spend more time looking at Beamr 5x (fantastic so far!) but have been quite busy.
Old 3rd July 2019, 02:30   #1777  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by dapperdan View Post
Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.
Yep, and then we're back into "Zen and the Art of Motorcycle Maintenance" style philosophical ruminations on the nature and meaning of "quality." Which is unavoidable at a certain point, which is why we try to test for something more specific than just quality. Accuracy to a source and creative intent can be quite different from a no-reference "is this clip pleasing" or "do you see anything wrong with this clip?"

Quote:
On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.
Exactly. And it's not a particularly hard thing to synthesize. It's not like the film grain in Marvel movies is ACTUALLY grain-from-film. It's digitally synthesized. Grain helps make blending in VFX a lot easier, and allows for rendering at 2K instead of 4K.

Quote:
I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.
It depends on what the question they were asking was intended to be. But yeah, having a bunch of video engineers look at something is very different than having the general public look at something, and can provide different (but both useful!) answers. Video engineers are going to pick up on more subtle things, and are going to care about accuracy and creative intent more.

Generally I'll have video experts do an initial pass on something to see "is there something that can be seen here?" and then use double-blind testing with a more general population to confirm details. The second is a LOT slower and more expensive than the first, of course.

Quote:
I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.
Oh, no doubt. I've done that party trick **many** times. x264 versus WMV3 versus VC-1 versus Main Concept versus VP9; it's generally pretty obvious if you've been in the field for a while.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 3rd July 2019, 14:45   #1778  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 66
Quote:
Originally Posted by benwaggoner View Post
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
Have you asked?
Old 3rd July 2019, 23:55   #1779  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
Quote:
Originally Posted by soresu View Post
Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.
To be clear, I speak only for myself, not Amazon, on these forums. I actually tinker with video stuff on my off hours too. I should probably get out more.

Anyway, I requested an optimal Eve encoding for my encoding challenge, but they declined to participate.

It is common for encoder vendors who think they are doing some magic things in the bitstream to want to have the bitstream output under NDA and such. I get the impulse, but it just isn't practical for doing actual comparisons or due diligence evaluation.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book

Last edited by benwaggoner; 3rd July 2019 at 23:59. Reason: More details about Eve offer
Old 4th July 2019, 09:12   #1780  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 190
When you say they "declined to participate" did they respond and say they didn't want to take part or did you just not hear from them after making a broad request in a forum post?

I believe the comment above yours (saying "Have you asked?") was written by a developer of EVE, which suggests they didn't know they'd been asked, so possibly an email got lost in a spam trap.