Doom9's Forum > Video Encoding > VP9 and AV1
Old 8th December 2018, 11:13   #1261  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,340
He can ramble about hardware influence all he wants, but if you don't get the hardware people onboard, your codec is DOA anyway.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 8th December 2018 at 11:16.
Old 8th December 2018, 15:34   #1262  |  Link
utack
Registered User
 
Join Date: Apr 2018
Posts: 63
Quote:
Originally Posted by mandarinka View Post
I can't tell how correct it is, but this was an interesting read: https://codecs.multimedia.cx/2018/12...cal-about-av1/
Author is a former libav/ffmpeg developer if you don't remember his name.
That the daala people who provided most of the new ideas started their own rav1e encoder from scratch supports this blog post.
I also don't get why hardware seems to play such a big role. VP9 was not made with hardware in mind, and Qualcomm, Samsung, Nvidia, AMD and Intel, as well as some random Chinese SoC vendors, still managed to make hardware decoders. So it seems hardware designers can't be scared off that easily?
Old 8th December 2018, 17:11   #1263  |  Link
Mjpeg
Registered User
 
Join Date: Jun 2018
Posts: 7
Quote:
Originally Posted by nevcairiel View Post
He can ramble about hardware influence all he wants, but if you don't get the hardware people onboard, your codec is DOA anyway.
I agree. I'm just a lurker, but the HEVC licensing debacle opened up a window for a few years, so AV1 needed to jump in quickly to have a chance, which leads to the not-so-radical design that annoys him. I think what he misses is that if AV1 can succeed, then we'll get AV2, AV3, etc.

It's super clear to me that hardware is crucial because of playback. A codec cannot succeed if "OMG Youtube Is Killing My Battery" is a big reddit thread!
Old 8th December 2018, 17:39   #1264  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
Quote:
Originally Posted by utack View Post
That the daala people who provided most of the new ideas started their own rav1e encoder from scratch supports this blog post.
rav1e was started to provide a minimal and fast encoder as early as possible; it was not some kind of statement.
There are even some slides, in some xiph.org user folder, about why working on the existing libvpx-then-branched-libaom codebase made things hard. I'll edit if I can find them again.
It mainly had to do with libaom being big, old and full of experiments anyway.
---
Alright found a couple right off the bat:
https://people.xiph.org/~tdaede/rav1e_vdd_2017.pdf
Code:
Started as a reimplementation of AV1 in order to find bitstream and specification bugs
● Could do an encoder or decoder:
  – Decoder (especially fuzzed) more useful to find mismatches
  – Encoder doesn’t need all features implemented to work
● Algorithmic improvements over libaom
https://people.xiph.org/~tdaede/rav1e_vdd_2018.pdf
Code:
Background on libaom:
Derived from libvpx codebase
● Reference implementation, “sort of usable”
● Much encoder behavior is inherited from previous VPx codecs
  – multiple frame passes
  – weird rate control
As for hardware, we can speculate how Google had to drag some vendors on board with who knows what promises or deals as a last ditch effort to promote VP9. It was not designed with hardware in mind, hardware support came very late, and whoa look at the amount of people willing to make VP9 encodes around outside of Google/Youtube! /s

One would think they learnt from that experience. Again, we can only speculate, but that would provide a good explanation for the veto power of hardware vendors this time around.

Quote:
Originally Posted by nevcairiel View Post
That may be true, but you can make the same argument for a lot of things. That alone does not necessarily justify a feature that's generally rather annoying, and generally considered a remnant of the past.
But it's a matter of fact that we're stuck with it. Considering it's not that hard to understand how to use it from an encoder (the person, not the software) POV and the low overhead, discouraging its use will only have people creating streams that can't be reproduced on a very large amount of systems. Next thing you know, plenty of people will be screaming bloody murder because their 16 core system stutters on a 1080p clip with mild to heavy motion and the CPU is still underused.

What are the modern alternatives to tiling? I remember reading somewhere WPP is patented, so what would be the next best alternative?

Last edited by SmilingWolf; 9th December 2018 at 10:17.
Old 9th December 2018, 11:37   #1265  |  Link
alex1399
Registered User
 
Join Date: Jun 2018
Posts: 56
Oh no, YouTube couples high-definition resolutions with 60fps in almost every video. It uses a trivial motion interpolation that simply duplicates the previous frame to convert the native frame rate to 60fps. What a jerk. That's why playback works some low-end PCs with high-speed Internet so hard.
Old 9th December 2018, 22:23   #1266  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
It seems that at some point YouTube actually implemented AV1 "in the wild". Much like how VP9 was rolled out, it seems that the larger the view count then the more likely you'll get an AV1 encode...but not entirely - this is most obvious on Linus Tech Tips where this LTT video with ~1.6M views has AV1 but this other LTT video posted just 1 day before with ~2.8M views does not...

Unless YouTube flipped an AV1 switch sometime specifically on December 3rd only for newly-posted videos, making the first-linked video qualify while the second-linked video did not?




Anyway, I just finished a bunch of CPU decoding performance testing on one of SmilingWolf's personal videos that he encoded in AV1. While the absolute performance numbers are relatively meaningless, since they largely depend on things like bitrate, framerate, and resolution, the relative decoding performance numbers should still be of interest.

The way I measured included (but was not limited to) having a video clip play at 17fps and then seeing what the lowest clockrate required was for a given CPU architecture and thread configuration, with a 45nm Core 2 Duo @ 3.5GHz as the baseline. I also tested within a single architecture for performance scaling at various clockrates, and save for Wolfdale (possibly due to the lack of an integrated memory controller), the performance scaling was for all intents and purposes identical between Nehalem and Haswell (that is, the percentage of extra clockrate necessary to play back 24fps vs 17fps was darned-near exactly the same).

All of this was tested with MPC-HC 1.8.3 x64 (LAVfilters 0.73, libaom) and only on CPUs that support SSE 4.1, as CPUs lacking this instruction set, such as the Phenom II and the 65nm Conroe-based Core 2 Duo, would have needed a 10+ GHz overclock (I'm not kidding), even though the latter supports SSSE3 and not just SSE3.

And to clarify, the percentage below is simply how much faster a given CPU should be if it had the same clockrate as the baseline 2c/2t Wolfdale (which itself was clocked at 3.5GHz).
  • 100% - Wolfdale 2c/2t
  • 119% - Nehalem 2c/2t
  • 130% - Wolfdale 4c/4t
  • 138% - Nehalem 2c/4t
  • 167% - Haswell 2c/2t
  • 175% - Nehalem 4c/4t
  • 175% - Nehalem 4c/8t (not a typo)

And from some of the other tests I did, I was able to extrapolate the performance of Haswell CPUs configured at 2c/4t, 4c/4t, and 4c/8t (again, relative to a 2c/2t Wolfdale) as I do not have access to Haswell CPUs with thread configurations greater than 2c/2t:
  • 199% - Haswell 2c/4t
  • 241% - Haswell 4c/4t
  • 241% - Haswell 4c/8t (not a typo)
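
The normalization behind these percentages can be sketched roughly as follows. This is only an illustration of the arithmetic as described in the post, assuming the 2c/2t Wolfdale baseline needed its full 3.5GHz for the same clip. The function name and the example clocks are made up for demonstration and are not the measured figures.

```python
BASELINE_CLOCK_GHZ = 3.5  # 2c/2t Wolfdale baseline clock quoted in the post


def relative_per_clock_percent(min_clock_ghz, baseline_clock_ghz=BASELINE_CLOCK_GHZ):
    """Per-clock performance relative to the baseline, in percent.

    min_clock_ghz is the lowest clock at which the test clip still plays
    smoothly on the CPU under test. A CPU that needs only half the baseline
    clock does twice the work per clock, hence 200%.
    """
    if min_clock_ghz <= 0:
        raise ValueError("clock must be positive")
    return 100.0 * baseline_clock_ghz / min_clock_ghz


# Hypothetical clocks, chosen only to show the arithmetic.
print(round(relative_per_clock_percent(3.5)))   # 100
print(round(relative_per_clock_percent(1.75)))  # 200
```

Read the other way around, a 167% entry would correspond to a minimum clock of roughly 3.5 / 1.67 GHz under the same assumption.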


For those that don't know their CPU architectures...
  • Wolfdale = second generation desktop Core 2 Duo/Quad, 45nm die-shrink
  • Nehalem = 45nm, first generation of Intel CPUs that use the Core i5/i7 branding (i3 didn't come along until the 32nm Westmere die-shrink...which was still considered "1st gen" oddly enough)
  • Haswell = 22nm, fourth generation of Intel CPUs that use the Core i3/i5/i7 branding
__________________
____HTPC____  | __Desktop PC__
2.93GHz Xeon x3470 (4c/8t Nehalem) | 4.5GHz 1.24v dual-core Haswell G3258
Radeon HD5870  | Intel iGPU      
2x2GB+2x1GB DDR3-1333 | 4x4GB DDR3-1600       

Last edited by Nintendo Maniac 64; 3rd January 2020 at 23:47.
Old 10th December 2018, 00:18   #1267  |  Link
Zebulon84
Registered User
 
Join Date: Apr 2015
Posts: 21
Quote:
Originally Posted by Nintendo Maniac 64 View Post
It seems that at some point YouTube actually implemented AV1 "in the wild". Much like how VP9 was rolled out, it seems that the larger the view count then the more likely you'll get an AV1 encode...but not entirely - this is most obvious on Linus Tech Tips where this LTT video with ~1.6M views has AV1 but this other LTT video posted just 1 day before with ~2.8M views does not...

Unless YouTube flipped an AV1 switch sometime specifically on December 3rd only for newly-posted videos, making the first-linked video qualify while the second-linked video did not?
The one with more views without AV1 is also longer (18 min vs 11 min), so it may be still encoding, or length is taken into account when choosing which video is worth spending hours of encoding time for a few AV1 views.

Thanks for the benchmarks, it proves it's possible to do software decoding of AV1 on desktop, but it's quite heavy, and will be hard on laptop battery and probably too much for any smartphones. Do you plan to add ...lake, Ryzen or dav1d ?
Old 10th December 2018, 01:03   #1268  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
Quote:
Originally Posted by Zebulon84 View Post
it proves it's possible to do software decoding of AV1 on desktop, but it's quite heavy, and will be hard on laptop battery and probably too much for any smartphones
Remember that this depends heavily on video resolution, framerate, and bitrate.

In particular, I had manually slowed the test video down to 17fps because that was the highest framerate my Nehalem x3470 could handle when configured as 2c/2t with turbo disabled (2.93GHz), which is why I did not provide any absolute performance numbers and focused on relative performance instead.



Quote:
Originally Posted by Zebulon84 View Post
Do you plan to add ...lake, Ryzen or dav1d ?
I don't have any such CPUs, and I've no idea how to benchmark dav1d since coding and such is totally not my specialty (my expertise is much more in hardware).

However, Zen-based CPUs should have per-GHz performance similar to Haswell while Sky/Kaby/Coffee lake will have slightly better per-GHz performance than Haswell, so for the most part you can just use the Haswell relative performance numbers (not the clockrate!) as a reference for those architectures.
Old 11th December 2018, 17:11   #1269  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,120
dav1d v0.1 has been released: http://www.jbkempf.com/blog/post/201...lease-of-dav1d
Old 11th December 2018, 17:52   #1270  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Quote:
And, we've been experimenting with shaders, notably for the Film Grain feature.
shaders = GPU?
Old 11th December 2018, 17:52   #1271  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,959
Quote:
Originally Posted by hajj_3 View Post
Good news.
I will wait for the ffmpeg build with both the libaom and libdav1d libraries. I want to compare their speed under the same conditions.
Old 11th December 2018, 20:07   #1272  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
64bits, GCC 8.2:
ffmpeg 4.2-92673-g876ed08b0d: https://mega.nz/#!QxpinIyQ!HBtUEzFOb...NxABGMlPcvgSWA
- libaom 1.0.0-1024-g5b8f393fe
- libdav1d 0.1.0 c0501f1
Old 11th December 2018, 20:48   #1273  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Thx. Seems dav1d is still easily 40% slower on an i5-2500K (AVX but no AVX2) compared to libaom. Only ~50% CPU utilization on both.
Old 11th December 2018, 21:14   #1274  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,340
SSE* code is still actively being worked on, and is actively coming in right now, so it's getting faster day by day on those systems. AVX1 doesn't help a codec like this much, since AVX instructions are primarily floating point, and only AVX2 adds the required integer instructions.
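
To make the AVX vs AVX2 point concrete, here is a minimal sketch of how one might pick the widest integer SIMD tier from a CPU's flag list. It is not dav1d's or libaom's actual dispatch logic. The function name is hypothetical, and the flag names are the ones Linux prints on the "flags" line of /proc/cpuinfo.

```python
def best_integer_simd(flags):
    """Return the widest integer SIMD tier implied by a set of CPU flags.

    flags: whitespace-separated flag names, e.g. "ssse3 sse4_1 avx avx2".
    """
    have = set(flags.lower().split())
    if "avx2" in have:
        return "avx2"  # 256-bit integer operations
    # Plain AVX only widens floating-point operations, so it is skipped here
    # and the search falls back to the 128-bit SSE tiers.
    for tier in ("sse4_1", "ssse3", "sse2"):
        if tier in have:
            return tier
    return "scalar"


# Abbreviated, illustrative flag strings (not copied from real machines).
print(best_integer_simd("sse2 ssse3 sse4_1 sse4_2 avx"))       # sse4_1
print(best_integer_simd("sse2 ssse3 sse4_1 sse4_2 avx avx2"))  # avx2
```

Under this reading, an AVX-only CPU such as the i5-2500K mentioned above lands on the same tier as a CPU with no AVX at all, which is why the SSE code paths matter for it.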
Old 11th December 2018, 21:25   #1275  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
To follow the progress of SSSE3 implementation: https://code.videolan.org/videolan/dav1d/issues/216
Same thing for NEON: https://code.videolan.org/videolan/dav1d/issues/215

An article on dav1d 0.1.0 by the same guy who's been doing most of the benchmarks that appeared in the official blogposts: https://medium.com/@ewoutterhoeven/d...s-5404360e44e3

Last edited by SmilingWolf; 11th December 2018 at 21:31.
Old 12th December 2018, 04:03   #1276  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,959
Quote:
Originally Posted by nevcairiel View Post
SSE* code is still actively being worked on, and is actively coming in right now, so it's getting faster day by day on those systems. AVX1 doesn't help a codec like this much, since AVX instructions are primarily floating point, and only AVX2 adds the required integer instructions.
They claim that dav1d is always faster than libaom, and that there are problems only in single-threaded mode. That claim does not hold up.

Quote:
On modern desktop, dav1d is very fast, compared to other decoders:
Pentium G5600 not modern?
Quote:
But, since the previous blogpost, we've added more assembly for desktop, and we've merged some assembly for ARMv8, and for older machines (SSSE3).

We're now as fast as libaom, in single-thread, on ARMv8, and faster with more threads.
My tests:
Code:
ffmpeg -t 10 -c:v libaom-av1 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -t 10 -c:v libdav1d -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -t 10 -c:v libdav1d -threads 4 -tilethreads 4 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
Result:
Code:
libaom-av1 - 14 fps
libdav1d - max 7.1 fps
libdav1d -threads 4 -tilethreads 4 - max 9.6 fps
I got the exact same result a month ago.
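
For anyone who wants to repeat this kind of comparison, a small helper along these lines could build the same command lines and pull the fps figure out of ffmpeg's status output. It is a sketch under assumptions: the helper names and `clip.webm` are hypothetical, and the sample status line is illustrative rather than captured from a real run.

```python
import re

# Matches the "fps=" field of an ffmpeg status line.
FPS_RE = re.compile(r"fps=\s*([0-9.]+)")


def last_fps(ffmpeg_log):
    """Return the fps value from the last status line in the log, or None."""
    matches = FPS_RE.findall(ffmpeg_log)
    return float(matches[-1]) if matches else None


def decode_command(decoder, source, extra=()):
    """Build the argument list for one decode-only benchmark run."""
    return ["ffmpeg", "-t", "10", "-c:v", decoder, *extra,
            "-i", source, "-benchmark", "-f", "null", "-"]


# Illustrative status line, not captured from a real run.
sample = "frame=  140 fps= 14 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=0.56x"
print(last_fps(sample))  # 14.0
print(decode_command("libdav1d", "clip.webm", ("-threads", "4", "-tilethreads", "4")))
```

ffmpeg writes its status lines to stderr, so a real run would pass the command to something like `subprocess.run(cmd, capture_output=True, text=True)` and feed the captured `stderr` to `last_fps`.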
Old 12th December 2018, 04:45   #1277  |  Link
Wolfberry
Helenium(Easter)
 
Join Date: Aug 2017
Location: Hsinchu, Taiwan
Posts: 99
This commit may help.

Not sure if it is CLI only or can be used in ffmpeg.
__________________
Monochrome Anomaly
Old 12th December 2018, 05:04   #1278  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
Quote:
Originally Posted by v0lt View Post
Pentium G5600 not modern?
Unfortunately, many people do not realize that Intel Pentiums do not support AVX at all.

At least going forward there's now an AMD alternative in the form of the Athlon 200GE which does support AVX2 (in addition to having a better iGPU and actual sane prices in lieu of Intel's 14nm shortage), but that processor was only just released a couple months ago.
Old 12th December 2018, 05:48   #1279  |  Link
MoSal
Registered User
 
Join Date: Jun 2013
Posts: 95
Quote:
Originally Posted by v0lt View Post
Code:
libaom-av1 - 14 fps
libdav1d - max 7.1 fps
libdav1d -threads 4 -tilethreads 4 - max 9.6 fps
I got the exact same result a month ago.
Can you try -threads 8 -tilethreads 1?
__________________
https://github.com/MoSal
Old 12th December 2018, 10:55   #1280  |  Link
marcomsousa
Registered User
 
Join Date: Jul 2018
Posts: 80
Quote:
ffmpeg-4.2-92681-0e833f6
- libaom 1.0.0-1028-78e6b2c
- libdav1d 0.1.0 73067e5

ffmpeg -t 10 -c:v libaom-av1 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -t 10 -c:v libdav1d -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -t 10 -c:v libdav1d -threads 4 -tilethreads 4 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
Result:
Code:
libaom-av1 - 21 fps 0.780x speed
libdav1d - 41 fps 1.65x speed
libdav1d -threads 4 -tilethreads 4 - 58 fps 2.31x speed
CPU: Intel i7 8550U (MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, EM64T, AES, AVX, AVX2, FMA3)
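
Putting the two sets of posted numbers side by side, the relative speed of dav1d against libaom can be worked out with a few lines. The fps figures are the ones posted in this thread. The helper name and the dictionary layout are just for illustration.

```python
def speedup(candidate_fps, reference_fps):
    """How many times faster the candidate decoded than the reference."""
    return candidate_fps / reference_fps


# fps figures as posted in this thread:
# (libaom, dav1d default, dav1d with -threads 4 -tilethreads 4)
runs = {
    "v0lt": (14.0, 7.1, 9.6),
    "marcomsousa": (21.0, 41.0, 58.0),
}

for label, (aom, dav1d, dav1d_mt) in runs.items():
    print(f"{label}: dav1d {speedup(dav1d, aom):.2f}x, "
          f"dav1d 4 threads {speedup(dav1d_mt, aom):.2f}x of libaom")
```

On the AVX2-capable i7 8550U dav1d comes out at roughly twice libaom's speed or better, while in v0lt's run it reaches only about half to two thirds of it, which fits the earlier remark that the SSE code paths were still being written.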
__________________
AV1 win64 VS2019 builds
Last build here