Old 23rd December 2018, 01:26   #1341  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
GCC 8.2.1 20181221, static build
ffmpeg-4.2-92779-g8b53d1322f: https://mega.nz/#!5kJxjA7I!dHKMhcYcj...yylOEHTDtESTYM
- libaom-av1 1.0.0-1103-g9a48f9ca5
- libdav1d 0.1.0-38-g1703f21
Old 23rd December 2018, 13:22   #1342  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
My AVIF toolkit: https://mega.nz/#!5oQE2Sob!STZHdk4ob...u3WUKkg7iyh2EM

Needs MSYS2. It's mostly a hack job.
Don't mess with the directory structure. Images to convert to AVIF go in "images", AVIF files to convert to PNG go in "contained".
For now, the script assumes that all AVIF images use the BT.709 matrix for YUV -> RGB conversion.

Usage:
  • encode.MT.sh takes a quality setting for aomenc as input
  • encode.ST.sh takes an image path and a quality setting as input
  • decode.MT.sh takes no arguments
  • decode.ST.sh takes the AVIF file path

ST and MT stand for Single Thread and Multi Thread, respectively. It would be more correct to say multi-process, but you get the idea. The default is 6 processes; you can change it by modifying xargs' -P option in the MT script.

The defaults are for high quality (--cpu-used=0 for extra overkill), so it might take a while to convert everything depending on your machine. You can tweak aomenc's options in encode.ST.sh

Also included is a Python script that computes the weighted average of the metrics from a given stats file (auto-generated by encode.MT.sh).
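
For readers who want to see what such a weighted average involves, here is a minimal sketch. It is not the script from the toolkit: the stats format (one "name weight value" line per image), the idea of using something like the pixel count as the weight, and the function name weighted_average are all assumptions made for illustration.

```python
def weighted_average(lines):
    """Return sum(weight * value) / sum(weight).

    Each line is assumed to look like 'name weight value'
    (a hypothetical format, not the toolkit's real stats layout).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # rsplit keeps file names that contain spaces intact
        _name, weight, value = line.rsplit(maxsplit=2)
        weight = float(weight)
        total_weight += weight
        weighted_sum += weight * float(value)
    if total_weight == 0:
        raise ValueError("no weighted samples found")
    return weighted_sum / total_weight


# Made-up sample data: image name, weight (e.g. pixel count), metric value
sample = [
    "a.png 1000 40.0",
    "b.png 3000 44.0",
]
print(weighted_average(sample))
```

Weighting by size means a large image influences the result more than a thumbnail does, which a plain mean would not capture.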
Old 23rd December 2018, 17:24   #1343  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
Quote:
Originally Posted by Wolfberry View Post
  • zscale (libzimg) should be preferred (personal opinion)
While I do use libzimg from time to time, it has a tendency to crash randomly (esp. when downscaling video), so I don't want it in an automated workflow.

For the mod2 thing: setting YUV444 just for that looks way too wasteful unless you already have to preserve zones of high contrast like in the DDMC.png example. So maybe something like scale=-2:ih to set the width to a multiple of 2?

Last edited by SmilingWolf; 23rd December 2018 at 18:37.
Old 23rd December 2018, 17:47   #1344  |  Link
lvqcl
Registered User
 
Join Date: Aug 2015
Posts: 293
I decided to test how my old CPU (Intel Core2 Quad Q9300, SSE4.1) decodes AV1 using the ffmpeg from SmilingWolf's build.

Test video: https://www.youtube.com/watch?v=PiWyCQV52h0 , 1280x720p.

-c:v libaom-av1: 1.89x realtime, utime = 94 sec.
-c:v libdav1d: 1.30x realtime, utime = 158 sec.
-c:v libdav1d -threads 4 -tilethreads 4: 1.67x realtime, utime = 159 sec.
-c:v libdav1d -threads 8 -tilethreads 8: 1.74x realtime, utime = 163 sec.
-c:v libdav1d -threads 16 -tilethreads 16: 2.02x realtime, utime = 161 sec.

(I have no idea what the threads and tilethreads options do, so I just tested various values for them)

So, on my system dav1d needs about 160/94 = 1.7 times as much CPU time as aom.
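
The same ratio can be checked for every run listed above. A small sketch that uses only the utime figures from this post (the dictionary layout and variable names are just for illustration):

```python
# utime in seconds, copied from the results listed above
utime = {
    "libaom-av1": 94,
    "libdav1d": 158,
    "libdav1d -threads 4 -tilethreads 4": 159,
    "libdav1d -threads 8 -tilethreads 8": 163,
    "libdav1d -threads 16 -tilethreads 16": 161,
}

baseline = utime["libaom-av1"]

# CPU time of each run relative to the libaom run
ratios = {name: seconds / baseline for name, seconds in utime.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the CPU time of libaom")
```

The thread settings change the realtime factor but barely move utime, so the ratio stays close to 1.7 in every dav1d run.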
Old 24th December 2018, 12:16   #1345  |  Link
MoSal
Registered User
 
Join Date: Jun 2013
Posts: 95
Quote:
Originally Posted by lvqcl View Post
(I have no idea what the threads and tilethreads options do, so I just tested various values for them)
Try with tilethreads set to 1.
__________________
https://github.com/MoSal
Old 24th December 2018, 18:01   #1346  |  Link
lvqcl
Registered User
 
Join Date: Aug 2015
Posts: 293
-c:v libdav1d: 1.31x realtime, utime = 156 sec.
-c:v libdav1d -threads 4 -tilethreads 1: 1.31x realtime, utime = 157 sec.
-c:v libdav1d -threads 8 -tilethreads 1: 1.61x realtime, utime = 158 sec.
-c:v libdav1d -threads 16 -tilethreads 1: 1.80x realtime, utime = 159 sec.
-c:v libdav1d -threads 32 -tilethreads 1: 1.98x realtime, utime = 160 sec.
Old 24th December 2018, 18:22   #1347  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,959
@lvqcl
What is "utime"? It doesn't look like decoding time.
Old 24th December 2018, 20:20   #1348  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
What's the progress on SSSE3 assembly optimizations in dav1d?

Is dav1d still relying on AVX2 only?
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 24th December 2018, 20:37   #1349  |  Link
lvqcl
Registered User
 
Join Date: Aug 2015
Posts: 293
ffmpeg prints something like this:
Quote:
bench: utime=178.762s stime=2.839s rtime=58.362s
IIUC:
utime = total time spent on user code (across all CPU cores)
stime = total time spent on system code
rtime = "real time" aka wall time

So: it took 58.362 seconds to decode a video, but CPU time spent on decoding was 178.762+2.839 sec.
That is, (178.762+2.839)/58.362 = 3.1 cores were active (on average) during decoding.
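
The arithmetic above can be reproduced with a short sketch that parses the bench line quoted in this post. The helper names parse_bench and average_active_cores are made up for illustration, and the regular expression assumes the line has exactly the shape shown above.

```python
import re


def parse_bench(line):
    """Extract utime/stime/rtime (in seconds) from a 'bench:' line."""
    fields = dict(re.findall(r"(utime|stime|rtime)=([0-9.]+)s", line))
    return {key: float(value) for key, value in fields.items()}


def average_active_cores(bench):
    """Total CPU time (user + system) divided by wall-clock time."""
    return (bench["utime"] + bench["stime"]) / bench["rtime"]


bench = parse_bench("bench: utime=178.762s stime=2.839s rtime=58.362s")
print(f"{average_active_cores(bench):.1f} cores busy on average")
```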
Old 25th December 2018, 01:55   #1350  |  Link
Wolfberry
Helenium(Easter)
 
Join Date: Aug 2017
Location: Hsinchu, Taiwan
Posts: 99
@NikosD

SSSE3: issue #216 (7 / 28)

AVX2: issue #78 (9 / 52)
__________________
Monochrome Anomaly
Old 25th December 2018, 08:46   #1351  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
So AVX2 is missing only 4:4:4 and SVC, but SSSE3 is missing (almost) everything for 8-bit.

Thank you!
Old 25th December 2018, 19:43   #1352  |  Link
utack
Registered User
 
Join Date: Apr 2018
Posts: 63
Did a quick test for some typical "sent by phone video". Shot on a phone, 30s, medium resolution and bitrate
x264 crf 26 and "placebo" to get a bitrate estimate for medium-poor quality, libaom cpu-used=3 in 2pass mode to match the bitrate and compare
Turns out for this medium resolution (720p), and with a lot of "high frequency" motion (water waves and grass) x264 is still extremely competitive, and imho even beats libaom here in 1/3 screenshots
http://screenshotcomparison.com/comparison/126513
Old 29th December 2018, 11:28   #1353  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,120
It looks like some new SSSE3 optimisations for Dav1d have been submitted: https://code.videolan.org/videolan/d...720bcf4961aaba
Old 30th December 2018, 12:04   #1354  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,120
What AV1 decoder does the latest MPC-BE x64 (v1.5.3 4246 beta) use? When playing a 720p 25fps AV1 video it uses up to 61% CPU on my Kaby Lake i3-7100U, whereas VLC 3.0.5 (which uses dav1d) uses up to 35% playing the same video.
Old 30th December 2018, 20:12   #1355  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,959
@hajj_3
MPC-BE used libaom git-v1.0.0-748-g8048e8c0b.
https://sourceforge.net/p/mpcbe/code...e/trunk/lib64/
Old 3rd January 2019, 09:16   #1356  |  Link
marcomsousa
Registered User
 
Join Date: Jul 2018
Posts: 80
Quote:
Originally Posted by v0lt View Post
@hajj_3
MPC-BE used libaom git-v1.0.0-748-g8048e8c0b.
https://sourceforge.net/p/mpcbe/code...e/trunk/lib64/
Just updated to libaom git-v1.0.0-1116-g00c80e6b5 (3 hours ago); the next build will include it.
__________________
AV1 win64 VS2019 builds
Last build here
Old 4th January 2019, 20:45   #1357  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by utack View Post
Did a quick test for some typical "sent by phone video". Shot on a phone, 30s, medium resolution and bitrate
x264 crf 26 and "placebo" to get a bitrate estimate for medium-poor quality, libaom cpu-used=3 in 2pass mode to match the bitrate and compare
Turns out for this medium resolution (720p), and with a lot of "high frequency" motion (water waves and grass) x264 is still extremely competitive, and imho even beats libaom here
Water waves and grass are really hard to encode, and classic per-frame PSNR or SAD style optimizations don't yield good results. It takes a lot of psychovisual tuning to keep the motion looking natural without block-based basis patterns leaking in. And a lot of rate control to keep a part of a frame with that content looking good without sucking all the bits away from the rest of the frame and making it look bad.

That's the kind of stuff that comes from a mature encoder with lots of psychovisual tweaks. Which defines x264 in spades, and which x265 inherited a lot of. The real-world performance of those encoders has more to do with the foundational legacy of loving obsessive attention from quality @ bitrate obsessed video pirates than any particular underlying bitstream features. I bet an x262 could have outperformed any MPEG-2 encoder for anime DVDs, for example.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 4th January 2019, 23:13   #1358  |  Link
utack
Registered User
 
Join Date: Apr 2018
Posts: 63
Quote:
Originally Posted by benwaggoner View Post
Water waves and grass are really hard to encode, and classic per-frame PSNR or SAD style optimizations don't yield good results. It takes a lot of psychovisual tuning to keep the motion looking natural without block-based basis patterns leaking in. And a lot of rate control to keep a part of a frame with that content looking good without sucking all the bits away from the rest of the frame and making it look bad.

That's the kind of stuff that comes from a mature encoder with lots of psychovisual tweaks. Which defines x264 in spades, and which x265 inherited a lot of. The real-world performance of those encoders has more to do with the foundational legacy of loving obsessive attention from quality @ bitrate obsessed video pirates than any particular underlying bitstream features. I bet an x262 could have outperformed any MPEG-2 encoder for anime DVDs, for example.
Thanks for your insight.
Would you attribute this mostly to excellent psychovisual tuning or are video streams of small dimensions with a lot of motion and 4x4 blocks areas where AV1 might always be much better even in a theoretical best case scenario?
Old 5th January 2019, 21:10   #1359  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by utack View Post
Thanks for your insight.
Would you attribute this mostly to excellent psychovisual tuning or are video streams of small dimensions with a lot of motion and 4x4 blocks areas where AV1 might always be much better even in a theoretical best case scenario?
H.264 and HEVC both have 4x4 blocks as well, so that feature alone isn’t going to be make-or-break.

As for comparing formats, the codec specs are like what you have in your fridge. The encoder is like the cook. A great cook can make simple ingredients into something wonderful, and a terrible cook can make a disaster out of the best ingredients. A great cook with a wide variety of great ingredients is what gives the optimal results.

In comparing codecs, all we really can compare is the dishes that come out of the kitchen, though. Is a meal great or bad due to the cook or the ingredients? It’s hard to say and involves a lot of educated guesses and speculation.

For example, x264 with --preset placebo --tune film is probably going to produce better quality @ bitrate with typical content than libaom at its absolute fastest settings. It’s really quality @ perf @ bitrate, and that’s controlled by encoder optimization even more than the bitstream format. Stuff like AVX2 optimization will produce better quality within a given bitrate @ perf, because more options get tried and tools get used. And that’s with absolutely no change to psychovisual tuning or bitstream. It’s just the same results, faster.

Of course even that can be impacted by bitstream details. The bigger block sizes of HEVC mean that AVX2 and AVX512 offer bigger gains than with x264. Even choice of processor can change relative quality @ perf @ bitrate, as different encoders make better or worse use of lots of cores or more advanced SIMD.

We can only really know how “good” HEVC or AV1 or VVC are based on the best available encoder for a given use case. And that can be hard to predict. Certainly cable MPEG-2 is a lot more efficient today than anyone predicted or could demonstrate when MPEG-2’s spec was finished.
Old 7th January 2019, 13:08   #1360  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,120
http://www.streamingmedia.com/Articl...nt-128870.aspx