Doom9's Forum > Video Encoding > VP9 and AV1
Old 2nd July 2019, 18:17   #1761  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Blue_MiSfit View Post
Great talk from Ronald. If only I had the time to do an evaluation of Eve_AV1
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 2nd July 2019, 18:19   #1762  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by dapperdan View Post
I think the x265 release used was the last stable release so that doesn't seem too crazy. You could easily cry foul if someone used a non stable git commit and it performed worse than expected due to hitting a bug.
And there weren't any substantial quality improvements in x265 between the Jan 2019 builds and the June 3.1 release.

How the encoders got tuned is what matters. And if quality is being compared at fixed encoding time, performance improvements become quality improvements.
Old 2nd July 2019, 19:31   #1763  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 201
Quote:
Originally Posted by benwaggoner View Post
These are double-blind tests; the people doing it just compare two encodes.
My point still stands for tests that intend to be double-blind since some people's abilities and/or preferences would effectively unblind the test.

Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.

On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.

I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.

I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.
Old 2nd July 2019, 19:54   #1764  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 201
Quote:
Originally Posted by benwaggoner View Post
That said, I've not seen any study demonstrating better subjective quality from a well-tuned libvpx encode versus a well-tuned x265 encode, using the same bitrate @ time.
Have you read the full version of the last MSU subjective comparison? I've only read the free snippet, which doesn't have enough info to say either way, but it's possible that fits the criteria or is at least in the right ballpark, potentially a statistical tie:

http://www.compression.ru/video/code...jective_report

On the other hand, similar to how complaints about electric cars are now "I don't like the minimalism of their touchscreen interfaces" when not too long ago you'd hear how they were physical impossibilities, I think the fact that we're now at this level of complaint for the previous generation of royalty-free codecs is a testament to how far we've come.
Old 2nd July 2019, 21:00   #1765  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 196
Quote:
Originally Posted by benwaggoner View Post
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.
Old 2nd July 2019, 21:39   #1766  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
Re: Eve evaluation, I've never tried, TBH. I've been wanting to spend more time looking at Beamr 5x (fantastic so far!) but have been quite busy.
Old 3rd July 2019, 01:30   #1767  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by dapperdan View Post
Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.
Yep, and then we're back into "Zen and the Art of Motorcycle Maintenance" style philosophical ruminations on the nature and meaning of "quality." Which is unavoidable at a certain point, which is why we try to test for something more specific than just quality. Accuracy to a source and creative intent can be quite different from a no-reference "is this clip pleasing" or "do you see anything wrong with this clip?"

Quote:
On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.
Exactly. And it's not a particularly hard thing to synthesize. It's not like the film grain in Marvel movies is ACTUALLY grain-from-film. It's digitally synthesized. Grain helps make blending in VFX a lot easier, and allows for rendering at 2K instead of 4K.

Quote:
I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.
It depends on what the question they were asking was intended to be. But yeah, having a bunch of video engineers look at something is very different than having the general public look at something, and can provide different (but both useful!) answers. Video engineers are going to pick up on more subtle things, and are going to care about accuracy and creative intent more.

Generally I'll have video experts do an initial pass on something to see "is there something that can be seen here?" and then use double-blind testing with a more general population to confirm details. The second is a LOT slower and more expensive than the first, of course.

Quote:
I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.
Oh, no doubt. I've done that party trick **many** times. x264 versus WMV3 versus VC-1 versus Main Concept versus VP9; it's generally pretty obvious if you've been in the field for a while.
Old 3rd July 2019, 13:45   #1768  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 109
Quote:
Originally Posted by benwaggoner View Post
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.
Have you asked?
Old 3rd July 2019, 22:55   #1769  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by soresu View Post
Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.
To be clear, I speak only for myself, not Amazon, on these forums. I actually tinker with video stuff on my off hours too. I should probably get out more.

Anyway, I requested an optimal Eve encoding for My encoding challenge, but they declined to participate.

It is common for encoder vendors who think they are doing some magic things in the bitstream to want to have the bitstream output under NDA and such. I get the impulse, but it just isn't practical for doing actual comparisons or due diligence evaluation.
Last edited by benwaggoner; 3rd July 2019 at 22:59. Reason: More details about Eve offer
Old 4th July 2019, 08:12   #1770  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 201
When you say they "declined to participate" did they respond and say they didn't want to take part or did you just not hear from them after making a broad request in a forum post?

I believe the comment above yours ("Have you asked?") was written by a developer of EVE, which suggests they didn't know they'd been asked, so possibly an email got lost in a spam trap.
dapperdan is offline   Reply With Quote
Old 4th July 2019, 18:07   #1771  |  Link
Ilya87
Registered User
 
Join Date: Feb 2019
Posts: 3
Hi guys, I've decided to make a comparison of the x264, rav1e and x265 encoders at 500, 600, 700, 800, 900 and 1000 kbit/s with the following settings:

rav1e -b $g --tiles 6 -s 5 --matrix BT470BG /D/sintel/sintel720.y4m --output sintel720_rav1e_s5_$g.ivf
rav1e -b $g --tiles 6 -s 3 --matrix BT470BG /D/sintel/sintel720.y4m --output sintel720_rav1e_s3_$g.ivf
x264 -t 2 -m 11 --me umh --weightp 2 --direct spatial --aq-mode 2 --b-adapt 2 -B $g -b 4 -r 6 -I 240 --b-pyramid normal --no-dct-decimate --no-fast-pskip -A all -o sintel720_x264_$g.264 /D/sintel/sintel720.y4m
x265 /D/sintel/sintel720.y4m --y4m -o "Sintel720_x265_$g.h265" --rd 3 -b 4 --b-adapt 2 --b-pyramid --ref 6 -I 240 --bitrate $g --aq-mode 2 --weightp --weightb -m 2 --no-early-skip --psy-rd 1 --me star
where $g stands for bitrate value
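
For anyone who wants to repeat this over the whole bitrate list, here is a rough Python sketch that expands the settings above into one command line per encoder and bitrate. The flags are the ones listed in the post; the helper name `build_commands` is made up for illustration, each rav1e output name carries its own speed level, and x264 gets the source clip as its last argument (my assumption about how it was invoked). It only prints the commands and does not run any encoder.

```python
import shlex

BITRATES = [500, 600, 700, 800, 900, 1000]  # kbit/s, as listed in the post
SRC = "/D/sintel/sintel720.y4m"             # source clip path from the post


def build_commands(bitrate, src=SRC):
    """Return one argv list per encode for a single bitrate (hypothetical helper)."""
    g = str(bitrate)
    cmds = []
    # rav1e at speed 5 and speed 3; the speed level is part of the output name
    for speed in (5, 3):
        cmds.append(shlex.split(
            f"rav1e -b {g} --tiles 6 -s {speed} --matrix BT470BG {src} "
            f"--output sintel720_rav1e_s{speed}_{g}.ivf"))
    # x264: the input file is passed last (assumption about the invocation)
    cmds.append(shlex.split(
        "x264 -t 2 -m 11 --me umh --weightp 2 --direct spatial --aq-mode 2 "
        f"--b-adapt 2 -B {g} -b 4 -r 6 -I 240 --b-pyramid normal "
        f"--no-dct-decimate --no-fast-pskip -A all -o sintel720_x264_{g}.264 {src}"))
    # x265: same flags and order as in the post
    cmds.append(shlex.split(
        f"x265 {src} --y4m -o Sintel720_x265_{g}.h265 --rd 3 -b 4 --b-adapt 2 "
        f"--b-pyramid --ref 6 -I 240 --bitrate {g} --aq-mode 2 --weightp "
        "--weightb -m 2 --no-early-skip --psy-rd 1 --me star"))
    return cmds


# Print every command so it can be reviewed or pasted into a shell
for rate in BITRATES:
    for argv in build_commands(rate):
        print(shlex.join(argv))
```

Building argv lists instead of shell strings keeps quoting out of the picture if the lists are later handed to `subprocess.run`.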

My OS is Arch Linux x86_64 and my CPU is a Core i5 8600K. rav1e was built recently. For testing, 1191 frames of sintel 1k 16bit (frames 12987 to 14177) were taken and converted to 720x306 yuv420p. x265 and x264 are from the distro's repository.

To measure MS-SSIM and PSNR-HVS-M, daala's tools were used. To measure the VMAF score I used ffmpeg's VMAF filter.

Results:






x265 is a clear winner with 50.59-61.01 fps (lowest to highest bitrate settings)
x264 80.08-102.73 fps
rav1e s3 1.603-2.187 fps
rav1e s5 3.736-4.469 fps

Average CPU utilization of rav1e was 66%-70% (and I couldn't increase it).
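
To put those fps figures in wall-clock terms, a quick Python sketch that converts them to encode time for the 1191-frame clip. The fps pairs are the ranges reported above; the function name `encode_seconds` is only illustrative.

```python
FRAMES = 1191  # clip length given in the post

# The two ends of the fps range reported for each encoder
FPS_RANGE = {
    "x264": (80.08, 102.73),
    "x265": (50.59, 61.01),
    "rav1e s5": (3.736, 4.469),
    "rav1e s3": (1.603, 2.187),
}


def encode_seconds(frames, fps):
    """Wall-clock seconds needed to encode `frames` frames at `fps` frames per second."""
    return frames / fps


for name, (a, b) in FPS_RANGE.items():
    fastest = encode_seconds(FRAMES, max(a, b))
    slowest = encode_seconds(FRAMES, min(a, b))
    print(f"{name:9s} {fastest:7.1f} s to {slowest:7.1f} s")
```

That works out to roughly 12 to 15 seconds per encode for x264 and roughly 9 to 12 minutes for rav1e at speed 3.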

Last edited by Ilya87; 4th July 2019 at 23:42.
Old 4th July 2019, 22:44   #1772  |  Link
marcomsousa
Registered User
 
Join Date: Jul 2018
Posts: 80
Intel SVT-AV1 0.6 Released With AV1 Decoding, SIMD Optimizations

https://www.phoronix.com/scan.php?pa...1-0.6-Released
__________________
AV1 win64 VS2019 builds
Last build here
Old 4th July 2019, 23:35   #1773  |  Link
Ilya87
Registered User
 
Join Date: Feb 2019
Posts: 3
Quote:
Originally Posted by marcomsousa View Post
Intel SVT-AV1 0.6 Released With AV1 Decoding, SIMD Optimizations

https://www.phoronix.com/scan.php?pa...1-0.6-Released
Dimensions that are only a multiple of 2 are still not supported, only multiples of 8. It still segfaults, and there are many other bugs.
Old 7th July 2019, 20:23   #1774  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Ilya87 View Post
Dimensions that are only a multiple of 2 are still not supported, only multiples of 8. It still segfaults, and there are many other bugs.
It is a 0.6 release. I'd expect a smaller set of limitations like that and other issues will still be in 0.7.
Old 8th July 2019, 05:39   #1775  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
While now a version old, Phoronix tested (on Linux) the encoding performance of SVT-AV1 v0.5 on the new 3rd gen AMD Ryzen 8core (3700X) and 12core (3900X) chips compared to existing Intel CPUs (primarily the 8core 9900K and 16core 7960X):

https://www.phoronix.com/scan.php?pa...0x-linux&num=4
__________________
____HTPC____  | __Desktop PC__
2.93GHz Xeon x3470 (4c/8t Nehalem) | 4.5GHz 1.24v dual-core Haswell G3258
Radeon HD5870  | Intel iGPU      
2x2GB+2x1GB DDR3-1333 | 4x4GB DDR3-1600       
Old 9th July 2019, 05:46   #1776  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 196
Well, there goes my bank account down the tubes after seeing those Ryzen 3000 results. Roll on September so I can become poor and happy with my 3950X.
Old 10th July 2019, 17:38   #1777  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Nintendo Maniac 64 View Post
While now a version old, Phoronix tested (on Linux) the encoding performance of SVT-AV1 v0.5 on the new 3rd gen AMD Ryzen 8core (3700X) and 12core (3900X) chips compared to existing Intel CPUs (primarily the 8core 9900K and 16core 7960X):

https://www.phoronix.com/scan.php?pa...0x-linux&num=4
Wow, interesting result! And no way did Intel worry about AMD optimizations when compiling SVT. I wonder what the comparison between CPU-tuned x265 and libaom would be like, which should tilt more in AMD's favor.

I note that the top Intel processor used has only half the cores as the top AMD, so this difference could easily be due to multithreading more than per-core performance improvements. But that in no way invalidates the price/performance delta.

Also, an AV1 encoder that's running only ~2.5x slower than an HEVC encoder! Of course, I have no idea if the output quality is similar. As always, the key metric is quality @ bitrate @ performance.
Last edited by benwaggoner; 10th July 2019 at 17:39. Reason: Used wrong word
Old 10th July 2019, 17:45   #1778  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Also, I note that the Intel processor used in comparison is from 2017. The current equivalent would probably be the i9-9980XE, which has two more cores and a 7% faster clock. That would probably have similar SVT performance to the Threadripper. At more than 2x the price, though (although for an encoding workstation/instance, the CPU is typically less than half the cost).
Old 10th July 2019, 20:18   #1779  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 95
Status report!
"Yes I keep tweaking the params" edition

1st edition: https://forum.doom9.org/showthread.p...49#post1852449
2nd edition: https://forum.doom9.org/showthread.p...87#post1857587
3rd edition: https://forum.doom9.org/showthread.p...75#post1860475
4th edition: https://forum.doom9.org/showthread.p...39#post1871939
Whatever paragraph I don't repeat here can be assumed to be the same as in the aforementioned post

First of all: graphs!

Y axis: chosen metric
X axis: bits per pixel
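
For reference, a minimal Python sketch of how a bits-per-pixel value can be computed for one encode. This is the usual definition (total bits divided by total luma pixels); the exact formula behind the graphs is not stated in the post, so treat it as an assumption. The function name and the example numbers are illustrative only.

```python
def bits_per_pixel(file_size_bytes, width, height, frame_count):
    """Total encoded bits divided by the total number of luma pixels in the clip."""
    total_bits = file_size_bytes * 8
    total_pixels = width * height * frame_count
    return total_bits / total_pixels


# Illustrative numbers, not taken from the tests above:
# a 1,152,000-byte encode of 100 frames at 1280x720
print(bits_per_pixel(1_152_000, 1280, 720, 100))
```

Normalizing by pixel count is what lets clips of different length and resolution share one X axis.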

720p:


1080p:


BD rates for 720p:
Code:
Codecs ladder:              |  x264 relative:
x264 -> svtav1              |  x264 -> svtav1
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -10.5381 0.426713   |   MSSSIM -10.5381 0.426713
PSNRHVS -11.296  0.557542   |  PSNRHVS -11.296  0.557542
  HVMAF -19.6867 0.689824   |    HVMAF -19.6867 0.689824
----------------------------|-----------------------------
svtav1 -> vp9               |  x264 -> vp9
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -12.4136 0.464516   |   MSSSIM -24.2802 1.23124
PSNRHVS -13.288  0.615572   |  PSNRHVS -25.1991 1.68477
  HVMAF -14.5152 0.598246   |    HVMAF -26.3686 2.81799
----------------------------|-----------------------------
vp9 -> x265                 |  x264 -> x265
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -1.73618 0.0667664  |   MSSSIM -26.2541 1.24552
PSNRHVS -6.07444 0.298073   |  PSNRHVS -30.4815 1.87719
  HVMAF -9.04578 0.359953   |    HVMAF -31.4265 3.28152
----------------------------|-----------------------------
x265 -> av1                 |  x264 -> av1
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -20.8531 0.881529   |   MSSSIM -39.9238 2.1343
PSNRHVS -16.9627 0.860883   |  PSNRHVS -40.3335 2.76154
  HVMAF -23.5865 1.00102    |    HVMAF -48.1341 3.64521
BD rates for 1080p:
Code:
Codecs ladder:              |  x264 relative:
x264 -> svtav1              |  x264 -> svtav1
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -14.3136 0.452642   |   MSSSIM -14.3136 0.452642
PSNRHVS -10.1078 0.374405   |  PSNRHVS -10.1078 0.374405
  HVMAF -20.4048 0.58988    |    HVMAF -20.4048 0.58988
----------------------------|-----------------------------
svtav1 -> vp9               |  x264 -> vp9
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -19.1279 0.563386   |   MSSSIM -34.6951 1.70828
PSNRHVS -21.5428 0.778635   |  PSNRHVS -33.6391 2.16168
  HVMAF -21.4399 0.750138   |    HVMAF -34.3162 3.93015
----------------------------|-----------------------------
vp9 -> x265                 |  x264 -> x265
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM 8.56339 -0.282927   |   MSSSIM -30.5146 1.24699
PSNRHVS 3.02814 -0.139956   |  PSNRHVS -32.9536 1.71646
  HVMAF -3.70741 0.0299945  |    HVMAF -35.6727 3.2304
----------------------------|-----------------------------
x265 -> av1                 |  x264 -> av1
        RATE (%) DSNR (dB)  |          RATE (%) DSNR (dB)
 MSSSIM -28.044  1.00637    |   MSSSIM -47.6676 2.30149
PSNRHVS -23.4583 0.991831   |  PSNRHVS -45.8303 2.79923
  HVMAF -26.6387 0.978822   |    HVMAF -51.9814 3.88658
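
For readers unfamiliar with the RATE (%) column: a BD rate is the average bitrate difference between two rate/quality curves over the quality range they share. Below is a simplified pure-Python sketch of the idea. It interpolates log(bitrate) piecewise-linearly over quality instead of using the cubic fit of the classic Bjontegaard method, so it is a stand-in for illustration and not the tool used for the tables above. All function names are made up, and the sample points are invented, not measurements.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the curve")


def _mean_log_rate(points, lo, hi):
    """Average of log(bitrate) over the quality interval [lo, hi]."""
    pts = sorted((q, math.log(r)) for r, q in points)
    qs = [q for q, _ in pts]
    logs = [l for _, l in pts]
    # Integrate exactly with the trapezoid rule between curve breakpoints
    knots = [lo] + [q for q in qs if lo < q < hi] + [hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        area += 0.5 * (_interp(a, qs, logs) + _interp(b, qs, logs)) * (b - a)
    return area / (hi - lo)


def bd_rate(anchor, test):
    """Percent bitrate change of `test` versus `anchor` at equal quality.

    Each curve is a list of (bitrate, quality) points with distinct quality
    values. A negative result means `test` needs fewer bits.
    """
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Invented example: the test encoder reaches each quality level with 20% fewer bits
anchor = [(1000, 30.0), (2000, 35.0), (4000, 40.0)]
test = [(800, 30.0), (1600, 35.0), (3200, 40.0)]
print(round(bd_rate(anchor, test), 2))
```

Read this way, a value such as -28.044 in the x265 -> av1 row means that, averaged over the shared quality range, the second encoder needed about 28% less bitrate for the same MSSSIM score.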
Encoders:
x264 157-2970-5493be8
x265 3.1-4-4f6dde51a5db
libvpx-vp9 1.8.0-591-g19bda215d
SVT-AV1 0.6.0-1424-8977f443
libaom 1.0.0-2036-ge2c1d5ef8

Cmdlines:
x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m
x265 --preset veryslow --tune ssim --crf 16 -o test.x265.crf16.hevc orig.i420.y4m
vpxenc --codec=vp9 --frame-parallel=0 --tile-columns=0 --auto-alt-ref=6 --good --cpu-used=0 --tune=psnr --passes=2 --threads=1 --end-usage=q --cq-level=20 --test-decode=fatal --ivf -o test.vp9.cq20.ivf orig.i420.y4m
SvtAv1EncApp.exe -i orig.i420.yuv -b test.svtav1.cq20.ivf -w 1280 -h 720 -q 20 -enc-mode 3 -fps-num 24000 -fps-denom 1001 -intra-period 23
aomenc --frame-parallel=0 --tile-columns=0 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --threads=2 --row-mt=1 --end-usage=q --cq-level=20 --test-decode=fatal -o test.av1.cq20.webm orig.i420.y4m
VMAF: model used: vmaf_b_v0.6.3, pooling: harmonic_mean, bagging score (arithmetic mean of 21 models' scores)
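
As a rough illustration of what those pooling settings mean, here is a small Python sketch: per-frame scores are pooled with a harmonic mean, and the final score is the arithmetic mean of the per-model results (21 models according to the line above). The +1 offset inside the harmonic mean is how I understand the VMAF tooling avoids dividing by zero, so treat that detail as an assumption. The function names and sample scores are invented.

```python
def harmonic_mean_pool(frame_scores):
    """Harmonic mean of per-frame scores, offset by 1 so a score of 0 is allowed."""
    n = len(frame_scores)
    return n / sum(1.0 / (s + 1.0) for s in frame_scores) - 1.0


def bagged_score(per_model_scores):
    """Arithmetic mean of the pooled scores from each bagged model."""
    return sum(per_model_scores) / len(per_model_scores)


# Invented per-frame scores: one bad frame drags the harmonic mean down
# much more than it would drag an arithmetic mean
frames = [95.0, 94.0, 96.0, 40.0]
print(harmonic_mean_pool(frames), sum(frames) / len(frames))
```

The point of harmonic pooling is that short quality drops are penalized more heavily than a plain average would.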

Notes:
TearsOfSteel720 and TheFifthElement, two clips in the 720p category, had a vertical resolution incompatible with SvtAv1EncApp (not divisible by 8).
They have been padded to 1280x536, so they have been included in this round of measurements again.
Meanwhile, rav1e still has a nasty bug that makes it bloat encodes, causing up to a 25% BD-rate regression, so it has been excluded from this edition.
Again, no time infos because I use the PC while it encodes etc. etc.
If somebody REALLY wants some encoding time infos I can run a battery of encodes under ideal conditions on my favourite 1080p clip (PresageFlowerFight) and report the stats in a followup post (ping @benwaggoner)
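
On the padding mentioned in the notes: a tiny Python sketch of rounding a dimension up to the next multiple of 8, which is the constraint described for SvtAv1EncApp. The function name is made up, and the 534 input is my assumption about the original height of the clips that ended up at 1280x536.

```python
def pad_to_multiple(size, multiple=8):
    """Round `size` up to the nearest multiple of `multiple`."""
    return -(-size // multiple) * multiple  # ceiling division, then scale back up


print(pad_to_multiple(534))  # assumed original height of the padded clips
print(pad_to_multiple(720))  # already a multiple of 8, stays unchanged
```

Padding changes the pixel count slightly, so bits-per-pixel figures for padded clips are not exactly comparable with the unpadded encodes from the other encoders.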

This concludes this report.
As always, I'm open to any kind of feedback to improve my comparisons and my encodes.

Last edited by SmilingWolf; 10th July 2019 at 20:25.
Old 10th July 2019, 20:41   #1780  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
Quote:
Originally Posted by benwaggoner View Post
I note that the top Intel processor used has only half the cores as the top AMD, so this difference could easily be due to multithreading more than per-core performance improvements.
Keep in mind that the 2990WX uses a very nontraditional CPU die topology that makes it more akin to something like a dual-socket system with two full CPUs. It's so nontraditional that you basically need to use Linux to get any semblance of good performance at all (Wendell from level1techs did a good analysis on the subject in this video here).

You can also see from the results that even normal Threadripper like the 12core 2920X (which still uses a somewhat nonstandard die configuration) is getting beaten by the 9900K and 3700X which both use a very traditional CPU core configuration by comparison (one could even argue that the separate I/O die on the 3700X is actually more traditional and is akin to the days of northbridges and external memory controllers ala Athlon XP and Core 2 Duo).


Nevertheless, there could very well be a point of diminishing returns in terms of multicore scalability for SVT-AV1, such that 32c/64t just isn't seeing the utilization that it otherwise could, and even more so with the nontraditional core arrangement of the 2990WX.


Quote:
Originally Posted by benwaggoner View Post
Also, I note that the Intel processor used in comparison is from 2017.
While true, keep in mind that the per-GHz performance on Intel has not changed at all and won't change until their 10nm parts.

Quote:
Originally Posted by benwaggoner View Post
The current equivalent would probably be the i9-9980XE, which has two more cores and a 7% faster clock.
The i9-7960X was not the flagship part of its generation - there was in fact an 18core 7980XE during that gen as well (albeit with a bit lower base clock).

This tells me that Phoronix wasn't actually trying to use the highest-end Intel CPU parts that are available, even within a given CPU generation.