Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#1221 | Link | |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 5,905
|
Quote:
|
|
#1222 | Link |
Registered User
Join Date: May 2014
Posts: 170
|
ffmpeg -hide_banner -t 10 -c:v libaom-av1 -i 1.mp4 -benchmark -f null - (43 fps)
ffmpeg -hide_banner -t 10 -c:v libdav1d -i 1.mp4 -benchmark -f null - (52 fps)
ffmpeg -hide_banner -t 10 -c:v libdav1d -threads 1 -tilethreads 2 -i 1.mp4 -benchmark -f null - (61 fps)
ffmpeg -hide_banner -t 10 -c:v libdav1d -threads 2 -tilethreads 2 -i 1.mp4 -benchmark -f null - (65 fps) |
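To put the four results above side by side, here is a minimal Python sketch that rebuilds the same command lines and expresses each reported frame rate relative to the libaom decode. The helper names `decode_benchmark_cmd` and `speedup` are made up for illustration; the options are copied from the post, and whether a given ffmpeg build accepts `-tilethreads` is not checked here.

```python
def decode_benchmark_cmd(decoder, source="1.mp4", seconds=10, extra=()):
    """Build an ffmpeg command that decodes `source` and discards the output.

    `extra` holds decoder options (e.g. thread settings); they are placed
    before -i so that they apply to the input, as in the post.
    """
    return ["ffmpeg", "-hide_banner", "-t", str(seconds),
            "-c:v", decoder, *extra,
            "-i", source, "-benchmark", "-f", "null", "-"]


def speedup(fps, baseline_fps):
    """Ratio of a measured frame rate to the baseline frame rate."""
    return fps / baseline_fps


# Frame rates exactly as reported in the post (not re-measured here).
REPORTED_FPS = [
    ("libaom-av1", (), 43),
    ("libdav1d", (), 52),
    ("libdav1d", ("-threads", "1", "-tilethreads", "2"), 61),
    ("libdav1d", ("-threads", "2", "-tilethreads", "2"), 65),
]

if __name__ == "__main__":
    baseline = REPORTED_FPS[0][2]
    for decoder, extra, fps in REPORTED_FPS:
        cmd = " ".join(decode_benchmark_cmd(decoder, extra=extra))
        print(f"{speedup(fps, baseline):.2f}x  {cmd}")
```

With the numbers from the post, the fastest dav1d configuration comes out at roughly 1.5 times the libaom decode speed.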
#1224 | Link | |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 92
|
Quote:
Still, MMX is hardly relevant nowadays. SSE4.1 as the lowest bar doesn't sound too unreasonable.
Also relevant: https://code.videolan.org/videolan/d.../15#note_22262
Last edited by SmilingWolf; 11th November 2018 at 18:34. |
|
#1225 | Link | ||
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,708
|
Quote:
Quote:
SSEx should be the baseline, as it is 128-bit and has a very fast implementation on all CPUs of the last 10 years. SSE2 in particular is mandatory for the x64 architecture. From the last link it's obvious that the dav1d developers targeted AVX2 for 256-bit acceleration in ASM, but not exclusively: they are going to optimise for SSEx later. So no worries, I think.
__________________
Win 10 x64 (18363.476) - Core i3-9100F - nVidia 1660 (441.41) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
||
#1227 | Link |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 428
|
You can also usually safely target SSE3 (no, not SSSE3) as well since it's supported on all DDR2-capable 64bit x86 CPUs and newer.
(the only 64bit x86 CPUs that don't support SSE3 are some socket 754 and 939 Athlon 64s which used DDR1) |
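Because SSE3 and SSSE3 are so easy to mix up, here is a small Python sketch that reports which of the levels discussed in this thread a CPU advertises. It assumes a Linux x86 system, where `/proc/cpuinfo` lists SSE3 under the flag name `pni` and SSSE3 as `ssse3`; the function names are hypothetical.

```python
# Map the SIMD levels discussed in the thread to Linux /proc/cpuinfo flag names.
# Note that SSE3 appears as "pni" (Prescott New Instructions), not "sse3".
SIMD_FLAGS = {
    "SSE2": "sse2",
    "SSE3": "pni",
    "SSSE3": "ssse3",
    "SSE4.1": "sse4_1",
    "AVX2": "avx2",
}


def simd_levels(flags):
    """Return the SIMD levels present in a whitespace-separated flag string."""
    present = set(flags.split())
    return [level for level, flag in SIMD_FLAGS.items() if flag in present]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag string of the first CPU listed (Linux only)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    print(simd_levels(read_cpu_flags()))
```

A CPU of the kind described above, with SSE3 but without SSSE3, would list `pni` and no `ssse3`, so `simd_levels` would stop at SSE3.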
#1230 | Link | |
Registered User
Join Date: Jul 2018
Posts: 61
|
SSE3-optimised av1_nn_predict
https://aomedia.googlesource.com/aom...6f313f27b1c501
Quote:
|
|
#1231 | Link |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 428
|
...but this is exactly what I alluded to?
Athlon 64 CPUs are available on socket 754, 939, and AM2; 754 and 939 used DDR1 memory while AM2 used DDR2, and all AM2 CPUs support SSE3. (there are some socket 754 and 939 CPUs that support SSE3, though it's kind of hit and miss). Phenom for reference requires at least DDR2. |
#1234 | Link |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 92
|
Status report!
Previous edition: http://forum.doom9.org/showthread.ph...49#post1852449
Whatever paragraph I don't repeat here can be assumed to be the same as in the aforementioned post.
First of all: graphs!
Y axis: chosen metric
X axis: bits per pixel
720p: [two graphs, images not preserved]
1080p: [two graphs, images not preserved]
BD rates for 720p:
Code:
x264 -> rav1e (yeah you read that right!)
          RATE (%)    DSNR (dB)
MSSSIM    -0.736889   0.0375593
PSNRHVS   -5.5274     0.375081

rav1e -> x265
          RATE (%)    DSNR (dB)
MSSSIM    -26.5291    1.29942
PSNRHVS   -27.1134    1.70509

x265 -> libaom
          RATE (%)    DSNR (dB)
MSSSIM    -18.9088    0.7852
PSNRHVS   -15.3123    0.761791

Code:
x264 -> rav1e (yeah you read that right again!)
          RATE (%)    DSNR (dB)
MSSSIM    -4.92009    0.235151
PSNRHVS   -7.23088    0.473125

rav1e -> x265
          RATE (%)    DSNR (dB)
MSSSIM    -26.7063    1.16103
PSNRHVS   -28.0007    1.53902

x265 -> libaom
          RATE (%)    DSNR (dB)
MSSSIM    -26.486     0.938124
PSNRHVS   -21.7431    0.905916

x264 157-2935-545de2f
x265 2.9-4-471726d3a046
rav1e 0.1.0-702-ab4d23e2
libaom 1.0.0-908-g3a607f7b0

Cmdlines:
x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m
x265 --preset veryslow --tune ssim --crf 16 -o test.x265.crf16.hevc orig.i420.y4m
rav1e --low_latency false -o test.rav1e.cq80.ivf --quantizer 80 -s 2 --tune psnr orig.i420.y4m
aomenc --frame-parallel=0 --tile-columns=3 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --threads=2 --end-usage=q --cq-level=20 --test-decode=fatal -o test.av1.cq20.webm orig.i420.y4m

Notes:
So as you can see, the rav1e and aomenc cmdlines have been slightly adjusted to take advantage of the bugfixes and updates from the last months. In particular, rav1e has been gifted by Frank Bossen the ability to create a B-pyramid, which almost single handedly decreed rav1e's advantage over x264.
A word of warning on this last point: it's still kind of a mixed bag. In very flat, static scenes like PresageFlowerWalk x264 still rules by quite a margin, while rav1e takes the crown in clips like F.Y.C and PresageFlowerFight
Code:
F.Y.C, x264 -> rav1e:
          RATE (%)    DSNR (dB)
MSSSIM    -18.451     1.01281
PSNRHVS   -25.7463    2.03419

PresageFlowerFight, x264 -> rav1e:
          RATE (%)    DSNR (dB)
MSSSIM    -31.4953    1.80761
PSNRHVS   -31.0827    2.27546

PresageFlowerWalk, x264 -> rav1e:
          RATE (%)    DSNR (dB)
MSSSIM    66.2264     -1.70084
PSNRHVS   70.8208     -2.28853

Considerations about times with libaom:
I'm using my desktop PC to run all the encodes. It is also my main study/work PC, so the timings can be quite off. Plus, I run multiple encodes in parallel, which further messes up timings.
HOWEVER, between annoying bugs and a lot of other stuff, the first report cost me nearly a week (this includes having to re-run some encodes because sh*t happened) ONLY to encode with libaom. Taking advantage of the recent bugfixes and improvements, I have been able to rework my workflow and bring that time down to just a couple of days, WITHOUT having to touch the --cpu-used parameter and with no night-time encoding. All in all, I am pretty satisfied.
This concludes my (bi-monthly?) report. As always, I'm open to any kind of feedback to improve my comparisons and my encodes.
Last edited by SmilingWolf; 15th November 2018 at 00:34. |
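For readers unfamiliar with the RATE (%) column: a BD rate is the average bitrate difference between two encoders at equal quality, with negative values meaning the second encoder needs fewer bits. The post does not say which tool produced the figures, so the following is only a simplified Python sketch. It interpolates log-bitrate piecewise-linearly over the quality metric instead of using the cubic fit of the usual Bjøntegaard method, and the sample points in the usage note are made up.

```python
import math


def _log_rate_at(points, quality):
    """Log-bitrate at `quality`, linearly interpolated between measured points.

    `points` is a list of (bitrate, quality) pairs sorted by strictly
    increasing quality; values outside the measured range are clamped.
    """
    if quality <= points[0][1]:
        return math.log(points[0][0])
    if quality >= points[-1][1]:
        return math.log(points[-1][0])
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if q0 <= quality <= q1:
            t = (quality - q0) / (q1 - q0)
            return math.log(r0) + t * (math.log(r1) - math.log(r0))
    raise ValueError("points must be sorted by quality")


def bd_rate(reference, test, samples=1000):
    """Average bitrate change (%) of `test` versus `reference` at equal quality."""
    reference = sorted(reference, key=lambda p: p[1])
    test = sorted(test, key=lambda p: p[1])
    # Only compare over the quality range covered by both curves.
    low = max(reference[0][1], test[0][1])
    high = min(reference[-1][1], test[-1][1])
    if high <= low:
        raise ValueError("the two curves do not overlap in quality")
    total = 0.0
    for i in range(samples):
        # Midpoint sampling of the overlapping quality interval.
        q = low + (high - low) * (i + 0.5) / samples
        total += _log_rate_at(test, q) - _log_rate_at(reference, q)
    return (math.exp(total / samples) - 1.0) * 100.0
```

As a sanity check, an encoder that reaches every quality level at exactly half the bitrate, e.g. `bd_rate([(1000, 30), (2000, 34), (4000, 38)], [(500, 30), (1000, 34), (2000, 38)])`, should come out at about -50.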
#1235 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,984
|
So, what's everyone's favorite AV1 decoder app on Windows? Chrome doesn't appear to be converting from video range to PC range correctly (blacks are washed out, contrast is low, etcetera). Is there a nightly of something that does AV1 correctly, for an apples-to-apples comparison?
|
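Washed-out blacks and low contrast are what limited ("video") range looks like when it is displayed without being expanded to full ("PC") range. A minimal Python sketch of that expansion for 8-bit luma, assuming nominal black at 16 and nominal white at 235; the function name is hypothetical and chroma is left out:

```python
def limited_to_full_luma(y):
    """Expand an 8-bit limited-range luma value (16..235) to full range (0..255)."""
    full = round((y - 16) * 255 / 219)
    # Values below 16 or above 235 can occur in real video; clamp them.
    return max(0, min(255, full))


if __name__ == "__main__":
    for value in (16, 126, 235):
        print(value, "->", limited_to_full_luma(value))
```

If this step is skipped, video black (16) is shown as dark gray and video white (235) as slightly dim white, which matches the symptoms described above.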
#1236 | Link | |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 92
|
Quote:
Alternatively, ffplay for quick stuff when I already have a bunch of command prompts open in the right path.
Last edited by SmilingWolf; 17th November 2018 at 12:30. |
|
#1237 | Link |
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 5,964
|
I use MPC-HC almost exclusively; it uses LAV Filters through a direct API. It was able to play AV1 clips from the YouTube beta playlist and some tiny encodes of my own (I don't have powerful CPUs available). So my experience is limited so far, but it appears to work.
|
#1238 | Link |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 92
|
32/64bits binaries (GCC 9.0):
av1-1.0.0-941-gd2a592e1c: https://mega.nz/#!F5Am2KyK!9aQ6_7mM2...6_OaZahvKCHPWQ |
![]() |
Registered User
Join Date: Jan 2007
Posts: 735
|
You could get the same results by manually splitting the source into X parts and encoding them separately, all at once. I'm not sure how well libvpx/libaom are designed for that. It works great with x264 and x265 (using raw output at least).
|
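The manual split described above can be sketched in a few lines of Python. The sketch divides the frame count into near-equal ranges and builds one x264 command per range using its `--seek` and `--frames` options. The helper names and the chunk file naming are made up, the input file name is borrowed from the command lines earlier in the thread, and the split ignores scene cuts, so treat it as an illustration rather than a finished workflow.

```python
def chunk_ranges(total_frames, parts):
    """Split `total_frames` into up to `parts` (start, length) ranges of near-equal size."""
    base, extra = divmod(total_frames, parts)
    ranges, start = [], 0
    for index in range(parts):
        # The first `extra` chunks get one additional frame each.
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((start, length))
        start += length
    return ranges


def x264_chunk_cmds(source, total_frames, parts):
    """One x264 command per chunk, each writing its own raw .264 file."""
    return [
        ["x264", "--seek", str(start), "--frames", str(length),
         "-o", f"chunk{index:03d}.264", source]
        for index, (start, length) in enumerate(chunk_ranges(total_frames, parts))
    ]


if __name__ == "__main__":
    for cmd in x264_chunk_cmds("orig.i420.y4m", 1000, 4):
        print(" ".join(cmd))
```

Each command can then be launched as its own process and the raw streams joined afterwards. Rate control runs independently per chunk, which is one reason the result may differ from a single-pass encode of the whole clip.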
#1240 | Link | |
Registered User
Join Date: Jan 2007
Posts: 735
|
Quote:
You probably mean SSSE3 (SSS instead of SS), aka "Supplemental SSE3", which is a confusing and dumb name. It probably should have been SSE4 but got renamed for marketing reasons. Or SSE3 was not supposed to be SSE3 originally. SSSE3 is very useful for encoding and decoding, but it only arrives with Core 2 on the Intel side and with Bobcat/Bulldozer and later cores on the AMD side. K10 and K8 end at the not-so-important SSE3. (Note that x265 actually needs SSSE3 + SSE4 to be useful; you are barred from most of the assembly optimizations if you only have SSSE3, as with 65nm Core 2s or pre-Sandy Bridge Pentium/Celeron.)
Last edited by mandarinka; 19th November 2018 at 11:01. |
|