12th December 2018, 12:02 | #1281 | Link | ||
Registered User
Join Date: May 2014
Posts: 292
|
ffmpeg-4.2-92396-g55e021f39b - libaom 1.0.0-902-g03d8ebedc - libdav1d 58fc516 Quote:
ffmpeg-4.2-92681-0e833f6 - libaom 1.0.0-1028-78e6b2c - libdav1d 0.1.0 73067e5 Quote:
Last edited by Gravitator; 12th December 2018 at 12:08. |
||
12th December 2018, 12:50 | #1282 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
|
Quote:
If you don't have AVX2, the decoder is still being bottlenecked quite heavily, and also won't thread quite as nicely because reference frames take too long to decode, for example. The SSSE3 work is still at early stages - if you look at the ticket linked above, only a small part of assembly has been covered in SSSE3 yet.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders |
|
12th December 2018, 14:39 | #1283 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
An SSSE3 code base is fundamental because it's the first instruction set supported by all Core 2 Duo CPUs and above (not Pentium 4), and it's also very useful for decoding (at least on previous codecs like H.264/H.265).
But I don't know if they want to go back to even older instruction sets and CPUs like SSE2. We'll see.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
12th December 2018, 15:24 | #1285 | Link | |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 95
|
Quote:
|
|
12th December 2018, 15:30 | #1286 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
|
Quote:
And in all honesty, if you bought a K10 in 2012 or anywhere near that, you just did it wrong, even in the low-end market. Intel introduced SSSE3 all the way back in 2006, after all. It's hardly "new" even in 2012. Ultimately it's up to the developers how they want to spend their time, but as mentioned in the ticket linked above, pure SSE2 is often a lot more painful to write than code that uses the SSSE3 enhancements.
Last edited by nevcairiel; 12th December 2018 at 15:38. |
|
12th December 2018, 15:38 | #1287 | Link | |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
So, SSSE3 is the minimum. A bit of a pity for AMD CPUs.
|
12th December 2018, 15:39 | #1288 | Link |
Registered User
Join Date: Dec 2002
Posts: 5,565
|
Steam HW Survey says 3% don't have SSSE3, only 0.01% don't have SSE3.
https://store.steampowered.com/hwsurvey |
12th December 2018, 15:43 | #1289 | Link |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
|
SSE3 (without the third S) is mostly useless for video. It's primarily floating-point.
For video, which needs integer instructions, you only have a few meaningful steps (everything left out, like SSE3, AVX1, etc., is mostly floating-point or otherwise not related):
- MMX
- SSE2
- SSSE3
- SSE4.1
- AVX2
- AVX512
Obviously no one cares about MMX anymore. SSE4.1 is only useful in special cases. And obviously AVX512 is not rolled out widely enough yet, and perhaps not even understood widely enough; maybe in a few years. So, by and large, that leaves SSE2, SSSE3, AVX2. The difference between SSE2 and SSSE3 is not gigantic (same 128-bit registers, after all); SSSE3 only adds a bunch of new instructions, but some of those are really useful and make code much simpler and easier to write.
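To make the tiering above concrete, here is a minimal Python sketch of how a decoder might pick a code path from the CPU's reported features. It is only an illustration of the idea, not how dav1d or ffmpeg actually dispatch: the function name `best_tier`, the flag spellings, and the `"c"` fallback label are all assumptions made up for this example.

```python
# Integer SIMD steps relevant to video, in ascending order, as listed in the post.
TIERS = ["mmx", "sse2", "ssse3", "sse4.1", "avx2", "avx512"]

# The steps the post singles out as the ones that matter in practice.
PRACTICAL = ["sse2", "ssse3", "avx2"]


def best_tier(cpu_flags, candidates=PRACTICAL):
    """Return the highest tier in `candidates` that the CPU reports.

    `cpu_flags` is a hypothetical collection of feature names; real
    detection (CPUID, /proc/cpuinfo, ...) uses different spellings.
    Falls back to "c" (plain C code) when no candidate tier is available.
    """
    flags = {flag.lower() for flag in cpu_flags}
    for tier in reversed(candidates):
        if tier in flags:
            return tier
    return "c"


# A K10-style CPU: has SSE3 but not SSSE3, so it lands on the SSE2 path.
print(best_tier({"mmx", "sse2", "sse3"}))            # sse2
# A Core 2-style CPU with SSSE3 and SSE4.1 but no AVX2.
print(best_tier({"mmx", "sse2", "ssse3", "sse4.1"}))  # ssse3
```

Note that SSE3 in the flag set changes nothing here, which mirrors the point that it adds little for integer video code.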
Last edited by nevcairiel; 12th December 2018 at 15:48. |
12th December 2018, 16:33 | #1290 | Link |
*****
Join Date: Feb 2005
Posts: 5,647
|
The optimizations in Dav1d are currently mostly for 8-bit only. So for 10-bit libaom may still be faster.
Development pace in Dav1d is pretty high, so we will have a fast decoder long before there is actual widespread AV1 content (beyond the current demo files and a few Youtube videos). |
12th December 2018, 16:34 | #1291 | Link | |
Registered User
Join Date: Dec 2008
Posts: 1,968
|
I tested again on an i5-3570K.
Code:
libaom-av1 - max 14 fps
libdav1d - max 7.2 fps
libdav1d -threads 4 -tilethreads 4 - max 9.7 fps
libdav1d -threads 8 -tilethreads 1 - max 10 fps
Quote:
I'm waiting for dav1d to be faster on my processor. I want to see truthful information, not PR.
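For readers who want the numbers above as ratios, here is a small Python sketch that expresses each result as a fraction of the libaom figure. The fps values are copied from the post; the helper name `relative_speed` is made up for this example, and the result only describes this one machine and clip.

```python
# Max decode speeds (fps) on the i5-3570K, copied from the post above.
results = {
    "libaom-av1": 14.0,
    "libdav1d": 7.2,
    "libdav1d -threads 4 -tilethreads 4": 9.7,
    "libdav1d -threads 8 -tilethreads 1": 10.0,
}


def relative_speed(results, baseline):
    """Return each configuration's fps divided by the baseline's fps."""
    base = results[baseline]
    return {name: fps / base for name, fps in results.items()}


ratios = relative_speed(results, "libaom-av1")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x of libaom-av1")
```

On these figures the best dav1d configuration reaches roughly 0.71x of libaom's speed. The i5-3570K has no AVX2, so this fits the earlier point that CPUs without AVX2 currently fall back to the slower code paths.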
__________________
MPC-BE 1.7.0 and Nightly builds | VideoRenderer | ImageSource | ScriptSource | BassAudioSource |
|
12th December 2018, 16:53 | #1292 | Link | ||
I am maddo saientisto!
Join Date: Aug 2018
Posts: 95
|
Quote:
The blogpost links twice to this previous one for detailed perf reports: http://www.jbkempf.com/blog/post/201...-first-release
Quote:
Then the same post you claim to have read very carefully states that work on SSSE3 has only just begun. Since the Pentium G5600 only supports extensions up to SSE4.2 it's clear you'll have to wait some more. Spare the rage and read some more.
Last edited by SmilingWolf; 12th December 2018 at 17:00. |
||
12th December 2018, 17:18 | #1293 | Link | |
Registered User
Join Date: Dec 2008
Posts: 1,968
|
Quote:
This information I could find only in the discussion of beta testing.
|
12th December 2018, 17:29 | #1294 | Link | |
I am maddo saientisto!
Join Date: Aug 2018
Posts: 95
|
Quote:
You have the wunderbar vector extensions: you have the speedup these provide. You can't use the vector extensions: you're going to run on C code, which is gonna be slower. Which is the reason these multimedia extensions exist in the first place. Doesn't really take a degree to understand. I got it, everyone around here got it, it seems you're the only one left out. Wonder where the problem lies? |
|
12th December 2018, 18:06 | #1295 | Link |
Registered User
Join Date: Jan 2002
Posts: 332
|
And if you want the latest info on SIMD progress, you should look at:
AVX2 https://code.videolan.org/videolan/dav1d/issues/78
SSSE3 https://code.videolan.org/videolan/dav1d/issues/216
ARM / NEON https://code.videolan.org/videolan/dav1d/issues/215
As you can see, AVX2 is pretty much done, but only a little has been done for the others. And that's only for 8-bit, if I'm correct. |
12th December 2018, 19:40 | #1296 | Link | |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
|
Quote:
As someone with both a Phenom II x4 and a Core 2 Quad (actually a Phenom II x2 unlocked to x4 and a quad Wolfdale Xeon), I find that the latter has pretty sub-par multicore scaling in video workloads. Yes, it's faster than a Core 2 Duo, but not quite at the level that you'd expect, as I showed in my post two pages back (if Wolfdale had the same scaling from 2c/2t to 4c/4t as Nehalem, then 4c/4t Wolfdale would've only needed ~2.4GHz, not 2.7GHz). This then commonly results in the Phenom actually performing similarly to, if not better than, the Core 2 Quad on a per-GHz basis, assuming the tested code isn't heavily relying on SSSE3 or SSE4.1 (as is obviously the case currently with AV1 decoding), and the Phenom not only tended to have higher stock clocks but even came in 6-core variants as well. Similarly, I've also previously documented that the Phenom II is faster than the Core 2 Quad clock-for-clock in SVP video interpolation (which is a task that loves "moar cores!" and SMT threads).
__________________
____HTPC____ | __Desktop PC__
2.93GHz Xeon x3470 (4c/8t Nehalem) | 4.5GHz 1.24v dual-core Haswell G3258
Radeon HD5870 | Intel iGPU
2x2GB+2x1GB DDR3-1333 | 4x4GB DDR3-1600 |
|
12th December 2018, 20:06 | #1297 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
|
Quote:
|
|
12th December 2018, 21:22 | #1298 | Link | |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
|
Quote:
(and again, going forward the Athlon 200GE is a wiser choice of CPU, but that's only been on the market for a couple months now)
|
13th December 2018, 12:08 | #1299 | Link |
Registered User
Join Date: Oct 2009
Posts: 930
|
Hi!
On the decoder side, are dav1d and libaom the only two options? I see Firefox has a dav1d option, which doesn't work too well, because it freezes on the bitmovin demo. (I guess the other is libaom.) The default decoder plays the video completely smoothly now on my computer. PS: By the way, can I download these streams? The player is pretty trashy; the quality always resets and doesn't want to change unless I seek. |
|
|