Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
3rd October 2018, 23:03 | #1061 | Link |
Registered User
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 109
|
Depends on the implementation. FFmpeg's decoders for H264/HEVC typically use only frame threads in the default configuration; the exact number depends on the number of cores on your system. OpenHEVC allows you to combine frame and wavefront threading (similar to x265). Frame+Tile - like Frame+WPP - allows better scaling at ultra-high resolutions or on very low-end systems, but is not typically necessary for real-time playback on normal systems.
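For anyone wondering why wavefront threading only pays off at high resolutions, here is a rough toy model, not taken from any decoder's source. It assumes every block costs one time step and that each block row may start once the row above is two blocks ahead (the usual HEVC WPP arrangement, as far as I understand it). The names `wavefront_schedule` and `summarize` are made up for illustration.

```python
from collections import Counter


def wavefront_schedule(rows, cols, lag=2):
    """Map (row, col) -> time step for an idealized wavefront.

    Assumption: one block per step per row, and row r may only
    start once row r-1 is `lag` blocks ahead of it.
    """
    return {(r, c): c + lag * r for r in range(rows) for c in range(cols)}


def summarize(rows, cols, lag=2):
    """Return (total time steps, peak number of rows decoded at once)."""
    schedule = wavefront_schedule(rows, cols, lag)
    total_steps = max(schedule.values()) + 1
    busy_per_step = Counter(schedule.values())
    return total_steps, max(busy_per_step.values())


if __name__ == "__main__":
    # 1920x1080 with 64x64 blocks: 30 columns, 17 rows (last row partial).
    steps, peak = summarize(rows=17, cols=30)
    print(f"serial: {17 * 30} steps, wavefront: {steps} steps, peak rows: {peak}")
```

In this model the peak parallelism is capped by roughly half the number of block columns, which is why small frames give the wavefront little room and frame or tile threads are combined with it.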
|
4th October 2018, 03:25 | #1062 | Link | |
Registered User
Join Date: Jun 2015
Posts: 21
|
Quote:
CDF tables can be updated (or not updated) at the end of decoding for a frame, depending on a frame header syntax element in the bitstream. If the CDF tables need to be updated at the end of the decoding for a frame, then frame threading should be impossible. |
|
4th October 2018, 05:08 | #1063 | Link | |
Registered User
Join Date: Oct 2017
Posts: 56
|
Quote:
In WebM, the software license and patent license are clearly separated https://www.webmproject.org/license/ "The WebM codec source code and specification are licensed differently." In AOM, the software license and patent license are clearly combined https://aomedia.org/license/ "Software released by the Alliance for Open Media is made available under a combination of the following licenses" |
|
4th October 2018, 07:21 | #1065 | Link | ||
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Quote:
Quote:
Then it must be a great coincidence that all three AV1 clips I tried to decode had 75% CPU utilization on a four-threaded CPU. Is that some kind of default encoder setting that produces such clips? Can you post a sample with which the libaom decoder can use 4 or more threads during decoding? Thank you.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
||
4th October 2018, 11:24 | #1066 | Link |
Registered User
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 109
|
The VDD presentation explains how we still accomplish frame threading in this scenario. Efficient frame threading is possible, even with CDF table dependencies. We did this in ffvp9 as well; there is nothing new about this approach.
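To make the idea concrete without claiming this is what the presentation describes, here is a toy timing model. It assumes that the adapted CDF tables are final once a frame's symbol parsing is done, so the next frame's parsing has to wait for that, while reconstruction of the earlier frame keeps running on another thread. It ignores reference-frame dependencies and assumes unlimited threads; the function names and the millisecond figures are invented for illustration.

```python
def serial_decode_time(n_frames, entropy_ms, recon_ms):
    """No frame threading: each frame is fully decoded before the next starts."""
    return n_frames * (entropy_ms + recon_ms)


def pipelined_decode_time(n_frames, entropy_ms, recon_ms):
    """Frame threading where only the entropy stages are chained.

    Assumption: frame k+1 may begin parsing as soon as frame k has
    finished parsing (its CDF tables are then available), and
    reconstruction of different frames can overlap freely.
    """
    entropy_done = 0.0
    last_finish = 0.0
    for _ in range(n_frames):
        entropy_done += entropy_ms  # serialized by the CDF dependency
        last_finish = max(last_finish, entropy_done + recon_ms)
    return last_finish


if __name__ == "__main__":
    print(serial_decode_time(4, 2.0, 8.0))     # one thread
    print(pipelined_decode_time(4, 2.0, 8.0))  # overlapping reconstruction
```

The point of the sketch is only that a per-frame CDF update serializes one stage of decoding, not the whole frame, so frames can still overlap.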
|
4th October 2018, 12:30 | #1067 | Link | |
Registered User
Join Date: Jun 2015
Posts: 21
|
Quote:
Would you mind pointing me to the presentation, if a link is available? |
|
4th October 2018, 12:53 | #1068 | Link | |
Registered User
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 109
|
Quote:
|
|
4th October 2018, 19:20 | #1072 | Link | |
Registered User
Join Date: Oct 2009
Posts: 930
|
Quote:
|
|
4th October 2018, 23:14 | #1074 | Link |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
|
SVP.
Note that there are several variants of it:
One thing to keep in mind is that interpolating to refresh rates that are exact multiples of the source framerate will provide a smoother result with fewer artifacts - e.g. 30fps interpolated to 120Hz (4x) is better than 30fps interpolated to 144Hz (4.8x). This is most easily accomplished with something like MPC-HC's or madVR's built-in automatic resolution changer, which can be used to change your refresh rate depending on a given video frame rate (though you may need to create a custom resolution in order to access certain refresh rates on your display). |
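The exact-multiple rule above is easy to check by hand, but as a quick sketch it could look like this. The helper names and the list of available refresh rates are hypothetical; this is not how MPC-HC or madVR pick a display mode, just the arithmetic. Pass frame rates as integers, `Fraction` objects, or strings such as "24000/1001" to avoid float rounding.

```python
from fractions import Fraction


def interpolation_factor(source_fps, refresh_hz):
    """Exact ratio between the display refresh rate and the source frame rate."""
    return Fraction(refresh_hz) / Fraction(source_fps)


def pick_refresh_rate(source_fps, available_hz):
    """Return the highest available refresh rate that is an exact
    integer multiple of the source frame rate, or None if there is none."""
    exact = [hz for hz in available_hz
             if interpolation_factor(source_fps, hz).denominator == 1]
    return max(exact) if exact else None


if __name__ == "__main__":
    print(interpolation_factor(30, 120))               # 4
    print(interpolation_factor(30, 144))               # 24/5, i.e. 4.8x
    print(pick_refresh_rate(30, [60, 100, 120, 144]))  # 120
```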
5th October 2018, 08:26 | #1076 | Link | |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
|
Quote:
If you don't like high framerate video, then no amount of motion interpolation is going to look good - that's the goal of motion interpolation after all. If you simply don't like interpolation artifacts, then certainly a piece of software designed to run at full speed with cranked settings on even a 10-year-old quad core CPU isn't going to deliver a cleaner image than dedicated interpolation hardware in modern TVs.

It is worth mentioning however that the Linux version, Pro versions, and the old v3.1.7 do have "2m (min artifacts)" and "1.5m (less artifacts)" settings as well as a "32 px. Large 0" setting that are particularly ideal for people that dislike high motion interpolation but still want improved motion resolution.

Additionally, it can make a big difference how your source content was originally recorded - my main use is for motorsports that are actually natively recorded and broadcast at 50fps but are then downsampled to 25fps for internet streaming (I'm looking at you Formula E). This sort of 50fps --to-> 25fps content retains the faster camera shutter speed of the original 50fps recording, which means that applying motion interpolation will make it look much more akin to native HFR content than what you'd get if you interpolated 24fps movie content recorded with a slow camera shutter speed (as is typically used in cinema). And in my opinion, applying motion interpolation with cranked settings for native 50fps motorsports on a 100Hz CRT is glorious - the sense of speed you get from the cars that way is just unmatched!

________EDIT________

But you didn't say what aspect it is that you don't like, so I felt it necessary to try to answer every angle I knew of. Now I know a lot of people don't like the artifacts, but that would practically require something like a Threadripper 2990WX + 64GB RAM + a GPU with Nvidia's A.I. Tensor cores in order to have truly artifactless interpolation that looks like real native HFR.

Keep in mind however that the higher the native frame rate of the video, the fewer artifacts there are: 50fps --to-> 100Hz has quite a bit fewer artifacts than 25fps --to-> 100Hz, and since 50fps is natively HFR anyway the overall "feeling" isn't exactly Earth-shatteringly different either when using interpolation. Also, interpolation in general can be a godsend for low framerate content like 15fps (which is what my father's smartphone camera uses for videos recorded in low-light situations) that can otherwise look really choppy (such as a recently recorded fireworks video he took).

Hey now, you were the one that decided to not ignore the subject and "let my post be". It was even at the end of a page, which would have allowed it to have easily been ignored.

Last edited by Nintendo Maniac 64; 7th October 2018 at 02:20. |
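One rough way to see why 50fps --to-> 100Hz tends to show fewer artifacts than 25fps --to-> 100Hz is to count how many displayed frames are synthesized. This sketch assumes the interpolator passes the original frames through untouched and only generates the in-between ones, which may not hold for every mode or product; `synthesized_share` is a made-up helper name.

```python
from fractions import Fraction


def synthesized_share(source_fps, output_hz):
    """Share of displayed frames that are interpolated rather than original.

    Assumption: integer interpolation factor, original frames shown as-is.
    """
    factor = Fraction(output_hz) / Fraction(source_fps)
    if factor.denominator != 1:
        raise ValueError("only exact integer multiples are modelled here")
    return 1 - Fraction(1) / factor


def real_frame_gap_ms(source_fps):
    """Time between two real frames, i.e. the span the motion search has to bridge."""
    return 1000 / source_fps


if __name__ == "__main__":
    for fps in (25, 50):
        print(fps, synthesized_share(fps, 100), real_frame_gap_ms(fps))
```

With these assumptions, three out of every four displayed frames are synthesized at 25fps, against one out of two at 50fps, and the gap each interpolated frame has to bridge is twice as long.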
|
7th October 2018, 03:02 | #1078 | Link |
Registered User
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 447
|
It seems that LAVfilters v0.73 / MPC-HC v1.8.3 improves AV1 performance a tad (~20% faster), but its multi-threaded utilization is still poor (only ~26% utilization with 4c/8t Nehalem).
The biggest benefit I found is that using only two cores no longer completely tanks performance. Utilizing a Nehalem Xeon X3470, I had the following performance when trying to decode the 1080p AV1 video-only stream from Gus Kenworthy & Tom Wallisch X Games Slopestyle GoPro Preview in MPC-HC v1.8.3 64bit:
Anything above 3c/3t still sees no benefit, and SMT still sees no utilization (though Nehalem's implementation of SMT is certainly going to be weaker than more modern implementations e.g. Ryzen). |
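As a sanity check on the utilization figures quoted in this thread, overall CPU utilization can be converted into a rough count of busy logical threads. This is only an average (Task Manager style numbers say nothing about which threads are busy), and `busy_threads` is a hypothetical helper.

```python
def busy_threads(utilization_pct, logical_threads):
    """Average number of fully busy logical threads implied by an
    overall CPU utilization percentage."""
    return utilization_pct / 100.0 * logical_threads


if __name__ == "__main__":
    print(busy_threads(26, 8))  # ~26% on a 4c/8t Nehalem
    print(busy_threads(75, 4))  # 75% on a four-threaded CPU
```

Roughly two to three busy threads in both cases, which lines up with seeing no benefit beyond 3c/3t.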
7th October 2018, 20:36 | #1079 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
|
Quote:
That kind of pixel fill rate is normally the domain of hardware decoders, or at least hybrid CPU-GPU implementations (like the Xbox 360 H.264 decoder). For commercial content, the DRM hardware requirements to play UHD on a PC have always come on systems that have a 2160p60 HW decoder anyway. Since there is already a large installed base that has the DRM support but not HW AV1, I would expect HEVC to remain the dominant codec for delivering premium UHD content for years to come. But I believe Profile @ Level for AV1 requires some degree of tiling for UHD resolutions. Even if not required, I imagine some de facto guidelines about tiling to improve decode perf would become standard. Parallelizing decode of Golden/IDR, I, P, B, and b frames can also be a useful technique if plenty of memory is available. You'd just do a lookahead to decode and buffer the referenced frames in tier order. That was nigh impossible with VP9, but I think it is feasible for AV1. |
|
7th October 2018, 22:08 | #1080 | Link |
German doom9/Gleitz SuMo
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,782
|
New uploads: (MSYS2; MinGW32: GCC 7.3.0 / MinGW64: GCC 8.2.0)
AOM v1.0.0-735-g9b21428c8
rav1e 0.1.0 (4d185f7 / 2018-10-04)
dav1d 0.0.1 (c6788ed / 2018-10-04) |
|
|