23rd January 2020, 14:55 | #1145 | Link |
Registered User
Join Date: Mar 2011
Posts: 4,823
|
The clip seems to be a mixture of soft and hard telecine and interlaced/progressive. I tried 4 different decoding/filtering methods, all using ffms2 as I wanted to try a sample using the timecodes file it (optionally) creates when indexing.
RFFMode=0 is the same as Repeat=false for lsmash.

1 VFR ffms2 timecodes - 1762 frames.mkv
The hard telecine parts are field matched and left at 29.97fps. Average frame rate 27.583fps.

    FFVideoSource("E:\lainvob.mkv", Threads=1, TimeCodes="FFMSTimecodes.txt", RFFMode=0)
    TFM()

2 VFR TIVTC 2 pass - 1662 frames.mkv
Same as #1 except the hard telecine parts are decimated to 23.976fps. Average frame rate 26.028fps.

    FFVideoSource("E:\lainvob.mkv", Threads=1, RFFMode=1)
    TFM(Input="D:\TFM.txt")
    TDecimate(Mode=5, Hybrid=2, Input="D:\TDec.txt", tfmin="D:\TFM.txt", mkvout="D:\Times.txt")

3 CFR conversion ffms2 - 1916 frames.mkv
The hard telecine parts are field matched and left at 29.97fps. The soft telecine parts would have every 4th frame duplicated.

    FFVideoSource("E:\lainvob.mkv", Threads=1, RFFMode=0, FPSNum=30000, FPSDen=1001)
    TFM()

4 CFR conversion TIVTC frame blending - 1916 frames.mkv
The hard telecine parts are field matched and decimated, and the soft/hard telecined parts are frame blended to 29.97fps.

    FFVideoSource("E:\lainvob.mkv", Threads=1, RFFMode=1)
    TFM()
    TDecimate(Mode=1, Hybrid=3)

All cropped and resized to 640x480.
encodes.zip (72.6 MB)

And one more at the average frame rate of 27.568fps (1762 frames). The A/V sync is definitely different. At roughly the 41 second point she should be mouthing the word "much" in sync with the audio, as happens with all the previous samples.

    FFVideoSource("E:\lainvob.mkv", Threads=1, RFFMode=0)
    TFM()

Average frame rate.mkv

manolito, for a method that always works with encoder GUIs, wouldn't it be better to always honour repeat flags, as DGIndex does by default? I don't understand the logic behind not doing so unless you want a VFR output, or there's not much soft telecined content and the repeated frames wouldn't be too noticeable. Admittedly PAL is generally much simpler to deal with, but for NTSC you can always tell if there's soft pulldown by disabling repeat flags and using Info() to check the frame rate.
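As a rough sanity check on the frame counts and rates quoted for the four samples, each encode should cover about the same running time. The sketch below only divides the quoted frame counts by the quoted rates; the 29.97fps figure for samples 3 and 4 is taken as 30000/1001, and the dictionary keys are just labels.

```python
# Frame counts and frame rates as quoted in the post above.
samples = {
    "1 VFR ffms2 timecodes": (1762, 27.583),
    "2 VFR TIVTC 2 pass": (1662, 26.028),
    "3 CFR conversion ffms2": (1916, 30000 / 1001),
    "4 CFR TIVTC frame blending": (1916, 30000 / 1001),
}


def duration_seconds(frames, fps):
    """Running time implied by a frame count and an (average) frame rate."""
    return frames / fps


durations = {name: duration_seconds(f, r) for name, (f, r) in samples.items()}

for name, secs in durations.items():
    print(f"{name}: {secs:.2f} s")

# How far apart the implied running times are, in seconds.
spread = max(durations.values()) - min(durations.values())
print(f"spread: {spread:.3f} s")
```

All four land within a fraction of a second of each other, which is what you'd expect if the different frame counts are only different ways of covering the same running time.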
Last edited by hello_hello; 23rd January 2020 at 16:10. |
23rd January 2020, 14:58 | #1146 | Link |
Registered User
Join Date: Mar 2011
Posts: 4,823
|
repeat
Reconstruct frames by the flags specified in video stream and then treat all frames as interlaced if set to true and usable.

I'm still not clear on whether the repeat flag affects the chroma upsampling (interlaced vs progressive), or am I looking at it all wrong? |
23rd January 2020, 17:31 | #1147 | Link |
Registered User
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,078
|
First of all I really do not want to do a thorough analysis of such clips using DGIndex. And I absolutely want to avoid VFR, all my conversions need to come out as CFR.
The 2 GUIs I use are AVStoDVD for DVD output and StaxRip 32-bit for other output formats (I also use DVDStyler and dmMediaConverter, but they do not count here because they do not use AviSynth). Both GUIs employ MediaInfo to detect the source properties. The clip which was uploaded by our friend is detected as 29.97 fps Interlaced TFF. No soft pulldown is detected. MediaInfo only reports 2:3 pulldown if the clip has the standard regular pulldown pattern for 23.976 to 29.97 telecining.

Both GUIs ALWAYS do VFR to CFR conversion right at the source filter level. If the source already is CFR then this operation is useless, but it won't do any harm either. The uploaded clip will be treated like this:

DGIndex / DGDecode:
The default "Honor Pulldown Flags" is used, followed by a ChangeFPS(29.97). Then users can either leave it interlaced (the default) or deinterlace the clip. IVTC is not offered because MediaInfo did not report pulldown.

DSS2Mod (the same for DirectShowSource):
DSS2Mod does not honor pulldown flags. The uploaded clip will have the "fps=29.97" parameter in the DSS2Mod call, so there will be lots of dupes. Then again you can leave it interlaced or use a deinterlacer.

ffms2 and previous LSMASH builds:
Pretty much the same as DSS2Mod.

This is not optimal, but it is very watchable, and I never saw any A/V sync problems.

If the source has regular 2:3 pulldown then it gets a little different. MediaInfo will report 23.976 progressive and 2:3 pulldown. When using DGIndex with the default Honor Pulldown Flags, AVStoDVD will force an IVTC right after the source filter. When DSS2Mod (or ffms2 or previous LSMASH) is used, the source filter will force VFR to CFR conversion to 23.976 fps. There will be dropped or duplicated frames, but again no A/V sync problems, and the output will be pure progressive.
Last edited by manolito; 23rd January 2020 at 17:34. |
23rd January 2020, 17:56 | #1148 | Link | |
Registered User
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,078
|
Quote:
I found this out the hard way because it is not documented anywhere. If your source is a regular soft telecined NTSC Film clip and you open it with DSS2Mod (or DirectShowSource) without the "fps=" parameter then the output will be 29.97 fps. For some time I thought that DSS2Mod did indeed honor the pulldown flags, but I was quite wrong. The output had no pulldown pattern, it was purely progressive with lots of dupes. Which means that DSS2Mod decoded the clip to 23.976 first and then added dupes to bring the rate up to 29.97. I still think that this is a stupid behavior, so for sources with 2:3 pulldown I always specify "fps=23.976" in the DSS2Mod parameter list. Last edited by manolito; 23rd January 2020 at 18:08. |
|
23rd January 2020, 23:19 | #1149 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,823
|
Quote:
https://forum.doom9.org/showpost.php...postcount=4395

Anyway... there's definitely soft pulldown in that clip, unless I'm interpreting it wrong. Just not at the beginning, which is probably where MediaInfo looks. If you remux the sample as an MKV and extract the timecodes, you'll see they're 33ms apart at the beginning, which means either progressive, interlaced, or hard pulldown, all at 29.97fps. If you scroll down to frame number 652, the timecodes look like this:

22039
22089
22122
22172
22206
22256
22289
22339
22372
22422
22456
22506

The only explanation I can think of for the timecodes being 50ms apart is that the repeat flags aren't included in the timecodes.

1000 / (30000 / 1001) = 33.37 ms per frame.
1.5 frames = 50.05 ms

But does that mean if you don't honour pulldown flags, instead of telecined frames at 33.37 ms per frame, or progressive frames at 41.7ms per frame (23.976fps) after decimation, you're getting frames alternating between 50ms and 33ms for an average of roughly 41.7ms? I hadn't thought about it like that until now.

Mind you, if you do a VFR to CFR conversion with DSS or Repeat=true etc, it'd probably even them out to 33.37ms durations, with every fourth frame repeated, or to an even 41.7ms per frame if you convert to 23.976fps.
Last edited by hello_hello; 24th January 2020 at 10:00. |
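The timecode arithmetic in the post above can be checked in a few lines. This is only a sketch using the twelve timecodes quoted there; the variable names are made up for the example, and the field period is taken as 1001/60000 seconds (one NTSC field).

```python
# Timecodes (ms) quoted above, starting around frame 652 of the remuxed sample.
timecodes_ms = [22039, 22089, 22122, 22172, 22206, 22256,
                22289, 22339, 22372, 22422, 22456, 22506]

# Gap between each pair of consecutive timecodes.
deltas = [b - a for a, b in zip(timecodes_ms, timecodes_ms[1:])]
print(deltas)

FIELD_MS = 1000 * 1001 / 60000      # one NTSC field period, about 16.68 ms
three_fields_ms = 3 * FIELD_MS      # frame shown for 3 fields, about 50.05 ms
two_fields_ms = 2 * FIELD_MS        # frame shown for 2 fields, about 33.37 ms

# Average over five complete long/short pairs (ten intervals).
mean_pair_ms = (timecodes_ms[10] - timecodes_ms[0]) / 10
film_frame_ms = 1000 / (24000 / 1001)   # frame duration at 23.976fps

print(f"3 fields: {three_fields_ms:.2f} ms, 2 fields: {two_fields_ms:.2f} ms")
print(f"mean over 10 intervals: {mean_pair_ms:.2f} ms")
print(f"23.976fps frame duration: {film_frame_ms:.2f} ms")
```

The gaps alternate between roughly three and two field periods, and the average over whole pairs comes out at about the 23.976fps frame duration, which fits the soft pulldown explanation.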
|
24th January 2020, 11:21 | #1151 | Link | |
Registered User
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,078
|
Quote:
In most of my scripts with ffms2 and lsmash I do not even use the fpsnum and fpsden parameters, I use ChangeFPS(..) right after the source filter call. This method solves 2 problems in 1 step: First of all it does the VFR to CFR conversion (if applicable), and secondly the ChangeFPS command ensures linear access to the source filter. This is necessary for AVS+ in MT mode, because without linear access the speed of the source filter will drop to a snail's pace. |
|
25th January 2020, 07:31 | #1152 | Link | ||
Registered User
Join Date: Mar 2011
Posts: 4,823
|
Quote:
Quote:
I don't understand how ChangeFPS can convert VFR sources to a constant frame rate. Say you have a source that's a mixture of film (soft telecine) and video and it's 50-50 by running time, so the average frame rate is roughly 26.97fps. Without frame rate conversion or honouring repeat flags, that's how ffms2 and lsmash decode it. The film sections would play too quickly and the video sections too slowly, and the A/V sync would be lost.

If you use ChangeFPS to change the frame rate to 29.97fps (for example), it'll add frames to both the film and video sections to bring the frame rate up to 29.97fps, but they'll both still play at the wrong speed. The film sections would have fewer frames added than if they were being decoded at 23.976fps, and the video sections would have frames added that wouldn't be if they were decoded at 29.97fps. An ffms2 or lsmash frame rate conversion at least ensures the extra frames are added where they're supposed to be, because they're aware of how the source frame rate varies. ChangeFPS isn't.

I kind of get how ChangeFPS could ensure more linear frame requests from the source filter, but have you tried RequestLinear() from the TIVTC package? I think by default it requests frames 50 at a time from the upstream filter and caches 10 frames itself, although both can be adjusted.
Last edited by hello_hello; 25th January 2020 at 07:42. |
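To put rough numbers on the 50-50 example above, here is a simplified model, not a description of how any particular source filter is implemented. It assumes a hypothetical clip of 30 seconds of 29.97fps video followed by 30 seconds of 23.976fps film, and compares presenting every frame at the average rate (even duplication afterwards shifts nothing by more than one output frame, so it is ignored) with a conversion that picks frames by their real timestamps. All function names are invented for the sketch.

```python
import bisect

VIDEO_FPS = 30000 / 1001   # 29.97fps sections
FILM_FPS = 24000 / 1001    # 23.976fps sections (soft telecine, flags ignored)


def build_clip(video_secs, film_secs):
    """Return (true timestamps in seconds, total duration): video, then film."""
    n_video = round(video_secs * VIDEO_FPS)
    n_film = round(film_secs * FILM_FPS)
    film_start = n_video / VIDEO_FPS
    times = [i / VIDEO_FPS for i in range(n_video)]
    times += [film_start + i / FILM_FPS for i in range(n_film)]
    return times, film_start + n_film / FILM_FPS


def blind_sync_errors(times, duration):
    """Sync error per source frame when every frame gets the average duration."""
    avg_fps = len(times) / duration
    return [k / avg_fps - t for k, t in enumerate(times)]


def timecode_aware_errors(times, duration, target_fps):
    """For each output frame, show the latest source frame due at that time."""
    errors = []
    for n in range(int(duration * target_fps)):
        t = n / target_fps
        src = bisect.bisect_right(times, t) - 1
        errors.append(t - times[src])
    return errors


times, duration = build_clip(30, 30)
avg_fps = len(times) / duration
worst_blind = max(abs(e) for e in blind_sync_errors(times, duration))
worst_aware = max(timecode_aware_errors(times, duration, VIDEO_FPS))

print(f"average rate: {avg_fps:.3f}fps")
print(f"worst error, average-rate decode: {worst_blind:.2f} s")
print(f"worst error, timecode-aware conversion: {worst_aware * 1000:.1f} ms")
```

In this model the average-rate decode drifts by a few seconds at the point where the video section ends, while the timestamp-based conversion never strays by more than one source frame. A real clip with shorter alternating sections would drift less, so treat the figures as an illustration of the mechanism only.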
||
25th January 2020, 15:49 | #1153 | Link | ||
Excessively jovial fellow
Join Date: Jun 2004
Location: rude
Posts: 1,100
|
Quote:
Quote:
As far as the discussion regarding honoring repeat field flags/soft pulldown or not, I basically think both sides are wrong. Honoring RFF's only really makes sense if you intend for the output to be interlaced, or if you're intending to deinterlace the entire stream because it's mostly interlaced content anyway and you can't be bothered to treat it as hybrid. In any other situation you're just doing a lot of pointless work by adding fields you're going to remove later anyway. Additionally, RFF's are a way of saying "look, this is actually progressive, but we have to pretend it's interlaced for compatibility reasons", so there's no reason to honor them if you don't actually want interlacing in the first place. However, in the context of Avisynth specifically, not honoring RFF's is kind of a pain in the ass. Avisynth doesn't have any per-frame metadata, not even timecodes, so there's no builtin way of signalling which frames are interlaced and which aren't, nor any way of dealing with VFR, so dealing with the hybrid content that disregarding RFF's is likely to result in requires significantly more manual effort than it does in e.g. Vapoursynth. So in that context honoring RFF's does make life easier. |
||
25th January 2020, 20:06 | #1154 | Link |
Registered User
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,078
|
So far everybody has been telling me that my way of dealing with NTSC clips with irregular pulldown or hybrid content will inevitably lead to A/V sync problems.
Then I asked (repeatedly) for a link to such a clip which would prove it. All I got was a link to a clip which is absolutely useless to judge A/V sync. Please let me ask for it again: I need a clip taken from a DVD which will show that my way of dealing with such clips is flawed. And which will hopefully allow me to find a way to deal with it without using DGIndex. |
25th January 2020, 22:32 | #1155 | Link |
Excessively jovial fellow
Join Date: Jun 2004
Location: rude
Posts: 1,100
|
I haven't had a DVD drive in my computer for years now and I've deleted most of the old stuff that could be interesting, so I can't help you there. The clip would have to be fairly long too to make audio desync really noticeable, a few minutes at least.
|
26th January 2020, 01:01 | #1156 | Link | |
Useful n00b
Join Date: Jul 2014
Posts: 1,667
|
Suppose a 29.97 stream's first half is video with no pulldown and the second half is pure soft 3:2 pulldown. If I understand correctly, the suggestion is to ignore pulldown and then apply ChangeFPS. But what does this really mean? You're going to have to specify some decimated frame rate, but you can't just assume 23.976 because that would be correct only for the second half. And we don't want any ChangeFPS action on the first half anyway. For totally irregular pulldown it could be any old crazy rate you would have to choose correctly if ChangeFPS is going to be any good.
I've also seen progressive transport streams with bursts of soft frame repeats in an irregular manner (presumably for compression of black or static video). How will your way deal with those? You have to honor the frame repeats. Quote:
Sure, implementing pulldown correctly and doing random access for it in a source filter is challenging and difficult, but that's not a good reason to do the wrong thing. Last edited by videoh; 26th January 2020 at 01:40. |
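The half video, half soft pulldown case described above can be sketched numerically. This assumes MPEG-2 style flags where a frame with repeat_first_field set is displayed for three field periods instead of two; the frame counts are made up for the illustration and the function names are hypothetical.

```python
FIELD_RATE = 60000 / 1001   # NTSC fields per second


def duration_honouring_flags(rff_flags):
    """Running time when each frame lasts 3 fields if flagged, else 2."""
    fields = sum(3 if rff else 2 for rff in rff_flags)
    return fields / FIELD_RATE


def duration_ignoring_flags(rff_flags, fps=30000 / 1001):
    """Running time when every coded frame is simply played at one fixed rate."""
    return len(rff_flags) / fps


# Hypothetical stream: 300 video frames with no repeats, then 240 film
# frames with the repeat flag set on every other frame (soft 3:2 pulldown).
video_part = [False] * 300
film_part = [i % 2 == 0 for i in range(240)]
stream = video_part + film_part

honoured = duration_honouring_flags(stream)
ignored = duration_ignoring_flags(stream)
film_honoured = duration_honouring_flags(film_part)
film_ignored = duration_ignoring_flags(film_part)

print(f"whole stream: {honoured:.3f} s honoured vs {ignored:.3f} s ignored")
print(f"film half:    {film_honoured:.3f} s honoured vs {film_ignored:.3f} s ignored")
```

With the flags ignored and a single 29.97fps rate assumed, the film half runs about two seconds short while the video half is unaffected, so no single output rate fixes both halves without knowing where the repeats are.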
|
26th January 2020, 01:48 | #1157 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,823
|
Quote:
I don't know whether you've looked at the sample encodes I uploaded here, but the A/V sync is okay for all the samples in the zip file, but not the separate average frame rate encode. Another way to check the A/V sync is to open the original sample and an encoded version in different instances of MPC-HC and play them together. I have MPC-HC configured so I can pause and resume playback quickly by clicking on the video (I disable double clicking switching to full screen and use a middle click for that instead, because a double click for fullscreen is mental anyway). When they're running, it's then easy to quickly pause and resume playback for one instance of MPC-HC until the audio of both are in sync. Generally after a few fast double clicks on each instance of the player as required, you can sync the audio well enough to hear it start to phase cancel, rather than hearing a delay between them. With the audio in sync, the video should be too. I sometimes use that method to check the A/V sync for CFR encodes (compared to the source to make sure any audio delay is correct when I suspect it's not). If the audio is phasing and the scene changes happen at exactly the same time in both players, you know the A/V sync is the same. PS. If you try the MPC-HC method it's best to set "left down" for the play/pause option. If I remember correctly, "left up" lets you drag the player around by clicking on the video and holding the left mouse button down, and it can interfere with the ability to quickly pause and resume playback by clicking on the video. For "left down", the player doesn't have to try to guess what you're wanting to do when you click on the video. Last edited by hello_hello; 26th January 2020 at 03:49. |
|
26th January 2020, 02:59 | #1159 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,823
|
Quote:
If you open that sample with gMKVExtractGUI you'll see what's happening. If gMKVExtractGUI extracts the audio, it'll write a -631 delay to the extracted stream under the assumption the 631ms video delay will be lost during re-encoding. Most extraction programs would just write a zero delay. If you were to remux it with the original video while applying the -631ms delay, which would be a natural assumption, the A/V sync will change. Writing the audio delay relative to the video when extracting the audio was actually my idea, after realising a video delay isn't all that uncommon. The author of MKVCleaver kindly took it one step further so MKVCleaver can be configured to write both audio delays to the audio stream. The delay relative to the video and also the container audio delay. MediaInfo reporting a negative audio delay is a strong indication there's a positive video delay in an MKV, because for MKV there's no such thing as negative delays, but it's possible for the audio delay to be positive even when there's a video delay, if the audio delay is larger. Anyway, as gMKVExtractGUI shows you both the stream delays and the video/audio delays relative to each other, it's easy to work out what's going on. I've no idea how the different source filters handle that sort of thing when decoding the audio in the original container. They should start decoding where the video starts, but remuxing the sample as an MKV while applying a -631ms delay to both the audio and video will remove that variable. I use the original audio most of the time rather than re-encode it, or if I do encode it I extract it first anyway, so it's more important to know the relative delay than the container delay. The first delay for each stream is the container delay. The second delay for the audio stream shows their relative delays. The second delay for the video stream is the relative delay minus any audio container delay, I think. That one confuses me. 
Remuxed
Remuxed while applying a -131ms delay to both streams
Remuxed while applying a -631ms delay to both streams

Here's what the help files say...

FFMS2:
int adjustdelay = -1
Controls how audio delay is handled, i.e. what happens if the first audio sample in the file doesn't have a timestamp of zero. The following arguments are valid:
- **-3**: No adjustment is made; the first decodable audio sample becomes the first sample in the output.
- **-2**: Samples are created (with silence) or discarded so that sample 0 in the decoded audio starts at time zero.
- **-1**: Samples are created (with silence) or discarded so that sample 0 in the decoded audio starts at the same time as frame 0 of the first video track. This is the default, and probably what most people want.
- **Any integer >= 0**: Same as -1, but adjust relative to the video track with the given track number instead. If the provided track number isn't a video track, an error is raised.
-2 obviously does the same thing as -1 if the first video frame of the first video track starts at time zero. In some containers this will always be the case, in others (most notably 188-byte MPEG TS) it will almost never happen.

Lsmash:
av_sync (default : false)
Try Audio/Visual synchronization at the first video frame of the video stream activated in the index file if set to true.
Last edited by hello_hello; 26th January 2020 at 04:13. |
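The relationship between container delays and the relative delay described in this post comes down to a subtraction. The sketch below is only that arithmetic, with a hypothetical helper name; it assumes the sample's audio starts at zero and the video at 631ms, and it does not reproduce how gMKVExtractGUI or MKVCleaver actually compute their figures.

```python
def relative_audio_delay(video_delay_ms, audio_delay_ms):
    """Audio delay relative to the video: what to apply to extracted audio
    if the video's own container delay is lost when re-encoding."""
    return audio_delay_ms - video_delay_ms


# Assumed values for the sample: video starts at 631ms, audio at 0ms.
sample_relative = relative_audio_delay(631, 0)
print(sample_relative)

# A video delay combined with a larger audio delay still gives a positive
# relative delay (800ms is an invented figure for illustration).
larger_audio = relative_audio_delay(631, 800)
print(larger_audio)
```

This is why a negative audio delay reported for an MKV points to a positive video delay: neither container delay is negative, but their difference can be.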
|
26th January 2020, 13:49 | #1160 | Link |
Registered User
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,078
|
Yes, this is what I found out by trial and error a while ago when DVB-T2 was introduced in Germany (HEVC video, E-AC3 or AAC-LATM audio). The captured transport streams always have this huge audio delay, and for DSS2Mod and ffms2 these delays have to be ignored, for LSMASH the delays must be corrected.
So this is all caused by different default behavior of these source filters. Why in the world can the relevant source filter authors not agree on a uniform behavior? The problem is that I never use my source filters manually, they are called by a GUI where the call mostly is hard-coded in the GUI. Either users cannot edit this call at all, or it requires that the user edits the generated AVS script. This is not an option for most users. These GUIs are geared at average users, not at video specialists like you and DG. Since some of these GUIs are no longer maintained, it may be possible to use newer helper applications (like AVS filters or encoders), but only if the newer versions stick with their original defaults. This is the main reason why I always hate when the author of a tool which is widely used by other software suddenly decides that he wants to change defaults or calling conventions without any regard for backwards compatibility. MediaInfo is a good example, most GUIs which use it stopped working with versions after 18.5. And now the same thing happens with the latest LSMASH build by HolyWu. I really hate this, I call it "Apple Attitude". Last edited by manolito; 26th January 2020 at 13:58. |