Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
13th September 2024, 18:34 | #1 | Link |
Registered User
Join Date: Jun 2022
Posts: 96
|
Opening AAC from MP4 file - what else besides LWLibav/LSMASH/ffms2?
I have an audio problem with a 4h MP4 file (H265 VFR + AAC) from a standalone recorder.
I open it in MPC-HC and at 4:00:59.500 the video and audio are synchronized. It looks and sounds OK. I open it in AviSynth using LWLibavVideoSource:
Code:
name1="VFR.mp4"
a=AudioDub(LWLibavVideoSource(name1,fpsnum=50),LWLibavAudioSource(name1))
return a
In ffms2 it is the same. (This is not a linear shift, because at time 0:13:00.000 I have only 1 frame of shift.) If, however, I load the same MP4 file in VirtualDub as an audio track (Audio->Audio from another file), everything is perfectly synchronized. And here's the question -- what else can I use to open audio from such a file? |
14th September 2024, 13:32 | #4 | Link |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,079
|
That is the problem with VFR video forced to play at a fixed frame rate (fpsnum=50): it changes the video duration, which then no longer matches the audio duration.
What about:
a=LWLibavAudioSource(name1)
v=LWLibavVideoSource(name1).ChangeFPS(50)
AudioDub(v,a)
Now AviSynth "Changes the frame rate by deleting or duplicating frames."
__________________
BeHappy, AviSynth audio transcoder. |
14th September 2024, 16:17 | #5 | Link |
Registered User
Join Date: Jun 2022
Posts: 96
|
I'll check, but the problem is with the audio track, not the video track (LWLibavVideoSource handles VFR very well, unlike ffms2).
For now, I used ffmpeg and converted the clip twice -- once using "-c copy", the second time re-encoding it to CFR MP4 (h264+aac). In both cases ("copy" and "reencode"), playback in MPC-HC is correct, as with the original (synchronization is preserved).
If I load either of these files into VDub through AviSynth (as in the first post), the sound is shifted. (The sound is also shifted when playing the AVS script in MPC-HC.) If I instead add the audio track using Audio->Audio from other file and load the same file, the sound is correctly synchronized with the image (regardless of whether I load the sound using "Caching Input Driver" or "ffmpeg").
So it looks like the problem might be somewhere in AviSynth, because I find it hard to believe that none of the plugins (LWLibavAudioSource, ffms2, BSAudioSource) can handle the audio (although all of them are probably based on ffmpeg libraries anyway -- but VDub with the same libraries handles the sound correctly).
Edit: LWLibavVideoSource(name1).ChangeFPS(50) works just as well as LWLibavVideoSource(name1,fpsnum=50). The video track remains correct; the audio offset still grows the longer the clip runs.
Last edited by rgr; 14th September 2024 at 16:23. |
14th September 2024, 18:54 | #7 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,901
|
Quote:
rgr, I don't know if this is the reason as I don't know how the audio decoders deal with gaps, but if there's gaps in the audio and they're not accounted for.... Try extracting the audio with eac3to or MeGUI's HD Streams Extractor as eac3to will replace any gaps with silence.
The golden rule I live by is to never extract the audio or other streams unless I'm converting/editing them. You can open most containers with MKVToolNixGUI and it'll remux without changing the timecodes for existing streams and therefore any gaps will be retained. I usually convert the video, open it in MKVToolNix, add the original source file, de-select the original video stream, then remux.
All of the above is, as I said, just a guess as to whether it's the problem.
PS. I should add, there can be issues when FPSNum is used if the specified frame rate is the same as the video frame rate in sections and/or it's largely CFR. Maybe it's no longer the case, but in the past I've seen both FFMS2 and LWLibavVideoSource drop frames they shouldn't have and replace them with duplicates, probably due to jitter in the timecodes or something like that. It doesn't happen often, and maybe not at all now, but there's always the option of converting to a CFR with an Avisynth plugin. It requires extracting the timecodes so the plugin knows where to add/drop frames.
http://avisynth.nl/index.php/TimecodeFPS
http://avisynth.nl/index.php/VfrToCfr
As ChangeFPS(50) didn't seem to make any difference, you should confirm the video is actually VFR. Adding Info() to the script will give you the frame count and frame rate. If it's 50fps then FPSNum wouldn't be needed (unless you want to convert from one CFR to another, but there's better ways to do it). If not, then adding FPSNum would also change the frame count as it's very, very unlikely the average frame rate of a VFR video would exactly match a frame rate you've specified.
Last edited by hello_hello; 14th September 2024 at 20:52. |
|
14th September 2024, 22:54 | #8 | Link | |
Registered User
Join Date: Mar 2017
Location: Germany
Posts: 251
|
Quote:
My feeling is, that there is no variable framerate, but some error - but only a feeling. |
|
15th September 2024, 00:01 | #9 | Link | |||
Registered User
Join Date: Jun 2022
Posts: 96
|
Quote:
Code:
eac3to.exe "VFR.mp4"
Running in fast mode
Keeping dialnorm
The format of the source file could not be detected.
Quote:
Quote:
|
|||
15th September 2024, 00:10 | #10 | Link | |
Registered User
Join Date: Jun 2022
Posts: 96
|
Quote:
As if playing the original stream from the MP4 file was OK, but each linear reencoding caused gradual shortening of the audio track. |
|
15th September 2024, 05:09 | #11 | Link |
Registered User
Join Date: Mar 2011
Posts: 4,901
|
"The format of the source file could not be detected."
Probably because it's an MP4. Remux it as an MKV and you should be okay.
It sounds like VFR but only Info() in a script can tell you for sure. If it doesn't show 50fps then it's variable. Sometimes though, MediaInfo will report VFR when it's actually not, especially for MP4s. Something about the timebase some programs use if I remember correctly, and the frame timestamps don't always end up in the exact spot they should be. For MKV, as an example, timestamps are usually rounded to the nearest millisecond so for a frame rate such as 23.976 the frame duration should be 41.7083333333 ms, but instead the durations alternate in a pattern between 41ms and 42ms. None of that should confuse MediaInfo, and MKVToolNix writes statistics about each track to tags in the MKV so MediaInfo can just read that info. MP4s also often seem to have an initial video delay as well as an audio delay though. I don't understand why, but maybe that confuses MediaInfo sometimes. It's probably VFR but I'd still get a second opinion myself.
Edit: If ChangeFPS(50) is changing something then I guess it's not being decoded as 50fps so you probably don't need to confirm it with Info().
Last edited by hello_hello; 15th September 2024 at 05:16. |
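The millisecond-rounding effect described above can be checked with a few lines of arithmetic. This is only a sketch: it assumes 23.976 fps means exactly 24000/1001 and that timestamps are rounded half-up to whole milliseconds, which may not match what a particular muxer does. The function names are made up for the illustration.

```python
from fractions import Fraction


def rounded_timestamps_ms(frame_count, fps=Fraction(24000, 1001)):
    """Ideal frame start times in ms, rounded half-up to whole milliseconds.

    Assumption: half-up rounding; a real muxer may round differently.
    """
    stamps = []
    for n in range(frame_count):
        exact_ms = Fraction(n * 1000) / fps  # exact start time of frame n
        # floor(x + 1/2) using integer arithmetic only
        stamps.append(
            (2 * exact_ms.numerator + exact_ms.denominator)
            // (2 * exact_ms.denominator)
        )
    return stamps


def frame_durations_ms(stamps):
    """Differences between consecutive rounded timestamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


stamps = rounded_timestamps_ms(25)       # 25 timestamps -> 24 durations
durations = frame_durations_ms(stamps)
print("exact duration (ms):", float(Fraction(1000) / Fraction(24000, 1001)))
print("rounded durations  :", durations)
```

Every individual duration comes out as 41 or 42 ms, yet 24 frames still add up to exactly 1001 ms, so the video is CFR even though no two neighbouring timestamps look "right" on their own.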
15th September 2024, 12:03 | #12 | Link |
Registered User
Join Date: Jun 2022
Posts: 96
|
Info() will always return 50fps, because that's the average fps and with that AviSynth returns video as CFR.
https://imgur.com/a/GLy2DYu
If it wasn't VFR, then fpsnum=50 would return the same video track. But that doesn't matter, because the problem isn't with the video track.
eac3to v3.52
command line: eac3to.exe VFR.mkv out.wav
------------------------------------------------------------------------------
Running in fast mode
Keeping dialnorm
MKV, 1 video track, 1 audio track, 4:06:24, 49.998p
1: h265/HEVC, 576p50 (15:11)
2: AAC, 2.0 channels, 48kHz, dialnorm: 0dB
[v01] The video bitstream framerate field doesn't match the container framerate. <WARNING>
Track 2 is used for destination file "out.wav".
[a02] Extracting audio track number 2...
[a02] Decoding with DirectShow (Nero Audio Decoder 2)...
[a02] Getting "Nero Audio Decoder 2" instance failed. <ERROR>
Aborted at file position 3932160. <ERROR>
Last edited by rgr; 15th September 2024 at 20:29. |
15th September 2024, 19:26 | #13 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,901
|
Quote:
For the record it's the source filter that chooses the frame rate as Avisynth, to the best of my knowledge, has no way to know what it is, and I assume ffms2 and Lsmash both calculate the average frame rate for VFR as it's the most logical choice. The frame rate in your screenshot appears to be 49.998 fps anyway, but for VFR video it'll change as the frame rate changes, although as it doesn't matter I'll leave it there.
I'm not sure I've ever extracted AAC audio as a wave file, but I was just playing with eac3to and what I said about it repairing gaps no longer seems to be happening. I'm not sure why but I'll do my best to work it out as the alternative can only be that I've imagined it and I should be living in a room with padded walls. I'll have quite a few MeGUI log files on an old drive I can search through to prove to myself I haven't lost the plot, but I've used eac3to for that purpose many, many times, so I don't understand why it's not fixing them now.
Last edited by hello_hello; 15th September 2024 at 19:29. |
|
17th September 2024, 13:15 | #16 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,079
|
Quote:
First try extracting the AAC with eac3to v3.36 (replace only eac3to.exe in the eac3to install folder) to see whether it detects audio gaps.
eac3to v3.36
command line: eac3to.exe VFR.mkv 2: out.aac
Maybe you will see:
[a02] Audio has a gap of Xms at playtime 0:00:YY. <WARNING>
Then try the extracted AAC as "Audio from other file" in VirtualDub2.
__________________
BeHappy, AviSynth audio transcoder. |
|
17th September 2024, 15:19 | #17 | Link |
Registered User
Join Date: Jun 2022
Posts: 96
|
Yes, there are gaps.
There are about 50 entries, which gives a total of about 300ms of shift. And that's right, that's about it. And what does that mean technically? I always thought that an audio track was a continuous "file" (mp3/aac/...) of course remuxed from video track.
MKV, 1 video track, 1 audio track, 4:06:24, 49.998p
1: h265/HEVC, 576p50 (15:11)
2: AAC, 2.0 channels, 48kHz
v01 The video bitstream framerate field doesn't match the container framerate.
a02 Extracting audio track number 2...
a02 Creating file "out.aac"...
a02 Audio has a gap of 5ms at playtime 0:02:55.
a02 Audio has a gap of 5ms at playtime 0:05:03.
a02 Audio has a gap of 5ms at playtime 0:09:10.
a02 Audio has a gap of 6ms at playtime 0:14:42.
a02 Audio has a gap of 5ms at playtime 0:19:30.
a02 Audio has a gap of 6ms at playtime 0:24:23.
a02 Audio has a gap of 5ms at playtime 0:29:35.
a02 Audio has a gap of 5ms at playtime 0:33:36.
a02 Audio has a gap of 6ms at playtime 0:39:01.
a02 Audio has a gap of 5ms at playtime 0:42:43.
a02 Audio has a gap of 6ms at playtime 0:47:55.
a02 Audio has a gap of 6ms at playtime 0:53:59.
a02 Audio has a gap of 6ms at playtime 0:57:50.
a02 Audio has a gap of 5ms at playtime 1:02:44.
a02 Audio has a gap of 6ms at playtime 1:07:57.
a02 Audio has a gap of 5ms at playtime 1:13:50.
a02 Audio has a gap of 5ms at playtime 1:16:12.
a02 Audio has a gap of 6ms at playtime 1:21:35.
a02 Audio has a gap of 5ms at playtime 1:27:09.
a02 Audio has a gap of 5ms at playtime 1:31:31.
a02 Audio has a gap of 6ms at playtime 1:35:50.
a02 Audio has a gap of 5ms at playtime 1:40:47.
a02 Audio has a gap of 5ms at playtime 1:46:10.
a02 Audio has a gap of 7ms at playtime 1:49:50.
a02 Audio has a gap of 5ms at playtime 1:56:40.
a02 Audio has a gap of 5ms at playtime 2:00:53.
a02 Audio has a gap of 5ms at playtime 2:03:19.
a02 Audio has a gap of 5ms at playtime 2:09:34.
a02 Audio has a gap of 5ms at playtime 2:12:22.
a02 Audio has a gap of 5ms at playtime 2:18:21.
a02 Audio has a gap of 5ms at playtime 2:21:12.
a02 Audio has a gap of 5ms at playtime 2:27:07.
a02 Audio has a gap of 6ms at playtime 2:30:19.
a02 Audio has a gap of 5ms at playtime 2:34:55.
a02 Audio has a gap of 5ms at playtime 2:40:49.
a02 Audio has a gap of 5ms at playtime 2:43:40.
a02 Audio has a gap of 5ms at playtime 2:47:59.
a02 Audio has a gap of 5ms at playtime 2:53:20.
a02 Audio has a gap of 5ms at playtime 2:56:07.
a02 Audio has a gap of 6ms at playtime 3:01:37.
a02 Audio has a gap of 5ms at playtime 3:06:39.
a02 Audio has a gap of 5ms at playtime 3:11:16.
a02 Audio has a gap of 6ms at playtime 3:15:15.
a02 Audio has a gap of 6ms at playtime 3:21:09.
a02 Audio has a gap of 6ms at playtime 3:25:36.
a02 Audio has a gap of 6ms at playtime 3:31:16.
a02 Audio has a gap of 5ms at playtime 3:36:05.
a02 Audio has a gap of 6ms at playtime 3:39:47.
a02 Audio has a gap of 6ms at playtime 3:44:33.
a02 Audio has a gap of 6ms at playtime 3:50:40.
a02 Audio has a gap of 5ms at playtime 3:55:20.
a02 Audio has a gap of 5ms at playtime 4:00:35.
a02 Audio has a gap of 6ms at playtime 4:03:18.
a02 Starting 2nd pass...
a02 Extracting audio track number 2...
a02 Realizing AAC gaps...
a02 Creating file "out.aac"...
eac3to processing took exactly 6 minutes.
Done.
Last edited by rgr; 17th September 2024 at 17:25. |
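To put a number on "about 300ms", the gaps reported in the log above can simply be added up. The sketch below re-types the 53 log entries in a compact `ms@h:mm:ss` form (that shorthand and the helper names are made up for this example; they are not anything eac3to produces), and it assumes a decoder that ignores audio timestamps loses exactly the reported gap at each point. The 50 fps figure is the nominal rate from the thread.

```python
import re

# The 53 gaps from the eac3to log above, re-typed as "<gap ms>@<playtime>".
GAP_LOG = """
5@0:02:55 5@0:05:03 5@0:09:10 6@0:14:42 5@0:19:30 6@0:24:23 5@0:29:35
5@0:33:36 6@0:39:01 5@0:42:43 6@0:47:55 6@0:53:59 6@0:57:50 5@1:02:44
6@1:07:57 5@1:13:50 5@1:16:12 6@1:21:35 5@1:27:09 5@1:31:31 6@1:35:50
5@1:40:47 5@1:46:10 7@1:49:50 5@1:56:40 5@2:00:53 5@2:03:19 5@2:09:34
5@2:12:22 5@2:18:21 5@2:21:12 5@2:27:07 6@2:30:19 5@2:34:55 5@2:40:49
5@2:43:40 5@2:47:59 5@2:53:20 5@2:56:07 6@3:01:37 5@3:06:39 5@3:11:16
6@3:15:15 6@3:21:09 6@3:25:36 6@3:31:16 5@3:36:05 6@3:39:47 6@3:44:33
6@3:50:40 5@3:55:20 5@4:00:35 6@4:03:18
"""


def parse_gaps(text):
    """Return a list of (playtime_seconds, gap_ms) tuples."""
    gaps = []
    for ms, h, m, s in re.findall(r"(\d+)@(\d+):(\d\d):(\d\d)", text):
        gaps.append((int(h) * 3600 + int(m) * 60 + int(s), int(ms)))
    return gaps


def drift_ms_at(gaps, playtime_seconds):
    """Accumulated audio shift (ms) from all gaps up to the given playtime."""
    return sum(ms for t, ms in gaps if t <= playtime_seconds)


gaps = parse_gaps(GAP_LOG)
total_ms = sum(ms for _, ms in gaps)
frame_ms = 1000 / 50  # nominal 50 fps video frame duration

for label, seconds in (("0:13:00", 13 * 60), ("4:00:59", 4 * 3600 + 59)):
    drift = drift_ms_at(gaps, seconds)
    print(f"{label}: {drift} ms, about {drift / frame_ms:.2f} frames")
print(f"{len(gaps)} gaps, {total_ms} ms in total")
```

With these entries the total is 286 ms. The accumulated shift is 15 ms at 0:13:00 (under one 50 fps frame) and 280 ms at 4:00:59 (14 frames), which fits the non-linear, slowly growing offset described in the first post.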
18th September 2024, 09:17 | #18 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,079
|
Quote:
Compressed audio is encoded in frames. For standard AAC at 48 kHz, each frame holds 1024 samples, so it lasts 1024/48000 s = 21.333 ms. A container stores video and audio frames, and each one has a "timestamp" that tells the player when to play that frame. In a 50 fps CFR video the timestamps say to play a frame every 20 ms, but in VFR that interval can change. There are also "timestamps" for each audio frame: if the first timestamp is 0 ms and the second is 26 ms, there is a gap of 4.667 ms without sound, because the first AAC frame finishes at 21.333 ms.
If eac3to does its job well ("Realizing AAC gaps..."), out.aac should have the same duration as the video. If you extract the AAC with:
eac3to.exe VFR.mkv 2: out.aac -no2ndpass
the gaps are ignored and the audio comes out short, like the audio decoded in AviSynth.
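The arithmetic in this explanation can be written out as a small sketch. It assumes 1024-sample AAC frames at 48 kHz, as stated above; the function names are invented for the example and the timestamps are the 0 ms / 26 ms pair from the post, not values read from the actual file.

```python
from fractions import Fraction


def aac_frame_ms(samples_per_frame=1024, sample_rate=48000):
    """Duration of one audio frame in milliseconds, as an exact fraction."""
    return Fraction(samples_per_frame * 1000, sample_rate)


def find_gaps(timestamps_ms, frame_ms):
    """Return (timestamp, gap_ms) wherever a frame starts after the previous one ended."""
    gaps = []
    for current, following in zip(timestamps_ms, timestamps_ms[1:]):
        gap = Fraction(following) - (Fraction(current) + frame_ms)
        if gap > 0:
            gaps.append((following, gap))
    return gaps


frame_ms = aac_frame_ms()
gaps = find_gaps([0, 26], frame_ms)

print(f"frame duration: {float(frame_ms):.3f} ms")
for at_ms, gap in gaps:
    print(f"gap of {float(gap):.3f} ms before the frame stamped {at_ms} ms")
```

A decoder that just concatenates decoded frames and ignores the timestamps drops each of these gaps, so the decoded audio ends up shorter than the video by their sum.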
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 18th September 2024 at 09:20. |
|
20th September 2024, 13:51 | #19 | Link | |
Registered User
Join Date: Jun 2022
Posts: 96
|
Quote:
I have one more question -- is it possible to fill in these gaps using AviSynth (without using eac3to.exe)? As I understand it, LWLibavSource (like ffms2 or BestSource) uses ffmpeg libraries, so it all comes down to the question of whether there is such a possibility in the ffmpeg library? (Overall I think ignoring audio timestamps is a bug, after all they are there for a reason.) |
|
22nd September 2024, 11:02 | #20 | Link |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,079
|
Fill gaps with what?
If you fill a gap with decoded silence in the middle of a loud sound, you can get a click when you re-encode, because encoders must be initialized with the last value of the previous frame. A few clicks are tolerable, but many... eac3to shows 53 gaps, but only those of 5 ms or more; surely there are many more shorter than 5 ms. If you want to re-encode that file, the only solution is to preserve the timestamps and apply them to the new audio (and of course to the video too). Check whether your standalone recorder has other options for storing video/audio, or play the file as is without trying to process it.
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 22nd September 2024 at 11:07. |
Tags |
audio shift, lwlibavaudiosource |