Video is a discrete medium: frames appear at specific times and stay on screen for a specific duration, so you can change their display duration quite easily. Audio, however, is a more or less continuous medium. Changing the playing time (with or without preserving the pitch, it doesn't matter much) is always a substantial change to the whole content and usually requires decoding, resampling, and re-encoding. There may be a fairly simple AC3 encoder in eac3to, but E-AC3 is not supported, AFAIK.
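To illustrate the difference, here is a rough Python sketch. It is not what eac3to or any other tool does internally. The function names `retime_video` and `resample_linear` are made up for this example, and plain linear interpolation stands in for a proper resampling filter. It assumes the audio has already been decoded to a list of PCM sample values.

```python
def retime_video(timestamps_ms, factor):
    """Stretch frame display times; the frames themselves are untouched."""
    return [t * factor for t in timestamps_ms]


def resample_linear(samples, factor):
    """Stretch decoded PCM to `factor` times its length at the same sample rate.

    Hypothetical helper: every output sample is newly computed, so the whole
    stream changes (and the pitch shifts, since nothing here preserves it).
    """
    if not samples:
        return []
    n_out = max(1, round(len(samples) * factor))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i / factor            # position in the original stream
        lo = min(int(pos), last)    # sample at or before that position
        hi = min(lo + 1, last)      # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For a 25 fps to 23.976 fps slowdown the factor would be roughly 1.043. The video side only needs new timestamps, while the audio side produces an entirely new set of samples that then has to be encoded again.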