Doom9's Forum - View Single Post

mkver · 14th August 2018, 21:25

Quote:

Originally Posted by hubblec4

Have you used a "-1" value for TimeStampScale in your test? And if so, which TimeStampScale was set by mkvmerge?

For 48kHz content (the most common sample rate for audio accompanying movies) the TimestampScale is 20832; 20833 would be enough for sample accuracy.

Quote:

Originally Posted by hubblec4

For me, accuracy is more important than saving disk space. And Mosu's test shows that 4.2 MB more overhead is almost nothing in relation to the file size.

There might be an unpleasant surprise waiting for you, namely the way mkvmerge assigns timestamps from input files. It simply trusts them, and while this might sound good, it has some side effects. By "trusts them" I mean that the timestamp of the i-th frame in a lace (frame[i], 0-based) is set to Timestamp((Simple)Block) plus the sum of the durations of all frame[j] with j<i. This calculation is done in ns precision. The durations are fine: the error introduced by using ns is usually negligible, and the only scenario in which they are bad is a track that is not supported and has no default duration (for supported tracks the durations are taken directly from the bitstream level). The timestamp of the (Simple)Block is not fine. It has after all already been rounded; for a TimestampScale of 1000000 it is rounded to ms.
Here is mkvinfo's output for a file with TimestampScale 1000000 containing a DTS soundtrack (packet duration 32/3 ms):

Code:

| + Simple block: key, track number 1, 8 frame(s), timestamp 00:00:00.000000000
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
| + Simple block: key, track number 1, 8 frame(s), timestamp 00:00:00.085000000
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024

If one remuxes this file with the automatically chosen TimestampScale amounting to sample precision, one gets this:

Code:

|+ Cluster
| + Cluster timestamp: 00:00:00.000000000
| + Simple block: key, track number 1, 8 frame(s), timestamp 00:00:00.000000000
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
| + Simple block: key, track number 1, 8 frame(s), timestamp 00:00:00.084994560
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024
|  + Frame with size 1024

8 * 32/3 ms = 85 1/3 ms, i.e. the first lace ends at about 85 1/3 ms, but the timestamp of the second lace is very near to 85 ms and not to 85 1/3 ms, because the timestamp is derived as above and then converted to the new TimestampScale of 20832. This is actually worse than it was before: earlier the "incorrect" value of 85 ms could be blamed on the timestamp resolution, but this time it can't. Put another way: with 1 ms precision it could be disregarded that there is actually a 1/3 ms overlap between the first two laces; now this is no longer true. The file is claiming to be more precise than it actually is.
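The arithmetic can be checked with a short sketch. It assumes that timestamps are rounded to the nearest multiple of the TimestampScale (an assumption on my part, although it reproduces the values mkvinfo prints above). round_to_scale is a hypothetical helper written for this illustration, not an mkvmerge function.

```python
from fractions import Fraction

MS = 1_000_000  # ns per millisecond


def round_to_scale(t_ns, scale_ns):
    """Round a time in ns to the nearest multiple of scale_ns (half up).

    Assumption: this mirrors how a muxer quantizes timestamps.
    """
    ticks = (2 * Fraction(t_ns) + scale_ns) // (2 * scale_ns)
    return int(ticks) * scale_ns


# True end of the first lace: 8 DTS packets of 32/3 ms each.
true_end_ns = 8 * Fraction(32, 3) * MS           # 85 1/3 ms

# What the 1 ms file stores for the second block.
stored_ns = round_to_scale(true_end_ns, MS)      # 85_000_000

# Remux: the already rounded value is converted to the new scale.
remuxed_ns = round_to_scale(stored_ns, 20832)    # 84_994_560

# What a sample-precise file would store if it started from the true value.
ideal_ns = round_to_scale(true_end_ns, 20832)    # 85_327_872

print(stored_ns, remuxed_ns, ideal_ns)
```

The remuxed value is off by roughly 1/3 ms from the ideal one, which is about 16 ticks at the new TimestampScale.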

There are other ways to run into this issue: remux a file without changing the TimestampScale but with different lacing (either by disabling lacing or by changing the clustering, e.g. by using a different video track that has keyframes at different points) and you get something like this (with lacing disabled and still 1 ms precision, remuxed from the first file above):

Code:

| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.000000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.011000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.021000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.032000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.043000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.053000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.064000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.075000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.085000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.096000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.106000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.117000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.128000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.138000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.149000000
|  + Frame with size 1024
| + Simple block: key, track number 1, 1 frame(s), timestamp 00:00:00.160000000
|  + Frame with size 1024

The frame with timestamp 106ms should actually have a timestamp of 106 2/3ms which should be rounded to 107ms.
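The listing above can be reproduced with a small sketch of the rule described earlier (block timestamp plus the sum of the preceding frame durations, then rounding to the output TimestampScale). Rounding to nearest is assumed, and unlaced_timestamps and round_to_scale are hypothetical names for this illustration only.

```python
from fractions import Fraction

MS = 1_000_000                    # ns per millisecond
FRAME_NS = Fraction(32, 3) * MS   # DTS packet duration, 10 2/3 ms


def round_to_scale(t_ns, scale_ns):
    """Round a time in ns to the nearest multiple of scale_ns (half up)."""
    ticks = (2 * Fraction(t_ns) + scale_ns) // (2 * scale_ns)
    return int(ticks) * scale_ns


def unlaced_timestamps(block_ts_ns, n_frames, frame_ns, scale_ns):
    """Timestamps the frames of one lace get when written as single blocks.

    The block timestamp read from the input file is trusted as-is.
    """
    return [round_to_scale(block_ts_ns + i * frame_ns, scale_ns)
            for i in range(n_frames)]


first = unlaced_timestamps(0, 8, FRAME_NS, MS)
second = unlaced_timestamps(85 * MS, 8, FRAME_NS, MS)  # 85 ms is already rounded
result_ms = [t // MS for t in first + second]
print(result_ms)

# The 11th frame (index 10) starting from the true position instead:
ideal_ms = round_to_scale(10 * FRAME_NS, MS) // MS
print(ideal_ms)
```

The third frame of the second lace comes out at 106 ms because it is computed from 85 ms and not from 85 1/3 ms; computed from the true position it would be 107 ms.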

A strategy to solve this problem would go like this (first/next/last etc. always refer to output order, as established from the timestamps read directly from the file): For the first (Simple)Block of a track, the timestamp in the file is trusted. Then the end timestamp of said (Simple)Block is calculated (if possible; if not, one has to trust the timestamps anyway). The timestamp of the next block is then compared to the end timestamp of the previous one. If they agree within the precision the input file allows, it is presumed that there is no gap between the two blocks, and the end timestamp of the previous block is used instead of the timestamp read for the next block. If not, the timestamp taken from the input file is used directly. This logic is similar to the one used to prevent shifting gaps due to lacing.
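A minimal sketch of this strategy, not taken from mkvmerge's code: smooth_timestamps is a hypothetical function, and the default tolerance of one input tick is my assumption for "within the bounds of the precision possible by the input file". Durations are passed as exact fractions so that the corrected timestamps do not accumulate rounding errors.

```python
from fractions import Fraction

MS = 1_000_000  # ns per millisecond


def smooth_timestamps(blocks, input_scale_ns, tolerance_ns=None):
    """Return corrected block timestamps for one track.

    blocks: list of (timestamp_ns, duration_ns) in output order;
            duration_ns is None if the end of a block cannot be calculated.
    tolerance_ns: maximum difference still treated as "no gap"
                  (assumption: one tick of the input TimestampScale).
    """
    if tolerance_ns is None:
        tolerance_ns = input_scale_ns

    result = []
    expected_end = None  # end timestamp of the previous block, if known
    for timestamp, duration in blocks:
        if expected_end is not None and abs(timestamp - expected_end) <= tolerance_ns:
            # No real gap: continue exactly where the previous block ended.
            timestamp = expected_end
        # Otherwise the timestamp read from the file is used directly.
        result.append(timestamp)
        expected_end = timestamp + duration if duration is not None else None
    return result


lace_ns = 8 * Fraction(32, 3) * MS  # 85 1/3 ms per lace of 8 DTS packets
blocks = [(0, lace_ns), (85 * MS, lace_ns), (171 * MS, lace_ns), (300 * MS, lace_ns)]
corrected = smooth_timestamps(blocks, MS)
print([float(t) / MS for t in corrected])
```

In this example the blocks stored at 85 ms and 171 ms are moved to 85 1/3 ms and 170 2/3 ms, while the block at 300 ms is kept because it is a real gap (the previous block ends at 256 ms).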

If someone wants several blocks to overlap (like the 1/3ms above), then this algorithm would obviously destroy this and "correct" the file. But apart from this case (that seems to be totally unlikely) I can't think of a scenario where it would make matters worse.