Old 8th December 2011, 06:53   #521  |  Link
VzK
Registered User
 
Join Date: Jan 2010
Posts: 9
LoRd_MuldeR, thanks for the exhaustive answer.

One more thing:


I read that foreign characters are supported, did I do something wrong?
»single flac to various mp3 » import cue sheet » write meta info to encoded files option, no m3u file generated. Used v4.04 Alpha-7.
Old 8th December 2011, 13:34   #522  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
Could you upload/attach the problematic CUE file please as-is?

LameXP handles Unicode characters properly, as long as the input is in proper UTF-8 format. I suspect that this is not the case here.

Currently LameXP will assume that the input is UTF-8, iff a BOM is found. Otherwise it will interpret the input with your local 8-Bit Codepage - whatever that may be.

I know this isn't very reliable, but storing text files in some local 8-Bit Codepage is extremely error-prone anyway, because the app reading the file later cannot know the correct Codepage. Assuming that the "local" Codepage is the right one is just a wild guess. But it is probably the best we can do.

Using Unicode with proper UTF-8 encoding is the right way to store text files that contain "foreign" (Non-ASCII) characters! And a BOM should be prepended to indicate that this is UTF-8. Still the BOM is optional. Assuming that all text without a BOM is not UTF-8, is another wild guess. But again I don't know a better way...
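The rule described above can be sketched in a few lines of Python. This is only an illustration, not LameXP's actual code: the function name `decode_text` and the `fallback_codepage` parameter are made up for the example, and `locale.getpreferredencoding` stands in for "the local 8-Bit Codepage".

```python
import codecs
import locale
from typing import Optional


def decode_text(raw: bytes, fallback_codepage: Optional[str] = None) -> str:
    """Decode as UTF-8 only if a UTF-8 BOM is present, else use an 8-bit codepage."""
    if raw.startswith(codecs.BOM_UTF8):
        # BOM found: strip it and treat the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No BOM: guess the local codepage (hypothetical stand-in for the system default).
    codepage = fallback_codepage or locale.getpreferredencoding(False)
    return raw.decode(codepage, errors="replace")
```

Feeding BOM-less UTF-8 through the fallback path shows why the guess is fragile: the bytes of "Motörhead" in UTF-8, read as Latin-1, come out as "MotÃ¶rhead".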

(BTW: As for the console, you won't see proper Unicode output there anyway, unless you change the font to "Lucida Console")


[Small Update]

I hacked together a quick workaround that may work for your case:
http://www.mediafire.com/file/pqfj5o....Build-803.exe

If LameXP assumes that the input is not UTF-8, it will now test whether decoding the input with the "local 8-Bit" Codepage results in any '�' (U+FFFD) characters. In that particular case we can assume that the "local 8-Bit" Codepage is not the suitable one and thus we will fall back to the "Latin-1" Codepage. There is absolutely no guarantee that the Latin-1 Codepage will work any better than the local one. It's just another try (and will succeed with Western European encodings).
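A minimal Python sketch of that fallback, under the assumption that "decoding produced U+FFFD" is the only signal used. The function name `decode_with_fallback` is hypothetical and this is not the code from the build linked above.

```python
REPLACEMENT = "\ufffd"  # the U+FFFD replacement character


def decode_with_fallback(raw: bytes, local_codepage: str) -> str:
    """Try the local 8-bit codepage first; fall back to Latin-1 on U+FFFD."""
    text = raw.decode(local_codepage, errors="replace")
    if REPLACEMENT in text:
        # Some byte has no mapping in the local codepage, so that guess was wrong.
        # Latin-1 maps every byte to a character, so this decode cannot fail.
        text = raw.decode("latin-1")
    return text
```

Note the limit of the heuristic: a Latin-1 "café" decoded with cp1251 yields "cafй" without any U+FFFD, so the wrong codepage goes unnoticed in that case.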
__________________
There was of course no way of knowing whether you were being watched at any given moment.
How often, or on what system, the Thought Police plugged in on any individual wire was guesswork.



Last edited by LoRd_MuldeR; 8th December 2011 at 15:19.
Old 8th December 2011, 15:41   #523  |  Link
VzK
Registered User
 
Join Date: Jan 2010
Posts: 9



Original cue attached if you want to give a look anyway.

Great work LoRd_MuldeR! Thanks a lot!
Attached Files
File Type: zip cue.zip (712 Bytes, 12 views)
Old 8th December 2011, 15:58   #524  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
As I had assumed, your CUE file is not Unicode/UTF-8, but plain Latin-1.

On an English or German version of Windows (actually many more), the Latin-1 Codepage is configured as the "local" Codepage by default.

That's why, on such systems, decoding the file with the "local" Codepage gives the desired result. And one might assume that this is always the case.

But it's not! Other localized versions of Windows have a different Codepage configured by default, giving potentially very different output.

Solution: Open the CUE file in Notepad++, change the Encoding to the suitable one (here Latin-1) and then convert to UTF-8 (Unicode), with BOM.
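If you prefer a script over Notepad++, the same conversion can be sketched in Python. The helper names and the default of Latin-1 are assumptions for this example; the source encoding still has to be known (or guessed) by you.

```python
from pathlib import Path


def reencode_with_bom(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode 8-bit text as UTF-8 with a leading BOM."""
    # The "utf-8-sig" codec prepends the BOM bytes EF BB BF when encoding.
    return raw.decode(source_encoding).encode("utf-8-sig")


def convert_file(src: Path, dst: Path, source_encoding: str = "latin-1") -> None:
    """Convert a text file (e.g. a CUE sheet) on disk; works on raw bytes only."""
    dst.write_bytes(reencode_with_bom(src.read_bytes(), source_encoding))
```

Working on bytes rather than text-mode files keeps the line endings of the original file untouched.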



Last edited by LoRd_MuldeR; 8th December 2011 at 16:00.
Old 9th December 2011, 01:41   #525  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
LameXP v4.04 Alpha-7.1:

Quote:
Changes between v4.03 and v4.04:
* Added support for the QAAC Encoder, requires QuickTime v7.7.1 or newer (see FAQ doc for details)
* Updated LAME encoder to v3.99.2 Final (2011-11-18), compiled with ICL 12.1.7 and MSVC 10.0 (details)
* Updated MediaInfo to v0.7.51+ (2011-11-19), compiled with ICL 12.1.6 and MSVC 10.0
* Implemented coalescing of update signals to reduce the CPU usage of the LameXP process (details)
* Run more than four instances in parallel on systems with more than four CPU cores (details)
* Improved handling of different character encodings for Playlist and Cue Sheet import
* Workaround for a bug that causes MediaInfo to not detect the duration of Wave files (64-Bit only)



Last edited by LoRd_MuldeR; 10th December 2011 at 19:43.
Old 9th December 2011, 12:34   #526  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 1,046
I wanted to ask if you are going to implement an option for audio alignment. I had a .wav, encoded it with LameXP to AAC, decoded it back to WAV, and compared both WAV files in Audacity: the second one had around 36 ms of positive delay. I didn't know this, so everything I have done until now might be wrong.
I was also looking for a list to know which encoders/decoders you use for each format.

I gathered some links of interest.

Questions about H.264 sync
http://forum.doom9.org/showthread.php?t=156163
NeroAacEnc and delay added during encoding
http://forum.doom9.org/showthread.php?t=144346
audio delay with mp4box
http://forum.doom9.org/showthread.php?t=145435
Possible bug in nero encoder?
http://forum.doom9.org/showthread.php?p=1420989
Audio Sync for MP3 and AAC files in AVISynth
http://forum.doom9.org/showthread.php?p=1426169
Old 9th December 2011, 14:26   #527  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
The extra delay is a common "problem" with (lossy) audio compression formats.

One reason for this is that these formats usually work with fixed-size (in samples) "frames", which they will transform into the frequency domain.

But, because of the properties of the transform, overlapped transforms must be used and a so-called "window" function must be applied on the samples first.

And for that reason, some extra samples before and after each frame are required. But before the very first frame, starting at the very first sample, there are none!

That's why the encoder will prepend some "silent" samples in front of the very first input sample. And also append some after the last input sample.
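To make the need for padding concrete, here is a toy model in Python. It is an assumption for illustration only, not the real MP3 or AAC filterbank: windows of length 2*hop advance by hop samples, and a sample counts as "fully covered" when it falls inside two overlapping windows.

```python
def coverage(num_samples: int, hop: int, pad_front: int = 0, pad_back: int = 0) -> list:
    """Count how many overlapping windows cover each original input sample."""
    total = pad_front + num_samples + pad_back
    window = 2 * hop  # 50% overlap between neighbouring windows
    counts = [0] * total
    start = 0
    while start + window <= total:
        for i in range(start, start + window):
            counts[i] += 1
        start += hop
    # Report only the original samples, not the silent padding around them.
    return counts[pad_front:pad_front + num_samples]
```

With `coverage(8, 2)` the first and last two samples are covered only once; with `coverage(8, 2, pad_front=2, pad_back=2)` every original sample is covered twice. The price is that the stream is now longer than the input, which is the "delay" seen after decoding.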

In the MP3 format, there is no "official" way to indicate how many extra samples were added, in order to have the decoder remove them. LAME has its own way.

(Whether an individual MP3 decoder will respect LAME's extra info header, that's a different question ^^)

See also:
* http://cas.web.cern.ch/cas/Denmark-2...%20CAS2010.pdf
* http://lame.sourceforge.net/tech-FAQ.txt



Last edited by LoRd_MuldeR; 9th December 2011 at 17:10.
Old 9th December 2011, 15:44   #528  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 1,046
So good, so it is just as padding in video processing. Now I understood better but the question is, will you add an option for this?
And the next question, where can I see all the modules used for each format in LameXP in order to know the possible settings, flaws, pros, cons, etc whynots.
Old 9th December 2011, 16:38   #529  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
Quote:
Originally Posted by Dogway View Post
So good, so it is just as padding in video processing. Now I understood better but the question is, will you add an option for this?
Add an option for what?

If an audio encoder prepends samples to the source, it does this for good reason. It's a consequence of how audio compression works (see previous post).

There is no simple option to turn it off.

It is possible to cut off the padded samples later, after decoding. But only if the decoder can know how many samples had been added by the encoder.

And this is a decoder feature, which is out of scope for LameXP.

Unless, of course, we are talking about the decoding of the sources (original files) that you feed into LameXP for conversion.

Actually I'm not quite sure whether the individual decoders used in LameXP, e.g. mpg123 for MP3 files or FAAD for AAC files, do remove padding, if possible.

(At least for MP3 files encoded by LAME it is possible to accurately remove the padded samples at playback/decoding-time)
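For illustration, cutting off the padding after decoding is trivial once the two counts are known. The sketch below assumes that `encoder_delay` and `end_padding` have already been obtained somewhere (for example from the extra info LAME writes); it does not parse any header, and the names are hypothetical.

```python
def trim_decoded(samples: list, encoder_delay: int, end_padding: int) -> list:
    """Drop the samples the encoder added before and after the real audio."""
    if encoder_delay < 0 or end_padding < 0:
        raise ValueError("delay and padding must be non-negative")
    end = len(samples) - end_padding  # also correct when end_padding is 0
    return samples[encoder_delay:end]
```

Without those two numbers the decoder cannot tell padding from real silence, which is why this only works when the encoder stored them.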

Quote:
Originally Posted by Dogway View Post
And the next question, where can I see all the modules used for each format in LameXP in order to know the possible settings, flaws, pros, cons, etc whynots.
What do you consider a module?

Anyway, all code of LameXP can be found at its Git repository:
https://github.com/lordmulder/LameXP



Last edited by LoRd_MuldeR; 9th December 2011 at 17:19.
Old 9th December 2011, 18:16   #530  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 1,046
Option to fix it. In post-processing for example, just an idea.

You don't know how the input was encoded, but you know how you are going to decode it (to wav) and encode it again. So there's room to make things nice in LameXP scope. I'm not sure, I just got to know about this bug (I consider it a bug) but I think the delay is a constant value so with a few tests you can get to know. But it's up to you to decide if this is LameXP scope, right now I think I can't use it anymore for my videos audio. Now it's only useful for music audio, and individual tracks of course, if you encode a session composed of tracks LameXP won't be suited either...
Fortunately I now know about it, and can use workarounds, many people will still encode delayed audio without knowing aka doing the wrong thing.

In video processing we also add some padding for some filters to comply the mod 16 requirement, or just to protect borders. But the padding is always cropped back after filtering.


I think you call them Tools.
https://github.com/lordmulder/LameXP...ster/res/tools

You have some kind of a list that is more about supported formats than about the tools used for input/output formats:
http://gitorious.org/lamexp/lamexp/b...AQ.html#line39

That can make the cut if I were to test for delay issues and flag options.
Old 9th December 2011, 19:49   #531  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
Quote:
Option to fix it. In post-processing for example, just an idea.
No, the "encoder delay" is not a bug. And thus there is no "fix" per se. As explained before, the padding is a direct consequence of how audio compression works. Let's not overstate things.

And to make that clear: Any MP3 encoder (e.g. LAME) will cause an "encoder delay", though the delay may differ between different encoders. Also the front-end (e.g. LameXP) has no influence on that.

The problem with MP3 in particular is that there is no "official" way to indicate the amount of padding that had been added by the encoder, so the decoder can't remove it!

AFAIK in the design of Vorbis this has been handled much better, i.e. the "encoder delay" is stored in the stream in a standardized way and thus it will be removed properly by the decoder.

I don't know what the situation with AAC is, but I guess it is more or less the same as with MP3...

Some more info:
http://en.wikipedia.org/wiki/Gapless_playback


Quote:
You don't know how the input was encoded, but you know how you are going to decode it (to wav) and encode it again. So there's room to make things nice in LameXP scope. I'm not sure, I just got to know about this bug (I consider it a bug) but I think the delay is a constant value so with a few tests you can get to know. But it's up to you to decide if this is LameXP scope, right now I think I can't use it anymore for my videos audio. Now it's only useful for music audio, and individual tracks of course, if you encode a session composed of tracks LameXP won't be suited either...
Fortunately I now know about it, and can use workarounds, many people will still encode delayed audio without knowing aka doing the wrong thing.
The processing chain is as follows:
Decode input -> Apply filters (if any) -> Encode output

The "encoder delay" is added in the very last step, so it is impossible to implement any workaround, obviously. Except having LAME add its header to the MP3 file, which it does by itself.

Actually I just made a quick test and, indeed, the 'mpg123' decoder will respect the LAME header. Thus, if a LAME header is present, the padding samples are removed.

If you encode a Wave file to MP3 with LameXP (and thus with LAME) and then decode that MP3 file to Wave again with LameXP (and thus with mpg123), the sample count remains unchanged!

You can try yourself with:
Code:
lame.exe -V2 uncompressed.wav compressed.mp3
mpg123.exe -v -w decompressed.wav compressed.mp3
You will see that "uncompressed.wav" and "decompressed.wav" have exactly the same size in bytes. And in an audio editor you can check that they are perfectly aligned.
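Instead of comparing file sizes by hand, the sample count can be read with Python's standard `wave` module. This is just a convenience sketch; the helper names are made up, and `make_silent_wav` only exists so the example runs without real files.

```python
import io
import wave


def wav_frame_count(source) -> int:
    """Return the number of sample frames; source is a file name or a binary file object."""
    with wave.open(source, "rb") as reader:
        return reader.getnframes()


def make_silent_wav(num_frames: int, rate: int = 44100) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence (stand-in for a real file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * num_frames)
    buf.seek(0)
    return buf
```

With the two files from the commands above, `wav_frame_count("uncompressed.wav") == wav_frame_count("decompressed.wav")` should hold if the decoder removed the padding. Comparing frame counts is a bit more robust than comparing byte sizes, since WAV headers may differ between tools.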

That's as much as we can do.

(If some other decoder will be used to decode the MP3 files produced by LameXP/LAME and that decoder ignores the LAME header, then we cannot do anything about that)



Last edited by LoRd_MuldeR; 10th December 2011 at 14:48.
Old 10th December 2011, 19:42   #532  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
LameXP v4.04 Alpha-8:
http://sourceforge.net/projects/lame...29/2011-12-10/

Quote:
Changes between v4.03 and v4.04:
* Added support for the QAAC Encoder, requires QuickTime v7.7.1 or newer (see FAQ doc for details)
* Updated LAME encoder to v3.99.2 Final (2011-11-18), compiled with ICL 12.1.7 and MSVC 10.0 (details)
* Updated MediaInfo to v0.7.51+ (2011-11-19), compiled with ICL 12.1.6 and MSVC 10.0
* Implemented coalescing of update signals to reduce the CPU usage of the LameXP process (details)
* Run more than four instances in parallel on systems with more than four CPU cores (details)
* Improved handling of different character encodings for Playlist and Cue Sheet import
* Workaround for a bug that causes MediaInfo to not detect the duration of Wave files (64-Bit only)
Found a bug that caused LameXP to never use the local 8-Bit Codepage when importing a Cue Sheet or Playlist. Instead, UTF-8 was tried twice.

This has been fixed. Also, when importing a Cue Sheet that is not UTF-8 with a proper BOM, LameXP will now allow the user to choose the desired 8-Bit Codepage.

For everybody who doesn't know what that means: You can simply keep "(System Default)" and click 'OK' when the Codepage dialog pops up.



Last edited by LoRd_MuldeR; 10th December 2011 at 20:08.
Old 10th December 2011, 22:43   #533  |  Link
VzK
Registered User
 
Join Date: Jan 2010
Posts: 9
Updated.

Found a typo:
Old 10th December 2011, 23:15   #534  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
Thank you. Fixed
https://github.com/lordmulder/LameXP...92e5bd1#diff-9

BTW, I added some info about the "encoder delay" to the F.A.Q document:
http://lamexp.git.sourceforge.net/gi...=HEAD#c6d9dfed



Last edited by LoRd_MuldeR; 11th December 2011 at 14:06.
Old 11th December 2011, 19:41   #535  |  Link
Reimar
Registered User
 
Join Date: Jun 2005
Posts: 278
Quote:
Originally Posted by LoRd_MuldeR View Post
Using Unicode with proper UTF-8 encoding is the right way to store text files that contain "foreign" (Non-ASCII) characters! And a BOM should be prepended to indicate that this is UTF-8. Still the BOM is optional. Assuming that all text without a BOM is not UTF-8, is another wild guess. But again I don't know a better way...
I don't think that is correct, a BOM for UTF-8 to my knowledge is not only not necessary but actually invalid (it certainly does not serve as BOM, there is no byte order to mark for UTF-8).
In addition, UTF-8 can be autodetected quite reliably: if you can parse the text correctly as UTF-8 and it contains characters outside the ASCII range (>127), it almost certainly is UTF-8.
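A minimal sketch of that check in Python, assuming a strict UTF-8 decoder is available (the function name `looks_like_utf8` is made up for the example):

```python
def looks_like_utf8(raw: bytes) -> bool:
    """True if raw is valid UTF-8 AND contains at least one non-ASCII byte."""
    try:
        raw.decode("utf-8")  # strict by default: invalid sequences raise
    except UnicodeDecodeError:
        return False
    # Pure ASCII parses as UTF-8 too, but proves nothing about the encoding.
    return any(byte > 127 for byte in raw)
```

Text in a typical 8-bit codepage rarely passes, because a lone byte above 127 followed by an ASCII letter is not a valid UTF-8 sequence.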
Old 11th December 2011, 20:04   #536  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 1,046
Quote:
Originally Posted by LoRd_MuldeR View Post
lol you never sounded so receptive when I noted similar typos... just curious : P

Quote:
Originally Posted by Dogway View Post
-Typo in Show Dropbox in Spanish would be "Mostrar DropBox"
Quote:
Originally Posted by LoRd_MuldeR View Post
Feel free to update the language file according to the translator's guide:
http://mulder.brhack.net/public/doc/...translate.html
Old 11th December 2011, 20:36   #537  |  Link
mariush
Registered User
 
Join Date: Dec 2008
Posts: 590
Quote:
Originally Posted by Reimar View Post
I don't think that is correct, a BOM for UTF-8 to my knowledge is not only not necessary but actually invalid (it certainly does not serve as BOM, there is no byte order to mark for UTF-8).
In addition, UTF-8 can be autodetected quite reliably: if you can parse the text correctly as UTF-8 and it contains characters outside the ASCII range (>127), it almost certainly is UTF-8.
http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

0xEF,0xBB,0xBF is a valid combination to represent UTF-8 content.

Quote:
The UTF-8 representation of the BOM is the byte sequence 0xEF,0xBB,0xBF. A text editor or web browser interpreting the text as ISO-8859-1 or CP1252 will display the characters ï»¿ for this.

The Unicode Standard does permit the BOM in UTF-8,[2] but does not require or recommend its use.[3] Byte order has no meaning in UTF-8[4] so in UTF-8 the BOM serves only to identify a text stream or file as UTF-8.

One reason the UTF-8 BOM is not recommended is that many pieces of software without Unicode support nevertheless are able to handle UTF-8 inside a text but not at the start of a text. For instance, the bytes of UTF-8 can be placed between the quotes of string constants in many programming languages, and that language will write the correct UTF-8 to a file or to a display, despite the language not knowing anything about UTF-8. This provides an easy migration path to convert systems to Unicode and to remove all legacy encodings, without simultaneously upgrading the programming language. The unexpected three bytes of the BOM break this however, as they are located where they are certain to be a syntax error.

A leading BOM can also defeat software that uses pattern matching on the start of a text file, since it inserts 3 bytes before the pattern. Though commonly associated with the Unix shebang at the start of an interpreted script,[5] the problem is more widespread. For instance in PHP, the existence of a BOM will cause the page to begin output before the initial code is interpreted, causing problems if the page is trying to send custom HTTP headers (which must be set before output begins).

Many Windows programs (including Windows Notepad) add BOMs to UTF-8 files by default[citation needed].
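The points from the quote are easy to check in Python. The snippet only uses the standard `codecs` module; the shebang line is just an example payload.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF.
bom_bytes = codecs.BOM_UTF8

# Misread as CP1252, those bytes show up as three visible characters.
bom_as_cp1252 = bom_bytes.decode("cp1252")

data = bom_bytes + b"#!/bin/sh\n"

# The plain "utf-8" codec keeps the BOM as U+FEFF, which defeats
# pattern matching on the start of the text.
kept = data.decode("utf-8")
shebang_found_plain = kept.startswith("#!")

# The "utf-8-sig" codec strips a leading BOM if there is one.
stripped = data.decode("utf-8-sig")
shebang_found_sig = stripped.startswith("#!")
```

Here `shebang_found_plain` is False and `shebang_found_sig` is True, which is exactly the pattern-matching problem described in the quote.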
Old 11th December 2011, 20:43   #538  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
@Reimar:
Yes, the BOM in UTF-8 does not indicate the byte order (UTF-8 is a sequence of single bytes, so there is no byte order to mark) and it is optional. Still, it is a valid character and often present. So if a UTF-8 BOM is found, we can assume that the text is indeed UTF-8. The probability of finding a UTF-8 BOM sequence just "by chance" is negligible. In the other case, when no UTF-8 BOM is present, the text may still be valid UTF-8. But it may just as well be in "some" local 8-Bit codepage. It may not even be encoded in the Windows ANSI Codepage that happens to be configured on the individual computer. For all these reasons, if no UTF-8 BOM is found, LameXP will now pop up a small dialog, allowing the user to select the desired Codepage.

@Dogway:
I'm always thankful for bug reports, including typos. But, as a matter of fact, I can only update the English and German translations. For all other languages I have to rely on other people to send me updated/corrected language files.



Last edited by LoRd_MuldeR; 11th December 2011 at 20:52.
Old 11th December 2011, 20:45   #539  |  Link
mariush
Registered User
 
Join Date: Dec 2008
Posts: 590
Mulder, ideally you should show a window with a preview that updates automatically when the user selects a different codepage from a drop-down list...
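The data behind such a preview is cheap to produce. A rough Python sketch, with a hypothetical `preview` helper and no GUI at all: decode the same bytes with each candidate codepage and show the first few characters of each result.

```python
def preview(raw: bytes, codepages: list, max_chars: int = 60) -> dict:
    """Map each candidate codepage to the start of the text as it would decode."""
    result = {}
    for codepage in codepages:
        # errors="replace" keeps the preview going even for unmappable bytes.
        result[codepage] = raw.decode(codepage, errors="replace")[:max_chars]
    return result
```

For a Latin-1 encoded "Motörhead", the Latin-1 entry reads correctly while the cp1251 entry shows "Motцrhead", so the user can pick the right one by eye.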
Old 11th December 2011, 20:54   #540  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196
Quote:
Originally Posted by mariush View Post
Mulder, ideally you should show a window with a preview that updates automatically when the user selects a different codepage from a drop-down list...
Ideally I should. Practically, I think that is "nice to have" but over the top.



Tags
aac, aotuv, flac, lame, lamexp, mp3, mp4, ogg, oggenc, opus, vorbis

All times are GMT +1. The time now is 00:18.

