17th March 2020, 17:04   #835
GCRaistlin
Registered User
Join Date: Jun 2006
Posts: 353
How can I use portable MPC-HC with Subtitle Edit? The MPC-HC option is greyed out in Settings.

Feature requests:
  1. [Options - Settings... - Word lists] Double-clicking a pair in the OCR fix list should fill the fields beside the 'Add pair' button with the corresponding values.
  2. [Options - Settings... - Tools - Fix common OCR errors - also use hard-coded rules] Make the hard-coded rules individually selectable. For example, replacing 'l' between uppercase letters with 'I' is clearly needed, while converting the first letter of a paragraph to uppercase may be completely unwanted.
  3. [Import/OCR Blu-ray (.sup) subtitle file...] Add the ability to disable spell checking while still using the OCR fix list for the selected language. This makes sense because some errors, like "l instead of I after the dot", aren't fixed by the OCR fix list and hence force the spell checking dialog to appear. But they can be fixed by applying a regexp in an external editor (for the mentioned error it would be "(?<!\w)l(?!\w)"). Applying regexps before spell checking saves a lot of time, but to use regexps currently we need to OCR without error fixing and then call the Fix common errors tool with only the 'Fix common OCR errors (using OCR replace list)' option checked.
  4. Add the ability to use regexps to fix common OCR errors. It would be great to have a predefined set of regexps like the one above. I'm ready to share my own.
  5. Look for Settings.xml in the current (working) directory (i.e. the directory that was current when SE was launched) instead of the directory where SubtitleEdit.exe is located. This would allow different settings for different cases or users.
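
To illustrate requests 3 and 4, here is a rough sketch of what applying such a regexp does, written in Python only for the sake of a quick demo. The pattern is the one quoted above; the function name fix_lone_l and the sample line are made up for illustration and have nothing to do with how Subtitle Edit works internally.

```python
import re

# Pattern quoted in request 3: a lowercase "l" that has no word character
# directly before or after it, i.e. an "l" standing alone as a word.
LONE_L = re.compile(r"(?<!\w)l(?!\w)")


def fix_lone_l(text: str) -> str:
    """Replace every standalone 'l' with 'I' (hypothetical helper)."""
    return LONE_L.sub("I", text)


if __name__ == "__main__":
    # Sample OCR output with the "l instead of I" error.
    print(fix_lone_l("Well. l think l'll go."))
```

Note that an "l" inside a word (as in "Well") is left alone, while the first "l" of "l'll" is fixed because the apostrophe is not a word character.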
Bugs (Import/OCR Blu-ray subtitle, OCR method: Binary image compare, Image database: Latin):
  1. Non-italic dashes that are followed by italic text are erroneously recognized as italic (example). Another bug with the same example subpic: if the Dictionary field is empty, the space after "Audience" is lost; if Dictionary is set to English, the space is preserved.
  2. "t ]" in Italic is recognized as "t]" with default "8 pixels is space" (example).
  3. '9' is recognized as '0' (example).
  4. Jumping to a subpic by typing its number in the 'Subtitle text' area doesn't work for #555: typing '5' repeatedly moves the cursor from #50 to #51, then to #52... #59, then to #500 and so on.
  5. The state of the 'Fix OCR errors' checkbox isn't saved.
OCR fix list (English):
  1. Why does the default OCR fix list include "backseat -> back seat"? Is "backseat" really incorrect?
  2. Why does the default OCR fix list include "lt -> it"? I believe it should be "lt -> It". The same goes for the various pairs starting with "lf": some of them have "If" as the result (which is correct), while others have "if" (which is not).
__________________
Windows 8.1 x64

Magically yours
Raistlin

Last edited by GCRaistlin; 18th March 2020 at 16:55.