Doom9's Forum - View Single Post

Nikse555 · 21st May 2020, 16:50

Quote:

Originally Posted by Janusz

Beta 129.
For this example I created a sup file from the text "the tragedy ofAlabama", where I marked "the tragedy of" as italic.

I installed only the following dictionaries: French, German, Italian, English, without additional OCRFixReplaceList.xml files.
For French, German, Italian, English the patch works ok. The text after OCR looks like this: "<i>the tragedy of</i> Alabama",
for Polish and "none" like this: "<i>the tragedy</i> ofAlabama".
I did not check other languages, but I think the fix should work in all of them, because the word "ofAlabama" is not correct in any language. In this case the split can always be made at the italic/regular or regular/italic boundary, regardless of the chosen language and of how many words exist in the selected dictionary. An example from Polish: "fotografAdam" (photographer Adam).

Yes, the OCR process benefits from a good OCR fix replace list.
I've added a Polish one based on your input here: https://github.com/SubtitleEdit/subt...eplaceList.xml
Feel free to add to it.
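
To make the quoted suggestion concrete, here is a rough Python sketch of splitting at an italic/regular boundary. It is not Subtitle Edit's actual code: the function name `join_runs`, the `(text, is_italic)` run format and the `known_words` set are all assumptions made up for illustration. The idea is only that a space is inserted at a style boundary when the word straddling it is not in the word list, so an empty list (the "none" case) still splits.

```python
def join_runs(runs, known_words=frozenset()):
    """Join (text, is_italic) runs into one tagged line.

    Hypothetical helper: a space is added at a style boundary when the
    word formed across that boundary is not found in known_words.
    """
    out = []
    for i, (text, italic) in enumerate(runs):
        if i > 0:
            prev_text = runs[i - 1][0]
            # Only boundaries with no whitespace on either side can hide a merged word.
            if (prev_text and text
                    and not prev_text[-1].isspace()
                    and not text[0].isspace()):
                merged = prev_text.split()[-1] + text.split()[0]
                if merged.lower() not in known_words:
                    out.append(" ")
        out.append(f"<i>{text}</i>" if italic else text)
    return "".join(out)


# No dictionary at all: the boundary is still split.
print(join_runs([("the tragedy of", True), ("Alabama", False)]))
print(join_runs([("fotograf", True), ("Adam", False)]))
```

With a word list that contains the merged word (for example `{"something"}` for the runs "some" and "thing"), the sketch leaves the boundary alone, which is the one case where a dictionary would still matter.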