Quote:
Originally Posted by Janusz
Beta 129.
For this example I created a .sup file from the text "the tragedy ofAlabama", with "the tragedy of" marked as italic.
I only installed the following dictionaries: French, German, Italian, English without additional OCRFixReplaceList.xml files.
With French, German, Italian or English selected, the patch works OK. The text after OCR looks like this: "<i>the tragedy of</i> Alabama".
With Polish or "none" selected, it looks like this: "<i>the tragedy</i> ofAlabama".
I did not check other languages, but I think the fix should work in all of them: the word "ofAlabama" is not correct in any language, and in this case the split always falls on an italic/regular or regular/italic boundary, regardless of the selected language or how many words its dictionary contains. An example from Polish: "fotografAdam" (photographer Adam).
|
Yes, the OCR process benefits from a good OCR fix replace list.
I've added a Polish one based on your input here:
https://github.com/SubtitleEdit/subt...eplaceList.xml
Feel free to add to it.
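
To illustrate the kind of split discussed in the quoted post, here is a minimal Python sketch. It is not Subtitle Edit's actual code: the function name `split_merged_word` and the tiny `KNOWN_WORDS` set are made up for illustration, and the rule (split where a lowercase letter meets an uppercase letter, but only if the left part is a known word) is just one possible heuristic.

```python
# Hypothetical word list; a real run would use the selected spell-check
# dictionary or entries from an OCR fix replace list.
KNOWN_WORDS = {"the", "tragedy", "of", "fotograf"}


def split_merged_word(word, known_words):
    """Split e.g. 'ofAlabama' into 'of Alabama'.

    A split is only made where a lowercase letter is directly followed by
    an uppercase letter, and only if the left part is a known word.
    """
    for i in range(1, len(word)):
        if word[i - 1].islower() and word[i].isupper():
            left, right = word[:i], word[i:]
            if left.lower() in known_words:
                return left + " " + right
    # No suitable boundary found: leave the word untouched.
    return word


print(split_merged_word("ofAlabama", KNOWN_WORDS))
print(split_merged_word("fotografAdam", KNOWN_WORDS))
print(split_merged_word("McDonald", KNOWN_WORDS))
```

Requiring the left part to be a known word keeps names like "McDonald" intact, which is also why the result depends on which dictionary or replace list is available for the selected language.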