3rd January 2021, 18:12 | #1301 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
That's right. We made this clear in this thread last June. During the first few lines, nOCR learns to recognize the correct size of the symbols that follow. Maybe I am wrong, but I think that in such a situation the first scan can always contain errors; only on the second scan will nOCR use the previously acquired skills stored in our database.
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 4th January 2021 at 01:01. |
4th January 2021, 08:44 | #1302 | Link |
Registered User
Join Date: Feb 2004
Location: Mars
Posts: 428
|
@Janusz: thx for the feedback
I've tried to fix the bug with updating nOcrDb with new casing here: https://github.com/SubtitleEdit/subt...leEditBeta.zip How does it work for you? The beta can also read the weird .sup file (transport stream subtitle). About the screenshot of the main window: it's just the frame that's been removed, right? |
4th January 2021, 14:12 | #1303 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
@Nikse555
Many thanks for another quick fix in the program. So far, I have checked the files I submitted earlier. The fix works fine. The very first scan with a verified character database gives an error-free result in all cases. Another, new file showed an upper/lower case substitution error during OCR. However, after correcting it by assigning the correct letter to its symbol in the database, the first OCR works flawlessly.
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 4th January 2021 at 14:18. |
6th January 2021, 15:37 | #1305 | Link | ||
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
Quote:
Quote:
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 6th January 2021 at 20:33. |
||
6th January 2021, 22:25 | #1306 | Link | |
Registered User
Join Date: Dec 2020
Posts: 6
|
Quote:
And I will write the latter in Polish because maybe the translator translated it wrong (translated here from Polish): I mean Tools -> Fix common errors... There used to be an option there to remove the three dots from the beginning of a line. Now it is gone. |
|
7th January 2021, 00:40 | #1307 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
@deusexe
You can find this in Tools -> Fix common errors, now listed as "Fix continuation style: ...". What it covers, and how, can be set via the [Edit settings for fixing continuation style ...] button in Options/Settings/Tools.
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 7th January 2021 at 00:54. |
8th January 2021, 17:18 | #1310 | Link |
Registered User
Join Date: Dec 2020
Posts: 6
|
However, I was mistaken. The continuation style is not what I'm looking for at all. It does not find any already existing dots at the beginning of a sentence, it just creates them!
Where and how can you turn off the continuation style? |
8th January 2021, 23:18 | #1311 | Link | |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
@deusexe
Disabling this option, as @Nikse555 replied above, does not create new ellipses. It also does not remove ellipses from the original text. Enabling this option fixes the continuation style by adding or removing ellipses where the assumed logic requires it. Quote:
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 8th January 2021 at 23:40. |
|
9th January 2021, 11:41 | #1312 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
@Nikse555
The latest NEXT beta 525 creates an srt file with errors in the nOCR process: the word "Wjakim" should be split into "W jakim" (the whole sentence: "In what sense.") as in the NEXT beta 487 version below. Compare the srt files; the correct text is on the left. This is what the section of my pol_OCRFixReplaceList.xml file responsible for this correction looks like. Code:
<PartialWords>
  <!-- Will be used to check words not in dictionary. If new word(s) and longer than 4 chars and exists in spelling dictionary, it is (or they are) accepted -->
  <WordPart from="~~" to="I" />
  <!-- "f " will be two words -->
  <WordPart from="~~f" to="f " />
  <WordPart from="ą" to="ą " />
  <WordPart from="j" to=" j" />
  <WordPart from="W" to="W " />
  <WordPart from="w" to="w " />
</PartialWords>
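For readers unfamiliar with this section, here is a rough Python sketch of how such partial-word rules could be applied to a word that is not in the dictionary. It is only an illustration of the idea described in the XML comment, not Subtitle Edit's actual code: the function names, the tiny dictionary, and the matching order are all assumptions, and the "longer than 4 chars" condition from the comment is left out because its exact meaning is not clear from the file.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the rules quoted above (illustration only).
PARTIAL_WORDS_XML = """
<PartialWords>
  <WordPart from="j" to=" j" />
  <WordPart from="W" to="W " />
  <WordPart from="w" to="w " />
</PartialWords>
"""


def load_word_parts(xml_text):
    """Return the (from, to) pairs of every WordPart element, in file order."""
    root = ET.fromstring(xml_text)
    return [(e.get("from"), e.get("to")) for e in root.findall("WordPart")]


def fix_unknown_word(word, word_parts, dictionary):
    """Hypothetical helper: try each rule at each position where it matches.

    A candidate is accepted only if every resulting word is found in the
    spelling dictionary; otherwise the original word is returned unchanged.
    """
    if word.lower() in dictionary:
        return word  # known word, nothing to fix
    for src, dst in word_parts:
        start = word.find(src)
        while start != -1:
            candidate = word[:start] + dst + word[start + len(src):]
            parts = candidate.split()
            if parts and all(p.lower() in dictionary for p in parts):
                return " ".join(parts)
            start = word.find(src, start + 1)
    return word


# Toy dictionary, an assumption standing in for the Polish spelling dictionary.
dictionary = {"w", "jakim", "sensie"}
rules = load_word_parts(PARTIAL_WORDS_XML)
print(fix_unknown_word("Wjakim", rules, dictionary))
```

With the toy dictionary above, "Wjakim" is split by the `j` rule because both "W" and "jakim" are known words, while a word that no rule can turn into known words is left untouched.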
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 10th January 2021 at 16:04. |
10th January 2021, 15:54 | #1313 | Link |
Registered User
Join Date: Feb 2004
Location: Mars
Posts: 428
|
@Janusz: Tried to fix the "Wjakim" issue in the latest beta: https://github.com/SubtitleEdit/subt...leEditBeta.zip
|
10th January 2021, 16:04 | #1314 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
Thanks, @Nikse555.
I just downloaded the NEXT beta 566 and it looks like it's fine and working as it should again.
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 10th January 2021 at 17:14. |
11th January 2021, 20:04 | #1315 | Link |
Registered User
Join Date: Mar 2009
Location: Germany
Posts: 5,769
|
Well,
I used the latest (well, at the time) version to convert from SRT (Unicode) to BD SUP, and the result was very acceptable. The only issue is a noticeable gap (blank space) between any italic or bold characters and the rest of the subtitle. Otherwise, splendid job. Would it be possible to add another checkbox near "forced" that allows format changes for the ticked subtitles (a bigger font for the movie title, a different placement, etc.)? I mean that all subtitles would share the common format except for the ticked ones, which could have their own. Just a suggestion...
__________________
Born in the USB (not USA) |
12th January 2021, 16:10 | #1316 | Link |
Registered User
Join Date: Dec 2013
Posts: 631
|
Hi Nikse
SUPs used on UHD-BDs with HDR video content are authored darker. When home-created SUPs are muxed together with HDR video, they often appear too bright during playback. Some UHD (BD) players have extra settings to compensate for this brightness, but many (if not most) don't. Without having to OCR and re-export, is there a way in SE to edit only the palette of a SUP file? |
12th January 2021, 21:37 | #1317 | Link | ||
Registered User
Join Date: Feb 2004
Location: Mars
Posts: 428
|
Quote:
Quote:
Last edited by Nikse555; 12th January 2021 at 21:42. |
||
12th January 2021, 21:40 | #1318 | Link | |
Registered User
Join Date: Feb 2004
Location: Mars
Posts: 428
|
Quote:
(File -> Import -> Blu-ray (.sup) subtitle file for edit... - and then "Tools" or "list view context menu with selected lines") |
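To illustrate what "editing only the palette" means in principle, here is a small Python sketch that darkens palette entries without touching the bitmaps. It is not how SE does it and it does not read or write .sup files; it assumes the palette has already been extracted as (Y, Cr, Cb, alpha) tuples with values 0-255, and it assumes limited-range video levels where Y = 16 is black. The function name and the factor are made up for the example.

```python
def darken_palette(palette, factor):
    """Scale the luma (Y) of each palette entry toward black.

    palette: list of (y, cr, cb, alpha) tuples, each value 0-255 (assumed layout).
    factor:  1.0 keeps the brightness, 0.0 maps everything to black level.
    Chroma (Cr, Cb) and alpha are left unchanged.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0.0 and 1.0")
    black = 16  # assumed black level for limited-range video
    darkened = []
    for y, cr, cb, alpha in palette:
        if y > black:
            y = black + round((y - black) * factor)
        darkened.append((y, cr, cb, alpha))
    return darkened


# Example: an opaque white entry and a fully transparent black entry.
original = [(235, 128, 128, 255), (16, 128, 128, 0)]
print(darken_palette(original, 0.6))
```

Only the brightness of the colours changes; timing, position and the subtitle images stay as they are, which is why no OCR or re-rendering would be needed for this kind of adjustment.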
|
13th January 2021, 06:05 | #1319 | Link |
Registered User
Join Date: Apr 2020
Location: Poland
Posts: 143
|
@Nikse555
Note: the case described here is general in nature and also occurs in earlier versions of the program. The animation shows how, during nOCR with the [Use color] option enabled, some images were split incorrectly, while others were not split although they should have been, because they contain two lines of text in different colors. With the ts source file I could get colored text, but that is impossible in this case, as only with Grayscale=on and Use color=off can I get error-free text for the whole file. The archive contains 4 sup files, saved with different option settings so the program's behavior can be traced, and the srt files obtained from them. "g" in the file name stands for [Grayscale] (0/1 = off/on) and "c" stands for [Use color] (0/1 = off/on). Perhaps this is an isolated case and should not be dealt with, but if it can be fixed... thank you in advance. Do you anticipate ever supporting a separate text color for the top and bottom lines in one image (treated as two combined images)?
__________________
Sorry for my mistakes - I'm using a translator. Last edited by Janusz; 13th January 2021 at 08:19. |