Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
7th November 2014, 08:52 | #61 | Link |
Registered User
Join Date: Nov 2010
Posts: 4
|
I have a problem decoding text subtitles with textST.
After decoding with textST:
Code:
1
00:10:59,338 --> 00:11:03,091

2
00:11:39,044 --> 00:11:42,381

3
00:11:48,303 --> 00:11:53,308
Hei.

4
00:11:53,308 --> 00:11:58,146
Mikset?
Code:
1
00:01:10,530 --> 00:01:14,290
(Kuorsausta.)

2
00:01:50,240 --> 00:01:53,580
(Puhelin värisee.)

3
00:01:59,500 --> 00:02:04,500
Hei.
<i>-Eve täällä, moi. "Mä en pääse tänään tulemaan.</i>

4
00:02:04,500 --> 00:02:09,340
Mikset?
<i>-Se masentaa mua.</i>
I attached the original 00034.m2ts and the .srt produced by textST. The text subtitles are in UTF-8. I need help with this. |
9th February 2015, 10:33 | #62 | Link | |
PgcEdit daemon
Join Date: Jul 2003
Posts: 7,469
|
Quote:
__________________
r0lZ PgcEdit homepage (hosted by VideoHelp) BD3D2MK3D A tool to convert 3D blu-rays to SBS, T&B or FS MKV |
|
9th February 2015, 12:51 | #63 | Link |
Registered User
Join Date: Jan 2006
Location: Italy
Posts: 260
|
With the necessary files available (calclib.dll, readSUP.exe and textST.exe), I created a very basic batch file, commandline.bat.
It extracts the text subtitles from an m2ts stream; just enter the name of the stream you are interested in, without the extension. If you are interested, the archive Text Tool stream.rar is here: http://www.mediafire.com/download/zl...eams+V.ENG.rar The rar archive must be extracted into the STREAM folder of the Blu-ray disc; then launch commandline.bat. Greetings. Last edited by DMD; 9th February 2015 at 13:01. |
9th February 2015, 13:04 | #64 | Link |
PgcEdit daemon
Join Date: Jul 2003
Posts: 7,469
|
Thanks.
Despite its name, it seems that the archive includes the Italian version of the shell script. Not a problem for me.
__________________
r0lZ PgcEdit homepage (hosted by VideoHelp) BD3D2MK3D A tool to convert 3D blu-rays to SBS, T&B or FS MKV |
9th February 2015, 13:48 | #65 | Link |
Registered User
Join Date: Jan 2006
Location: Italy
Posts: 260
|
The latest version of MakeMKV (1.9.1) is already compatible with this type of subtitle.
When you select a language, the text-format subtitles in that language are selected automatically as well. Since MakeMKV already handles the language selection, a tool that could extract the TextST subtitle directly would save us from searching for it among the many files in the STREAM folder of the Blu-ray. For this reason it would be better to improve the MKVExtractGUI2 tool. Last edited by DMD; 9th February 2015 at 13:51. |
20th January 2016, 19:20 | #66 | Link |
Registered User
Join Date: Oct 2002
Posts: 72
|
Analysing the TextST format
Hello everyone,
I stumbled over this post when I tried to convert my "The Mummy" Blu-ray into an MKV file using MakeMKV. Of course the TextST streams are not playable anywhere (VLC, ...), so I thought of converting them to SSA/ASS subtitles. Compared to plain SRT, I thought it would be possible to add some sort of formatting or even screen position, which may also be present in the original source data. Since I could not find any source code available, I started some reverse engineering on the format. Here is what I got so far: I used mkvextract to extract the TextST streams from the MKV file produced by MakeMKV. I used the --fullraw flag on all streams, so some of the header data in the sample may also be due to this. I also tried looking for a specification of the format on the Internet but could not find anything helpful. Your thoughts and input are highly appreciated. Kind regards, Simon
__________________
CU DVD Last edited by DVD; 20th January 2016 at 19:24. |
21st January 2016, 11:35 | #67 | Link |
Registered User
Join Date: Oct 2002
Posts: 72
|
Subtitle position
Hello guys,
I did some further analysis, this time with "The Mummy Returns". As it turns out, some of the subtitles are at the top of the screen (to not overlap with hard subtitle text on the movie-screen) and some are at the bottom. I marked the differences in blue. I think they could impact the position in some way:
__________________
CU DVD |
23rd January 2016, 00:33 | #68 | Link |
Spielberger
Join Date: Feb 2005
Posts: 838
|
Top and Bottom are defined in RegionStyles
In the example above, Top is RegionStyleID 01 and Bottom is RegionStyleID 00. PTS values are 5 bytes each, and so on.
I gathered some info about BD text subtitle streams; it should help.
Code:
Dialog Style
  segment type                        08 = 0x81
  length of Dialog Styles             16
  Player Own Style                    08 = 0x00 prohibit
                                           0x80 permit
  Number of Region Styles             08
  Number of User Styles               08

Region Info
  region_style_id                     08
  region_horizontal_position          16
  region_vertical_position            16
  region_width                        16
  region height                       16
  region_bg_color_palette_id          08
  reserved                            08

Textbox Info
  text_box_horizontal_position        16
  text_box_vertical_position          16
  text_box_width                      16
  text_box_height                     16
  text_flow                           08
  text_horizontal_alignment           08 = 0x01 horizontal writing right
                                           0x02 horizontal writing left
                                           0x03 vertical writing
  text_vertical_alignment             08 = 0x01 left
                                           0x02 center
                                           0x03 right
  line_space                          08
  font_id                             08
  font_style                          08 = 0x00 normal
                                           0x01 bold
                                           0x02 italic
                                           0x03 bold+italic
                                           0x04 outline
                                           0x05 outline+bold
                                           0x06 outline+italic
                                           0x07 outline+bold+italic
  font_size                           08 = 0x08 to 0x90
  font_palette_entry                  08
  outline_palette_entry               08
  outline_size                        08 = 0x01 thin
                                           0x02 medium
                                           0x03 thick
  .....next region_style_id......

Palette
  length                              16
  palette_entry_id                    08
  Y_value                             08
  Cr_value                            08
  Cb_value                            08
  T_value                             08
  ....
  ....
  (last palette_entries are always 254 -> FE)

Dialog Presentation Segment
  #_of_dialog_presentation_segments   16
  segment_type                        08 = 0x82
  reserved                            08
  segment_length                      08
  dialog_start_time                   40
  dialog_end_time                     40
  palette_update_flag                 01   if set
  reserved                            07   then
  .............Palette..........
  numbers_of_regions                  08
  continous_present_flag              01
  forced_flag                         01
  reserved                            06
  region_style_id                     08

Text Subtitle
  text_subtitle_length                16
  escape_code                         08 = always 0x1B
  data_type                           08 = 01 Text string start
                                           02 Change font set
                                           03 Change font style
                                           04 Change font size
                                           05 Change font color
                                           0A Line break
                                           0B End of inline style
  data_length                         08 = data_typ 01 -> length of text string
                                           data_typ 02 -> 0x01
                                           data_typ 03 -> 0x03
                                           data_typ 04 -> 0x01
                                           data_typ 05 -> 0x01
                                           data_typ 0A -> 0x00
                                           data_typ 0B -> 0x00
  ....
  ....
  ....
Last edited by bigotti5; 23rd January 2016 at 09:06. |
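Since the goal mentioned earlier in the thread is SSA/ASS output with screen position, the region fields listed above could be mapped to an ASS alignment override. This is only a minimal sketch: the function name, the 1080-line default frame height and the midpoint rule are assumptions made for illustration and are not part of the TextST format.

```python
def ass_alignment_tag(region_vertical_position: int,
                      region_height: int,
                      frame_height: int = 1080) -> str:
    """Return an ASS alignment override for a TextST region.

    Assumptions (not defined by the format): frame_height defaults to 1080,
    and a region whose vertical center lies in the upper half of the frame
    is treated as a "top" subtitle.
    """
    center = region_vertical_position + region_height / 2
    # In ASS, {\an8} is top-center and {\an2} is bottom-center.
    return r"{\an8}" if center < frame_height / 2 else r"{\an2}"


print(ass_alignment_tag(60, 200))    # region near the top of the frame
print(ass_alignment_tag(820, 200))   # region near the bottom of the frame
```

A converter could prepend the returned tag to the dialogue text of each event, which avoids having to define separate ASS styles per region.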
24th January 2016, 23:10 | #69 | Link |
Registered User
Join Date: Oct 2002
Posts: 72
|
Hello,
Thank you very much. This really helped. I am able to parse the header successfully now. I will post my Java code here once my parser/converter is done. I will try to support SSA/ASS output as well as SRT.
There is one thing I noticed with your code: it seems to be off by one byte at the beginning. I have:
Code:
Identifier:          1 Byte (always 0x81) [position confirmed]
Section length:      2 Byte [position confirmed]
(unknown):           1 Byte
Player Own Style:    1 Byte [???]
# of Region Styles:  1 Byte [position confirmed]
# of User Styles:    1 Byte [???]
...
I will continue work and post updates here. Thanks for your support.
Kind regards, DVD
__________________
CU DVD |
25th January 2016, 00:49 | #70 | Link |
Spielberger
Join Date: Feb 2005
Posts: 838
|
Yes, mistake in my records - should be
Code:
Dialog Style
  segment type              08 = 0x81
  length of Dialog Styles   16
  Player Own Style          01 = 0x00 prohibit
                                 0x80 permit
  reserved                  15
  Number of Region Styles   08
  Number of User Styles     08
  ...
  ... |
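The corrected header layout above could be read like this. It is a rough sketch, not the parser discussed in the thread: the function name and the sample bytes are made up, and big-endian byte order for the 16-bit fields is an assumption.

```python
import struct


def parse_dialog_style_header(data: bytes) -> dict:
    """Parse the first 7 bytes of a Dialog Style segment.

    Layout as listed above: segment type (8 bits), length (16 bits),
    Player Own Style flag (1 bit) + reserved (15 bits),
    number of region styles (8 bits), number of user styles (8 bits).
    Assumption: multi-byte fields are big-endian.
    """
    seg_type, length, flags, n_region, n_user = struct.unpack_from(">BHHBB", data, 0)
    if seg_type != 0x81:
        raise ValueError(f"not a Dialog Style segment: 0x{seg_type:02X}")
    return {
        "length": length,
        # The flag is the top bit of the 16-bit field (0x80 in its first byte).
        "player_own_style_permitted": bool(flags & 0x8000),
        "region_styles": n_region,
        "user_styles": n_user,
    }


# Made-up sample bytes, for illustration only.
sample = bytes([0x81, 0x01, 0x2C, 0x80, 0x00, 0x02, 0x01])
print(parse_dialog_style_header(sample))
```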
25th January 2016, 03:27 | #71 | Link |
Spielberger
Join Date: Feb 2005
Posts: 838
|
I did some corrections to my records
- user styles missing
- wrong text flow entries
Code:
Dialog Style
  segment type                        08 = 0x81 (Dialog Style)
  length of Dialog Styles             16
  Player Own Style                    01 = 0x00 prohibit
                                           0x80 permit
  reserved                            15
  Number of Region Styles             08
  Number of User Styles               08

Region Info
  region_style_id                     08
  region_horizontal_position          16
  region_vertical_position            16
  region_width                        16
  region height                       16
  region_bg_color_palette_id          08
  reserved                            08

Textbox Info
  text_box_horizontal_position        16
  text_box_vertical_position          16
  text_box_width                      16
  text_box_height                     16
  text_flow                           08 = 0x01 horizontal writing right
                                           0x02 horizontal writing left
                                           0x03 vertical writing
  text_horizontal_alignment           08 = 0x01 left
                                           0x02 center
                                           0x03 right
  text_vertical_alignment             08 = 0x01 top
                                           0x02 middle
                                           0x03 bottom
  line_space                          08
  font_id                             08
  font_style                          08 = 0x00 normal
                                           0x01 bold
                                           0x02 italic
                                           0x03 bold+italic
                                           0x04 outline
                                           0x05 outline+bold
                                           0x06 outline+italic
                                           0x07 outline+bold+italic
  font_size                           08
  font_palette_entry                  08
  outline_palette_entry               08
  outline_size                        08 = 0x01 thin
                                           0x02 medium
                                           0x03 thick

User changeable style set (if Number of User Styles != 0)
  user_style_id                       08
  reg_horiz_pos_direction             01 = 0 right, 1 left
  reg_horiz_pos_delta                 15
  reg_verti_pos_direction             01 = 0 down, 1 up
  reg_verti_pos_delta                 15
  font_size_inc_dec                   01 = 0 increase, 1 decrease
  font_size_delta                     07
  txtbox_hor_pos_dir                  01 = 0 right, 1 left
  txtbox_hor_pos_delta                15
  txtbox_vert_pos_dir                 01 = 0 down, 1 up
  txtbox_vert_pos_delta               15
  txtbox_width_inc_dec                01 = 0 increase, 1 decrease
  txtbox_width_delta                  15
  txtbox_height_inc_dec               01 = 0 increase, 1 decrease
  txtbox_height_delta                 15
  line_space_inc_dec                  01 = 0 increase, 1 decrease
  line_space_delta                    07
  .....next region_style_id......

Palette
  length                              16
  palette_entry_id                    08
  Y_value                             08
  Cr_value                            08
  Cb_value                            08
  T_value                             08
  ....
  ....
  (last palette_entries are always 254 -> FE)

Dialog Presentation Segment
  #_of_dialog_presentation_segments   16
  segment_type                        08 = 0x82
  reserved                            08
  segment_length                      08
  dialog_start_PTS                    40
  dialog_end_PTS                      40
  palette_update_flag                 01   if set
  reserved                            07   then
  .............Palette..........
  numbers_of_regions                  08
  continous_present_flag              01
  forced_flag                         01
  reserved                            06
  region_style_id                     08

Text Subtitle
  text_subtitle_length                16
  escape_code                         08 = always 0x1B
  data_type                           08 = 01 Text string start
                                           02 Change font set
                                           03 Change font style
                                           04 Change font size
                                           05 Change font color
                                           0A Line break
                                           0B End of inline style
  data_length                         08 = data_typ 01 -> length of text string
                                           data_typ 02 -> 0x01
                                           data_typ 03 -> 0x03
                                           data_typ 04 -> 0x01
                                           data_typ 05 -> 0x01
                                           data_typ 0A -> 0x00
                                           data_typ 0B -> 0x00
Last edited by bigotti5; 25th January 2016 at 14:45. Reason: typos |
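The Text Subtitle part of the listing above (escape code 0x1B, then data_type, data_length and the data itself) lends itself to a small tokenizer. The following is a hedged sketch: the function names and the sample payload are invented for illustration, and decoding the text as UTF-8 is an assumption rather than something the listing states.

```python
def parse_region_text(payload: bytes) -> list:
    """Split a region's text payload into (data_type, data) tuples.

    Per the listing above, each item is: 0x1B, one data_type byte,
    one data_length byte, then data_length bytes of data.
    """
    items = []
    pos = 0
    while pos < len(payload):
        if payload[pos] != 0x1B:
            raise ValueError(f"expected escape code 0x1B at offset {pos}")
        if pos + 3 > len(payload):
            raise ValueError("truncated item header")
        data_type = payload[pos + 1]
        data_length = payload[pos + 2]
        start, end = pos + 3, pos + 3 + data_length
        if end > len(payload):
            raise ValueError("truncated item data")
        items.append((data_type, payload[start:end]))
        pos = end
    return items


def to_plain_text(items: list, encoding: str = "utf-8") -> str:
    """Keep text strings (0x01) and line breaks (0x0A); drop style changes.

    Assumption: the text strings are encoded as UTF-8.
    """
    parts = []
    for data_type, data in items:
        if data_type == 0x01:
            parts.append(data.decode(encoding))
        elif data_type == 0x0A:
            parts.append("\n")
    return "".join(parts)


# Made-up payload: text "Hei.", a line break, then text "Mikset?".
payload = (bytes([0x1B, 0x01, 0x04]) + b"Hei."
           + bytes([0x1B, 0x0A, 0x00])
           + bytes([0x1B, 0x01, 0x07]) + b"Mikset?")
print(to_plain_text(parse_region_text(payload)))
```

For SRT output the style items (types 02 to 05 and 0B) can simply be dropped as shown; an SSA/ASS exporter would translate them into override tags instead.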
25th January 2016, 19:26 | #72 | Link |
Registered User
Join Date: Oct 2002
Posts: 72
|
Parser/Converter for TextST Subtitles
Hello everyone,
The first version of my parser/converter written in Java is complete. Currently it supports parsing TextST streams from the raw data and exporting them to SRT files. Here is how it works:
1.) Use MakeMKV to extract the video, audio and subtitle information into an MKV file.
2.) Use mkvextract (part of mkvtoolnix) with the MKV file created in 1.) and the following command line to demux the raw TextST data for all TextST stream IDs:
Code:
mkvextract tracks <MKV Source File> --fullraw <First ID>:<Filename ID 1>.<Extension> --fullraw <Second ID>:<Filename ID 2>.<Extension> ... --fullraw <Last ID>:<Filename ID n>.<Extension>
3.) Run the converter on the extracted files:
Code:
java -jar SubConverter.jar -d=<Source Folder> -e=<Extension> -o=SRT -v
-o=SRT will provide SRT files as output
-o=SSA/ASS will SOON provide SubStation Alpha files as output (work in progress)
-v will output the parsed data on the console
Please note that the code does not come with any form of warranty! I tried this with the German and English TextST subtitles on The Mummy and The Mummy Returns and it has worked quite well so far. Feedback is highly appreciated! Special thanks to bigotti5 for providing input on the data format!
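To illustrate the SRT export step, here is a short sketch of how parsed cues could be written out in SRT form. It is not the code of the converter described above; the function names and the (start_ms, end_ms, text) cue tuple are hypothetical.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues) -> str:
    """Render (start_ms, end_ms, text) tuples as SRT, numbering cues from 1."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(blocks)


print(to_srt([(605_000, 608_000, "Hei."), (608_000, 612_500, "Mikset?")]))
```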
__________________
CU DVD Last edited by DVD; 1st February 2016 at 18:32. Reason: Minor additions |
25th January 2016, 19:47 | #74 | Link |
Registered User
Join Date: Oct 2002
Posts: 72
|
I will see what I can do about the attachment.
BTW: There still seem to be some glitches in your format specification. I worked around them based on my findings. Not sure if it works for all files eventually but for mine it did ...
__________________
CU DVD |
26th January 2016, 13:51 | #78 | Link | |||
Registered User
Join Date: Oct 2002
Posts: 72
|
Sure (as far as I can tell):
Quote:
Quote:
Quote:
Code:
length              16
palette_entry_id    08
Y_value             08
Cr_value            08
Cb_value            08
T_value             08
.... next palette entry ...
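The palette layout above (a 16-bit length followed by 5-byte entries) could be parsed along these lines. This is a sketch under assumptions: the length is taken to count the bytes of the entries that follow, the 16-bit field is read big-endian, and the function name is made up. The sample values mirror the parser log posted later in the thread.

```python
import struct


def parse_palette(data: bytes, offset: int = 0):
    """Parse a palette block; return (entries, offset just past the block).

    Assumptions: 'length' is a big-endian 16-bit count of the entry bytes
    that follow, and every entry is 5 bytes (id, Y, Cr, Cb, T).
    """
    (length,) = struct.unpack_from(">H", data, offset)
    if length % 5 != 0:
        raise ValueError(f"palette length {length} is not a multiple of 5")
    start = offset + 2
    if start + length > len(data):
        raise ValueError("palette block is truncated")
    entries = []
    for pos in range(start, start + length, 5):
        entry_id, y, cr, cb, t = struct.unpack_from("5B", data, pos)
        entries.append({"id": entry_id, "Y": y, "Cr": cr, "Cb": cb, "T": t})
    return entries, start + length


sample = struct.pack(">H", 20) + bytes([
    0, 32, 118, 240, 0,
    1, 235, 128, 128, 255,
    2, 16, 128, 128, 255,
    254, 16, 128, 128, 0,
])
entries, next_offset = parse_palette(sample)
print(len(entries), entries[-1]["id"], next_offset)
```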
__________________
CU DVD |
|||
26th January 2016, 19:04 | #79 | Link | ||
Spielberger
Join Date: Feb 2005
Posts: 838
|
Last palette
The last palette entry is of course 5 bytes as well; my comment was meant to say that palette entry 254 is always present as the last entry. If there are only, e.g., 3 colors, the IDs are 01, 02 and 254. Quote:
Quote:
Here is a sample log from a commercial TextST parser (DVDLogic BDReauthor pro):
Code:
...
...
Textbox height delta: 0
Line space inc dec: 0 - Increase
Line space delta: 0
Palette
  Length: 20
  Entry ID: 0, Y: 32, Cr: 118, Cb: 240, T: 0
  Entry ID: 1, Y: 235, Cr: 128, Cb: 128, T: 255
  Entry ID: 2, Y: 16, Cr: 128, Cb: 128, T: 255
  Entry ID: 254, Y: 16, Cr: 128, Cb: 128, T: 0
Number of Dialog Presentation Segments: 2
Dialog Presentation Segment 1
  Segment descriptor
    Segment type: 0x82 - Dialog Presentation Segment
    Segment length: 63
  Dialog start PTS: 54450000 - 00:10:05.000
  Dialog end PTS: 54720000 - 00:10:08.000
  Palette update flag: 0
  Number of Regions: 1
  Dialog region 1
    Continuous present flag: 0
    Forced on flag: 0
    Region style id ref: 0
    Subtitle data length: 0
...
... |
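The PTS values in this log are consistent with a 90 kHz clock: 54450000 / 90000 = 605 seconds, i.e. 00:10:05.000. A small sketch of that conversion, including reading the 5-byte field, is given below. The function names are invented, and the bit packing (big-endian, PTS in the low 33 bits, upper bits ignored) is an assumption that should be checked against real streams.

```python
PTS_CLOCK_HZ = 90_000  # matches the log: 54450000 ticks -> 605 s


def read_pts(field: bytes) -> int:
    """Decode a 5-byte (40-bit) PTS field.

    Assumption: big-endian, with the PTS in the low 33 bits and the
    remaining upper bits reserved.
    """
    if len(field) != 5:
        raise ValueError("a PTS field must be exactly 5 bytes")
    return int.from_bytes(field, "big") & ((1 << 33) - 1)


def pts_to_timecode(pts: int) -> str:
    """Convert 90 kHz ticks to HH:MM:SS.mmm, as printed in the log above."""
    total_ms = pts * 1000 // PTS_CLOCK_HZ
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


start = read_pts((54450000).to_bytes(5, "big"))
print(start, pts_to_timecode(start))
print(pts_to_timecode(54720000))
```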
||
|
|