Old 7th November 2014, 08:52   #61  |  Link
HPotter
Registered User
 
Join Date: Nov 2010
Posts: 4
I have a problem decoding text subtitles with textST.

After decoding with textST:
Code:
1
00:10:59,338 --> 00:11:03,091
 

2
00:11:39,044 --> 00:11:42,381
 

3
00:11:48,303 --> 00:11:53,308
Hei. 

4
00:11:53,308 --> 00:11:58,146
Mikset?
It should be:
Code:
1
00:01:10,530 --> 00:01:14,290
(Kuorsausta.)

2
00:01:50,240 --> 00:01:53,580
(Puhelin värisee.)

3
00:01:59,500 --> 00:02:04,500
Hei. <i>-Eve täällä, moi.
"Mä en pääse tänään tulemaan.</i>

4
00:02:04,500 --> 00:02:09,340
Mikset?
<i>-Se masentaa mua.</i>
The output has many spaces and blank lines, and the timings are wrong.
I attached the original 00034.m2ts and the .srt produced by textST.
The text subtitles are in UTF-8.
I need help with this.
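
Comparing the start times in the two listings suggests that the textST output is shifted by a nearly constant amount rather than being randomly wrong. The following Python sketch does the arithmetic; `srt_to_ms` is a helper name made up for this illustration, and the cause of the shift (for example a stream start offset that was not subtracted) is only a guess.

```python
def srt_to_ms(stamp: str) -> int:
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to milliseconds."""
    hms, ms = stamp.split(",")
    h, m, s = (int(part) for part in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(ms)


# Start times copied from the two listings: (textST output, expected).
pairs = [
    ("00:10:59,338", "00:01:10,530"),
    ("00:11:39,044", "00:01:50,240"),
    ("00:11:48,303", "00:01:59,500"),
    ("00:11:53,308", "00:02:04,500"),
]

# Difference in milliseconds between decoded and expected start times.
offsets = [srt_to_ms(got) - srt_to_ms(want) for got, want in pairs]
print(offsets)
```

The four differences agree to within a few milliseconds at roughly 588.8 seconds, so shifting every cue by that amount should bring the timings close to the expected ones. The missing text lines are a separate problem.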
Attached Files
File Type: zip 00034.m2ts.zip (51.9 KB, 27 views)
Old 9th February 2015, 10:33   #62  |  Link
r0lZ
PgcEdit daemon
 
Join Date: Jul 2003
Posts: 7,469
Quote:
Originally Posted by deank View Post
I wrote a small tool to process .pes files extracted with tsRemux or directly the original m2ts files with text subtitles...
[...]
textST (CLI) (630KB) self extractable 7z archive (contains calclib.dll and textST.exe), or if you're using multiAVCHD then you can download just the executable textST.exe (80kb) and put it in your multiAVCHD folder.
[...]
The "textST (CLI)" link is dead. (The link with textST.exe alone is still valid.) Can you upload the 7z archive again? Thanks!
__________________
r0lZ
PgcEdit homepage (hosted by VideoHelp)
BD3D2MK3D A tool to convert 3D blu-rays to SBS, T&B or FS MKV
Old 9th February 2015, 12:51   #63  |  Link
DMD
Registered User
 
Join Date: Jan 2006
Location: Italy
Posts: 260
Since the necessary files (calclib.dll, readSUP.exe and textST.exe) are available, I created a very simple batch file, commandline.bat.
It extracts the text file from the m2ts stream; just enter the name of the stream you are interested in, without the extension.







If you are interested, the archive Text Tool stream.rar is here:
http://www.mediafire.com/download/zl...eams+V.ENG.rar

Extract the RAR archive into the STREAM folder of the Blu-ray disc, then launch commandline.bat.

greetings

Last edited by DMD; 9th February 2015 at 13:01.
Old 9th February 2015, 13:04   #64  |  Link
r0lZ
PgcEdit daemon
 
Join Date: Jul 2003
Posts: 7,469
Thanks.
Despite its name, it seems that the archive includes the Italian version of the shell script. Not a problem for me.
Old 9th February 2015, 13:48   #65  |  Link
DMD
Registered User
 
Join Date: Jan 2006
Location: Italy
Posts: 260
The latest version of MakeMKV (1.9.1) is already compatible with this type of subtitle.
So if you select a language, the text-format subtitles of the same language are selected automatically as well.



Since MakeMKV already allows selection by language, if there were a tool to extract the TextST subtitle directly, we would not have to look for it among the many files in the STREAM folder of the Blu-ray.

For this reason it would be good to improve the MKVExtractGUI2 tool accordingly.

Last edited by DMD; 9th February 2015 at 13:51.
Old 20th January 2016, 19:20   #66  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Analysing the TextST format

Hello everyone,

I stumbled over this post when I tried to convert my "The Mummy" Blu-ray into an MKV file using MakeMKV. Of course the TextST files are not playable anywhere (VLC, ...), so I thought of converting them to SSA/ASS subtitles. Beyond plain SRT, I thought it would be possible to add some sort of formatting or even screen position, which may also be present in the original source data. Since I could not find any source code available, I started some reverse engineering on the format. Here is what I have got so far:



I used mkvextract to extract the TextST streams from the MakeMKV output MKV file. I used the --fullraw flag on all streams, so maybe some of the header data in the sample is also due to this.

I also tried looking for a specification of the format on the Internet but could not find anything helpful.

Your thoughts and input are highly appreciated.

Kind regards,
Simon
__________________

CU DVD

Last edited by DVD; 20th January 2016 at 19:24.
Old 21st January 2016, 11:35   #67  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Subtitle position

Hello guys,

I did some further analysis, this time with "The Mummy Returns". As it turns out, some of the subtitles are at the top of the screen (so as not to overlap with hardcoded subtitle text in the picture) and some are at the bottom. I marked the differences in blue. I think they could affect the position in some way:

Old 23rd January 2016, 00:33   #68  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
Top and bottom are defined in the RegionStyles.
In the example above:
Top is RegionStyleID 01
Bottom is RegionStyleID 00
PTS values are 5 bytes each
and so on.

I gathered some info about BD text subtitle streams, which should help:

Code:
Dialog Style
 segment type				08	= 0x81 
 length of Dialog Styles		16
 Player Own Style			08	= 0x00 prohibit
						  0x80 permit
 Number of Region Styles		08
 Number of User Styles			08
Region Info
  region_style_id			08				
  region_horizontal_position 		16
  region_vertical_position		16
  region_width				16
  region_height				16
  region_bg_color_palette_id		08
	reserved			08
Textbox Info
  text_box_horizontal_position		16
  text_box_vertical_position		16	
  text_box_width			16
  text_box_height			16
  text_flow				08
  text_horizontal_alignment		08	= 0x01 horizontal writing right
						  0x02 horizontal writing left
						  0x03 vertical writing
  text_vertical_alignment		08	= 0x01 left
						  0x02 center
						  0x03 right
  line_space				08
  font_id				08
  font_style				08	= 0x00 normal
						  0x01 bold
						  0x02 italic
						  0x03 bold+italic
						  0x04 outline
						  0x05 outline+bold
						  0x06 outline+italic
						  0x07 outline+bold+italic
  font_size				08      = 0x08 to 0x90
  font_palette_entry			08
  outline_palette_entry			08
  outline_size				08	= 0x01 thin
						  0x02 medium
						  0x03 thick
		.....next region_style_id......			

Palette
  length				16
    palette_entry_id			08
    Y_value				08
    Cr_value				08
    Cb_value				08
    T_value				08
    ....
    ....
    (last palette_entries are always 254 -> FE)

Dialog Presentation Segment
 #_of_dialog_presentation_segments	16
 segment_type				08	= 0x82 
 reserved				08
 segment_length				08
 dialog_start_time		        40
 dialog_end_time		        40
 palette_update_flag			01	if set 
 reserved				07	then
			.............Palette..........
 numbers_of_regions			08
 continuous_present_flag		01
 forced_flag				01
 reserved				06
 region_style_id			08

Text Subtitle
 text_subtitle_length			16
 escape_code				08	= always 0x1B
 data_type				08	= 01 Text string start 	
						  02 Change font set
						  03 Change font style
						  04 Change font size
						  05 Change font color
						  0A Line break
						  0B End of inline style
 data_length				08	= data_typ 01 -> length of text string
						  data_typ 02 -> 0x01
						  data_typ 03 -> 0x03
						  data_typ 04 -> 0x01 
						  data_typ 05 -> 0x01
						  data_typ 0A -> 0x00
						  data_typ 0B -> 0x00
....
....
....
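
The "Text Subtitle" part of the layout above can be sketched as a small Python parser. This is only an illustration under stated assumptions: it expects the bytes that follow the 16-bit text_subtitle_length, it assumes the text is UTF-8 (as reported earlier in the thread for one disc), and the function names `parse_text_data` and `to_plain_text` are made up here. The sample bytes are hand-built, not taken from a real stream.

```python
ESCAPE = 0x1B       # every item starts with this escape code
TEXT = 0x01         # data_type: text string
LINE_BREAK = 0x0A   # data_type: line break


def parse_text_data(data: bytes):
    """Split escape-coded subtitle data into (data_type, payload) pairs."""
    items = []
    pos = 0
    # Each item is: escape code (1 byte), data_type (1), data_length (1), payload.
    while pos + 3 <= len(data):
        if data[pos] != ESCAPE:
            raise ValueError(f"expected escape code 0x1B at offset {pos}")
        data_type = data[pos + 1]
        length = data[pos + 2]
        payload = data[pos + 3 : pos + 3 + length]
        if len(payload) != length:
            raise ValueError(f"truncated payload at offset {pos}")
        items.append((data_type, payload))
        pos += 3 + length
    return items


def to_plain_text(items, encoding="utf-8"):
    """Keep text and line breaks; style-change items are ignored."""
    parts = []
    for data_type, payload in items:
        if data_type == TEXT:
            parts.append(payload.decode(encoding))
        elif data_type == LINE_BREAK:
            parts.append("\n")
    return "".join(parts)


# Hand-built sample: "Hei.", a line break, then "Mikset?".
sample = b"\x1b\x01\x04Hei.\x1b\x0a\x00\x1b\x01\x07Mikset?"
items = parse_text_data(sample)
text = to_plain_text(items)
```

Style items (types 02 to 05 and 0B) are returned by `parse_text_data` but dropped by `to_plain_text`; a converter that wants italics or colors would have to interpret their payloads.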

Last edited by bigotti5; 23rd January 2016 at 09:06.
Old 24th January 2016, 23:10   #69  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Hello,

Thank you very much. This really helped. I am able to parse the header successfully now. I will post my Java code here once my parser/converter is done. I will try to support SSA/ASS output as well as SRT.

There is one thing I noticed with your code: It seems to be off by one byte at the beginning. I have:

Code:
Identifier:         1 Byte (always 0x81) [position confirmed]
Section length:     2 Byte [position confirmed]
(unknown):          1 Byte
Player Own Style:   1 Byte [???]
# of Region Styles: 1 Byte [position confirmed]
# of User Styles:   1 Byte [???]
...
If I align it like this, it parses nicely (width, height, color all make sense). If I align it like in your code (skip the (unknown) Byte marked in red), all other information is rubbish (width = 65000, ...)

I will continue work and post updates here.

Thanks for your support.



Kind regards,
DVD
Old 25th January 2016, 00:49   #70  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
Yes, that was a mistake in my records. It should be:

Code:
Dialog Style
 segment type				08	= 0x81 
 length of Dialog Styles		16
 Player Own Style			01	= 0x00 prohibit
						  0x80 permit
 reserved                               15
 Number of Region Styles		08
 Number of User Styles			08
...
...
thx
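
For anyone following along, the corrected header layout can be read like this in Python. It is a minimal sketch of the seven bytes listed above, nothing more: the function name `parse_dialog_style_header` and the dictionary keys are made up for this example, and the sample bytes are invented.

```python
import struct


def parse_dialog_style_header(data: bytes) -> dict:
    """Read the first 7 bytes of a Dialog Style segment (big-endian)."""
    # B = segment type, H = length, H = 1-bit flag + 15 reserved bits,
    # B = number of region styles, B = number of user styles.
    seg_type, length, flags, n_region, n_user = struct.unpack_from(">BHHBB", data, 0)
    if seg_type != 0x81:
        raise ValueError(f"not a Dialog Style segment: 0x{seg_type:02X}")
    return {
        "segment_length": length,
        # The flag is the top bit of the 16-bit field (0x80 in its first byte).
        "player_own_style": bool(flags & 0x8000),
        "region_style_count": n_region,
        "user_style_count": n_user,
    }


# Invented sample: length 64, player style permitted, 2 region styles, 0 user styles.
header = parse_dialog_style_header(bytes([0x81, 0x00, 0x40, 0x80, 0x00, 0x02, 0x00]))
```

Reading the flag and its reserved bits as one 16-bit value is what keeps the two counts aligned; skipping only one byte there is exactly the off-by-one described in the previous post.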
Old 25th January 2016, 03:27   #71  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
I made some corrections to my records:
- the user styles were missing
- the text flow entries were wrong
Code:
Dialog Style
 segment type				08	= 0x81 (Dialog Style)
 length of Dialog Styles		16
 Player Own Style			01	= 0x00 prohibit
						  0x80 permit
 reserved				15
 Number of Region Styles		08
 Number of User Styles			08
Region Info
  region_style_id			08				
  region_horizontal_position 		16
  region_vertical_position		16
  region_width				16
  region_height				16
  region_bg_color_palette_id		08
  reserved			        08
Textbox Info
  text_box_horizontal_position		16
  text_box_vertical_position		16	
  text_box_width			16
  text_box_height			16
  text_flow				08	= 0x01 horizontal writing right
						  0x02 horizontal writing left
						  0x03 vertical writing 
  text_horizontal_alignment		08	= 0x01 left
						  0x02 center
						  0x03 right
  text_vertical_alignment		08	= 0x01 top
						  0x02 middle
						  0x03 bottom
  line_space				08
  font_id				08
  font_style				08	= 0x00 normal
						  0x01 bold
						  0x02 italic
						  0x03 bold+italic
						  0x04 outline
						  0x05 outline+bold
						  0x06 outline+italic
						  0x07 outline+bold+italic
  font_size				08
  font_palette_entry			08
  outline_palette_entry			08
  outline_size				08	= 0x01 thin
						  0x02 medium
						  0x03 thick
     User changeable style set			  if Number of User Styles != 0
	user_style_id			08
	reg_horiz_pos_direction		01	= 0 right
						  1 left
	reg_horiz_pos_delta		15
	reg_verti_pos_direction		01	= 0 down
						  1 up
	reg_verti_pos_delta		15
	font_size_inc_dec		01	= 0 increase
						  1 decrease
	font_size_delta			07
	txtbox_hor_pos_dir		01	= 0 right
						  1 left
	txtbox_hor_pos_delta		15
	txtbox_vert_pos_dir		01	= 0 down
						  1 up
	txtbox_vert_pos_delta		15
	txtbox_width_inc_dec		01	= 0 increase
						  1 decrease
	txtbox_width_delta		15
	txtbox_height_inc_dec		01	= 0 increase
						  1 decrease
	txtbox_height_delta		15
	line_space_inc_dec		01	= 0 increase
						  1 decrease
	line_space_delta		07

		.....next region_style_id......

Palette
  length				16
    palette_entry_id			08
    Y_value				08
    Cr_value				08
    Cb_value				08
    T_value				08
    ....
    ....
    (last palette_entries are always 254 -> FE)

Dialog Presentation Segment
 #_of_dialog_presentation_segments	16
 segment_type				08	= 0x82 
 reserved				08
 segment_length				08
 dialog_start_PTS			40
 dialog_end_PTS 			40
 palette_update_flag			01	if set 
 reserved				07	then
			.............Palette..........
 numbers_of_regions			08
 continuous_present_flag		01
 forced_flag				01
 reserved				06
 region_style_id			08

Text Subtitle
 text_subtitle_length			16
 escape_code				08	= always 0x1B
 data_type				08	= 01 Text string start 	
						  02 Change font set
						  03 Change font style
						  04 Change font size
						  05 Change font color
						  0A Line break
						  0B End of inline style
 data_length				08	= data_typ 01 -> length of text string
						  data_typ 02 -> 0x01
						  data_typ 03 -> 0x03
						  data_typ 04 -> 0x01
						  data_typ 05 -> 0x01
						  data_typ 0A -> 0x00
						  data_typ 0B -> 0x00
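
As a rough illustration of the table above, the fixed part of one region style (Region Info plus Textbox Info) adds up to 29 bytes and can be unpacked in one go. This Python sketch covers only that fixed part and not the user changeable style sets; the names `REGION_STYLE`, `FIELDS`, `parse_region_style` and `describe_font_style` are made up for the example, and the sample values are invented.

```python
import struct

# id (8), 4 x 16-bit region rect, bg palette id (8), reserved (8),
# 4 x 16-bit text box rect, then ten 8-bit fields.
REGION_STYLE = struct.Struct(">B4HBx4H10B")

FIELDS = (
    "region_style_id",
    "region_horizontal_position", "region_vertical_position",
    "region_width", "region_height",
    "region_bg_color_palette_id",
    "text_box_horizontal_position", "text_box_vertical_position",
    "text_box_width", "text_box_height",
    "text_flow", "text_horizontal_alignment", "text_vertical_alignment",
    "line_space", "font_id", "font_style", "font_size",
    "font_palette_entry", "outline_palette_entry", "outline_size",
)


def parse_region_style(data: bytes, offset: int = 0) -> dict:
    """Unpack the 29 fixed bytes of one region style into a dict."""
    return dict(zip(FIELDS, REGION_STYLE.unpack_from(data, offset)))


def describe_font_style(value: int) -> list:
    """Decode font_style as bit flags: 0x01 bold, 0x02 italic, 0x04 outline."""
    names = [name for bit, name in ((1, "bold"), (2, "italic"), (4, "outline")) if value & bit]
    return names or ["normal"]


# Invented sample values, packed and parsed again as a round trip.
sample = REGION_STYLE.pack(1, 0, 0, 1920, 1080, 254,
                           100, 900, 1720, 120,
                           1, 2, 3, 0, 0, 2, 40, 1, 2, 2)
style = parse_region_style(sample)
```

Treating font_style as three bit flags reproduces all eight combinations listed in the table (for example 0x07 gives bold, italic and outline).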

Last edited by bigotti5; 25th January 2016 at 14:45. Reason: typos
Old 25th January 2016, 19:26   #72  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Parser/Converter for TextST Subtitles

Hello everyone,

The first version of my parser/converter written in Java is complete. Currently it supports parsing of TextST streams based on the raw data and export to SRT files.

Here is how it works:

1.) Use MakeMKV to extract the video, audio and subtitle information into an MKV file.
2.) Use mkvextract (part of mkvtoolnix) with the MKV file created in 1.) and the following commandline to demux the raw TextST data for all TextST stream IDs:

Code:
mkvextract tracks <MKV Source File> --fullraw <First ID>:<Filename ID 1>.<Extension> --fullraw <Second ID>:<Filename ID 2>.<Extension> ... --fullraw <Last ID>:<Filename ID n>.<Extension>
3.) Compile and use my parser (source code attached) with the following command line:

Code:
java -jar SubConverter.jar -d=<Source Folder> -e=<Extension> -o=SRT -v
This will parse/convert all files in the folder "<Source Folder>" that have the file extension "<Extension>".
-o=SRT produces SRT files as output
-o=SSA/ASS will SOON produce SubStation Alpha files as output (work in progress)
-v prints the parsed data to the console
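
If a disc has many TextST tracks, the mkvextract call from step 2 can be assembled with a few lines of Python instead of typing it by hand. This is just a sketch: the function name `build_mkvextract_args`, the file name `movie.mkv`, the track IDs and the `.textst` extension are placeholders, and the argument order simply follows the command line quoted in step 2.

```python
def build_mkvextract_args(source: str, tracks) -> list:
    """Build the argument list for demuxing raw tracks with --fullraw.

    `tracks` is a sequence of (track_id, output_filename) pairs.
    """
    args = ["mkvextract", "tracks", source]
    for track_id, out_name in tracks:
        # One --fullraw option per track, as in the command shown in step 2.
        args += ["--fullraw", f"{track_id}:{out_name}"]
    return args


# Placeholder file name and track IDs, for illustration only.
args = build_mkvextract_args("movie.mkv", [(3, "track3.textst"), (4, "track4.textst")])
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` to actually run the extraction.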

Please note that the code does not come with any form of warranty! I tried this with the German and English TextST subtitles on the Mummy and the Mummy Returns and it worked quite well so far.

Feedback highly appreciated!

Special thanks to bigotti5 for providing input on the data format!
Attached Files
File Type: txt SubConverter v0.10 alpha.txt (32.4 KB, 41 views)

Last edited by DVD; 1st February 2016 at 18:32. Reason: Minor additions
Old 25th January 2016, 19:34   #73  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
Attachments must be approved by an admin, and this can take a while.
Upload your files to an external host and provide the link.
Old 25th January 2016, 19:47   #74  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
I will see what I can do about the attachment.

BTW: There still seem to be some glitches in your format specification. I worked around them based on my findings. Not sure if it works for all files eventually but for mine it did ...
Old 25th January 2016, 19:50   #75  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Oh, and something else comes to mind: do you have any idea what "Font ID" is? Which font does it refer to?
Old 25th January 2016, 20:21   #76  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
It refers to the font files in BDMV/AUXDATA:
ID 0 = 00000.otf
ID 1 = 00001.otf
...
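
In code, that mapping is a one-liner. The sketch below assumes the pattern continues as in the two examples (five digits, zero-padded); the function name `font_path` is made up.

```python
def font_path(font_id: int) -> str:
    """Map a TextST font_id to its font file under BDMV/AUXDATA."""
    # Assumes five-digit, zero-padded file names, as in 00000.otf and 00001.otf.
    return f"BDMV/AUXDATA/{font_id:05d}.otf"


print(font_path(0))
print(font_path(1))
```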
Old 26th January 2016, 13:13   #77  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
Quote:
There still seem to be some glitches in your format specification
Can you be more precise?
Old 26th January 2016, 13:51   #78  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Sure (as far as I can tell):

Quote:
#_of_dialog_presentation_segments 16
I doubt that this exists. In my code I read straight from the palette info to "segment_type".

Quote:
.....next region_style_id......
In your code this is AFTER "User Styles". However I would assume that first we have the "region styles" [1-n] and then the "user styles" [1-n]. Just an assumption.

Quote:
....
....
(last palette_entries are always 254 -> FE)
Not sure what the "...." lines are. In my parser I read the entry ID plus four values per color:

Code:
  length				16
    palette_entry_id			08
    Y_value				08
    Cr_value				08
    Cb_value				08
    T_value				08
    .... next palette entry ...
Old 26th January 2016, 19:04   #79  |  Link
bigotti5
Spielberger
 
Join Date: Feb 2005
Posts: 838
Regarding the last palette entry:
each entry is of course 5 bytes. My comment was meant to say that palette entry 254 is always present as the last entry.
If there are only 3 colors, for example, the IDs are 01, 02 and 254.
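
To make the 5-byte layout concrete, here is a small Python sketch that reads a palette: a 16-bit length followed by that many bytes of entries. The function name `parse_palette` and the dictionary keys are made up; the sample bytes mirror the four palette entries in the parser log further down in this post, where a length of 20 goes with four entries.

```python
import struct


def parse_palette(data: bytes) -> list:
    """Read a palette: 16-bit length, then 5-byte entries (id, Y, Cr, Cb, T)."""
    (length,) = struct.unpack_from(">H", data, 0)
    if length % 5 != 0:
        raise ValueError(f"palette length {length} is not a multiple of 5")
    body = data[2 : 2 + length]
    if len(body) != length:
        raise ValueError("palette data is truncated")
    return [
        {"id": entry_id, "Y": y, "Cr": cr, "Cb": cb, "T": t}
        for entry_id, y, cr, cb, t in struct.iter_unpack(">5B", body)
    ]


# Sample bytes mirroring the palette in the parser log below.
sample = struct.pack(">H", 20) + bytes([
    0, 32, 118, 240, 0,
    1, 235, 128, 128, 255,
    2, 16, 128, 128, 255,
    254, 16, 128, 128, 0,
])
palette = parse_palette(sample)
```

A parser written this way does not care whether entry 254 is present; it simply reads as many entries as the length field announces.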

Quote:
However I would assume that first we have the "region styles" [1-n] and then the "user styles" [1-n]. Just an assumption.
Definitely not

Quote:
In my code I read from the palette info directly to "segment_type".
In my test encodes from Scenarist, the number of dialog presentation segments (DPS) always appears before the first 0x82.

Here is a sample log from a commercial TextST parser (DVDLogic BDReauthor pro):
Code:
...
...    Textbox height delta: 0
    Line space inc dec: 0 - Increase
    Line space delta: 0


  Palette
   Length: 20
   Entry ID: 0, Y: 32, Cr: 118, Cb: 240, T: 0
   Entry ID: 1, Y: 235, Cr: 128, Cb: 128, T: 255
   Entry ID: 2, Y: 16, Cr: 128, Cb: 128, T: 255
   Entry ID: 254, Y: 16, Cr: 128, Cb: 128, T: 0



Number of Dialog Presentation Segments: 2
Dialog Presentation Segment 1
 Segment descriptor
  Segment type: 0x82 - Dialog Presentation Segment
  Segment length: 63

 Dialog start PTS: 54450000 - 00:10:05.000
 Dialog end PTS: 54720000 - 00:10:08.000
 Palette update flag: 0
 Number of Regions: 1
 Dialog region 1
  Continuous present flag: 0
  Forced on flag: 0
  Region style id ref: 0
  Subtitle data length: 0
...
...
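
The PTS values in the log convert to the times shown if they are counted on a 90 kHz clock (54450000 / 90000 = 605 seconds). A short Python sketch of reading the 5-byte field and formatting it for SRT follows. The names `read_pts` and `pts_to_srt` are made up, and masking to the low 33 bits is an assumption (MPEG PTS values are 33 bits wide, which would leave 7 unused bits in a 40-bit field).

```python
PTS_CLOCK_HZ = 90_000  # 90 kHz, consistent with the times in the log above


def read_pts(field: bytes) -> int:
    """Read a 40-bit big-endian PTS field, keeping the low 33 bits (assumption)."""
    if len(field) != 5:
        raise ValueError("a PTS field is 5 bytes long")
    return int.from_bytes(field, "big") & ((1 << 33) - 1)


def pts_to_srt(pts: int) -> str:
    """Format a PTS value as an SRT timestamp HH:MM:SS,mmm."""
    total_ms = pts * 1000 // PTS_CLOCK_HZ
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# The two PTS values from the log above.
start = pts_to_srt(read_pts((54450000).to_bytes(5, "big")))
end = pts_to_srt(54720000)
print(start, "-->", end)
```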
Old 26th January 2016, 21:22   #80  |  Link
DVD
Registered User
 
Join Date: Oct 2002
Posts: 72
Okay, I will update the stuff with the User Style.

As for the other items: in the files I have, neither the color with index 254 nor the number of dialog presentation segments is present.