Doom9's Forum > General > Subtitles
Old 25th January 2026, 05:39   #61  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Yes, it should be around that.
Hi Voodoo,

Well, I have been wasting my time trying to set up various Whisper versions, and even after following the YouTube instructions as best I can, the end result is pretty much the same: they don't work.

And it's hard to get support when a lot of the clips are 12 months old or older.

You have proven to me what your build can do, so I think I should just "bite the bullet" and donate to get the Pro version.

I'm pretty sure you will provide good support if I have any issues or need some experienced assistance.

So I just want to confirm: the minimum "donation" to be eligible for Pro is £50, is that correct?

Regards
__________________
Main Systems:-
9970X on Gigabyte TRX50 AERO D
7970X on Asus Pro WS TRX50-Sage WiFi
9950X3D on MSI Carbon X670E
7950X on Gigabyte Aorus Elite B650
i9-13900KF on MSI Tomahawk B660
Old 25th January 2026, 08:07   #62  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
Hi Voodoo,

Well, I have been wasting my time trying to set up various Whisper versions, and even after following the YouTube instructions as best I can, the end result is pretty much the same: they don't work.
Hi. That's why I made it, because the original Whisper and other implementations weren't good enough for me.


Quote:
Originally Posted by TR-9970X View Post
You have proven to me what your build can do, so I think I should just "bite the bullet" and donate to get the Pro version.

I'm pretty sure you will provide good support if I have any issues or need some experienced assistance.

So I just want to confirm: the minimum "donation" to be eligible for Pro is £50, is that correct?
The Pro version has some extra features, if you want them, it's £50 at the moment.
Old 25th January 2026, 09:17   #63  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Hi. That's why I made it, because the original Whisper and other implementations weren't good enough for me.

The Pro version has some extra features, if you want them, it's £50 at the moment.
Sold!!!
Old 25th January 2026, 09:39   #64  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
Sold!!!
Thanks for the donation, enjoy the Pro version!
Old 25th January 2026, 10:07   #65  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Thanks for the donation, enjoy the Pro version!
OK, successfully downloaded & unpacked.

Turned out to be AU$103.21.

You have already mentioned how to add this to SE (Subtitle Edit):

Quote:
Just copy it to the same folder where the regular version is in SE. (Delete the old files there, excluding the models)
Are there any other instructions?

Be aware that I could end up asking quite a few "stupid" questions until I get the hang of it, so sorry in advance.
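
The quoted install step (delete the old files but keep the models, then copy the new build in) can be sketched as below. This is only an illustration under assumptions: the folder name `_models`, the file names, and both paths are placeholders that this thread does not confirm, so check your own SE folder before deleting anything. The demo works on throwaway temporary folders.

```python
import shutil
import tempfile
from pathlib import Path


def replace_build(install_dir: Path, new_build_dir: Path, keep: set) -> list:
    """Delete everything in install_dir except names in `keep`, then copy the new build in.

    Returns the sorted entry names left in install_dir afterwards.
    """
    for entry in install_dir.iterdir():
        if entry.name in keep:
            continue  # leave the models folder untouched
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    for entry in new_build_dir.iterdir():
        if entry.name in keep:
            continue  # never overwrite the models that were kept
        target = install_dir / entry.name
        if entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
    return sorted(p.name for p in install_dir.iterdir())


def demo() -> list:
    """Run the replacement on temporary folders so nothing real is touched."""
    with tempfile.TemporaryDirectory() as tmp:
        install = Path(tmp) / "install"      # stands in for the SE Whisper folder
        new_build = Path(tmp) / "pro"        # stands in for the unpacked Pro build
        (install / "_models").mkdir(parents=True)   # "_models" is an assumed name
        (install / "_models" / "model.bin").write_text("weights")
        (install / "old.dll").write_text("old build file")
        new_build.mkdir()
        (new_build / "faster-whisper-xxl.exe").write_text("pro build")
        return replace_build(install, new_build, keep={"_models"})


if __name__ == "__main__":
    print(demo())
```

Passing the protected names in `keep` rather than hard-coding them makes it easy to adapt once you know what the models folder is actually called.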
Old 25th January 2026, 10:25   #66  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
Are there any other instructions?

Be aware that I could end up asking quite a few "stupid" questions until I get the hang of it, so sorry in advance.
No.
There are no stupid questions, as it's a pretty technically sophisticated app.
Old 26th January 2026, 02:32   #67  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
No.
There are no stupid questions, as it's a pretty technically sophisticated app.
OK, here are my first questions. (Since posting this, I have figured out most of them.)

I've just run it for the first time at your defaults by dragging the audio file onto a shortcut on the Desktop. It downloaded the medium model, so now I know where the models go and what the directory looks like.

How do I copy the models I downloaded for your app within SE? [Sorted]

Hopefully I don't need to download them again, although that would ensure they were the correct ones. [I don't]

To change the transcription commands, is this the place? [Please confirm this, though]

Quote:
:: Start processing
"%dp%faster-whisper-xxl.exe" %file_list% -pp -o source --batch_recursive --check_files --standard -f json srt -m medium
And just to confirm, adding this to SE so it can be run from there (which probably isn't necessary) (see attached screenshot)

https://imgur.com/eyLGzwt

Does it go in Users\Appdata\Roaming, or Program Files? [Also sorted]

That's all for now.

PS:- It would be nice to have SE display that it's now the Pro version.

EDIT:- OK, I have done several passes on a Das Boot audio track, and it's doing a pretty good job, but there might be some more advanced commands that might improve it even more.

It's mishearing a few words, but nothing a little manual editing won't fix.

Can a form of SDH subs be generated? For example, when I use Const-me with the large model, it does, to a degree.

Last edited by TR-9970X; 26th January 2026 at 03:58.
Old 26th January 2026, 12:53   #68  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
Can a form of SDH subs be generated? For example, when I use Const-me with the large model, it does, to a degree.
Forget Const-me; it's a subpar implementation and abandonware.
It can, but to get "SDH" you need to disable some quality settings and safety measures; basically, you are asking for hallucinations.

Use --suppress_tokens="" or --suppress_tokens=None; the two should behave differently at the low level. [I don't remember the differences. An empty list [""] is not intended behaviour, but there were reports that it was useful for something.]

Then you may want to disable the VAD audio preprocessing with --vad_filter=false, and not enable any other audio preprocessing.

Then you probably want to enable -hst=2 to combat hallucinations.
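
Put together, the three suggestions above could look like the argument list below. This is a sketch only: the flags are copied as written in this thread and are not checked against the tool's help text, the audio path is a placeholder, and `-m`/`-f srt` are borrowed from the drag-and-drop script quoted earlier.

```python
def build_sdh_args(audio_path: str, model: str = "medium") -> list:
    """Return arguments for faster-whisper-xxl.exe using the SDH-oriented flags suggested above."""
    return [
        audio_path,
        "-m", model,
        "-f", "srt",
        "--suppress_tokens=None",  # the empty-string form is the other suggested option
        "--vad_filter=false",      # disable VAD preprocessing
        "-hst=2",                  # suggested to counter hallucinations
    ]


if __name__ == "__main__":
    # "track.ac3" is a hypothetical file name used only for illustration.
    print(" ".join(build_sdh_args("track.ac3", model="large-v2")))
```

No other audio preprocessing options are added here, in line with the advice not to enable any.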


Quote:
Originally Posted by TR-9970X View Post
Hopefully I don't need to download them again
You can set --model_dir to the path of a models folder if you don't want it to look in the default location.

Quote:
Originally Posted by TR-9970X View Post
And just to confirm, adding this to SE so it can be run from there (which probably isn't necessary) (see attached screenshot)
That site doesn't work for me. Anyway, for questions about SE, ask in the SE thread.

Last edited by VoodooFX; 26th January 2026 at 13:01.
Old 26th January 2026, 13:16   #69  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Forget Const-me; it's a subpar implementation and abandonware.
It can, but to get "SDH" you need to disable some quality settings and safety measures; basically, you are asking for hallucinations.

Use --suppress_tokens="" or --suppress_tokens=None; the two should behave differently at the low level. [I don't remember the differences. An empty list [""] is not intended behaviour, but there were reports that it was useful for something.]

Then you may want to disable the VAD audio preprocessing with --vad_filter=false, and not enable any other audio preprocessing.

Then you probably want to enable -hst=2 to combat hallucinations.

Excellent, I will give that a try tomorrow.

You can set --model_dir to the path of a models folder if you don't want it to look in the default location.

That site doesn't work for me. Anyway, for questions about SE, ask in the SE thread.
I've been able to figure out where the models go. Getting Pro working within SE was pretty easy once I checked out the folder and files in the Subtitle Edit default location.

I am watching Pt 5 of Das Boot, whose audio track I transcribed with 5 different models (Medium, Large v1, v2 & v3). Each one has slightly different results, so I will be able to go through and use the best or most correct subtitle for each line: tedious, but accurate.

I'm just being VERY particular with this movie/series.

So with the drag & drop onto the desktop shortcut, can the default command line be changed, like the Advanced command line used in SE?

Although I will probably prefer to use SE, it would be nice to customise the drag & drop process.

Now that I have Pro, I can run the transcription with several models and get the subs spot on. It's also interesting to see how much the GPU is used during the process.

Still got a lot of testing to do, but so far, pretty happy.

Cheers
Old 26th January 2026, 13:47   #70  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
So with the drag & drop onto the desktop shortcut, can the default command line be changed, like using the Advanced command line used in SE ??
Although I will probably prefer to use SE, it would be nice to customise the drag & drop process.
Of course, the commands can be changed, added, or removed however you like.
Like I said before, be aware that SE doesn't work as expected on some files, and the results can then be worse.
Old 27th January 2026, 05:30   #71  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Of course, the commands can be changed, added, or removed however you like.
Like I said before, be aware that on some files SE doesn't work as expected, the results can be worse then.
Doom9 has been down for most of today (my time).

Hi, so I've tried to get some SDH stuff working, but I don't think it's producing what I thought it should. The only thing I think I noticed were a lot of "speech marks" (these things ").

Here's the command I used; it's probably quite wrong:

Code:
"%dp%faster-whisper-xxl.exe" %file_list% -pp -o source --batch_recursive --check_files --standard --vad_method pyannote_v3 -o source --standard --max_gap 1 -hst 2 -ct float16 --ff_vocal_extract mb-roformer --realign -f srt -m large-v2 --language en -suppress_tokens"" --vad_filter=false
And another question: is there a way to transcribe foreign-language parts into English, or into the language of the part? (Would you need to know what the language was to start with?)

Or what if you transcribed the video to catch the English parts, then ran a different script to capture the foreign parts?

Regards.
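
As posted, the command in this post passes `-o source` and `--standard` twice, and writes `-suppress_tokens""` where the earlier advice used `--suppress_tokens=""`. Whether the tool tolerates repeated options is not stated in this thread, so a quick check like the sketch below may help tidy a long command before running it. It is a simple whitespace-based check: it only looks at tokens starting with `-` and would misread a negative number as an option.

```python
from collections import Counter

# The arguments from the command above, copied as posted (without the exe path).
COMMAND = (
    '%file_list% -pp -o source --batch_recursive --check_files --standard '
    '--vad_method pyannote_v3 -o source --standard --max_gap 1 -hst 2 -ct float16 '
    '--ff_vocal_extract mb-roformer --realign -f srt -m large-v2 --language en '
    '-suppress_tokens"" --vad_filter=false'
)


def repeated_flags(command: str) -> list:
    """Return option names that appear more than once, in first-seen order."""
    tokens = command.split()
    # Strip any "=value" part so "--vad_filter=false" counts as "--vad_filter".
    names = [t.split("=", 1)[0] for t in tokens if t.startswith("-")]
    counts = Counter(names)
    repeated = []
    for name in names:
        if counts[name] > 1 and name not in repeated:
            repeated.append(name)
    return repeated


if __name__ == "__main__":
    print(repeated_flags(COMMAND))
```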
Old 27th January 2026, 08:43   #72  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Got an error whilst trying to transcribe a long AC3 file.

Code:
Audio filtering is in progress...
Estimating duration from bitrate, this may be inaccurate
Estimating duration from bitrate, this may be inaccurate
MB-RoFormer model running on CUDA:   1% | 31/2927 | 11:34<<18:00:48

Traceback (most recent call last):
  File "__main__.py", line 212, in ffmpeg_audio
  File "faster_whisper\roformer_infer.py", line 234, in RoFormer_separator
  File "faster_whisper\roformer_infer.py", line 83, in demix_track
torch.AcceleratorError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Press any key to continue . . .
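
The log itself suggests passing `CUDA_LAUNCH_BLOCKING=1` for debugging. A minimal sketch of setting that variable for a child process is below; the exe and audio paths in the usage comment are placeholders, not taken from this thread. This does not fix the error. It only makes CUDA kernel launches synchronous, so the traceback is more likely to point at the call that actually failed, at the cost of speed.

```python
import os
from typing import Dict, Optional


def debug_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 added."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"  # make CUDA kernel launches synchronous
    return env


# Hypothetical usage (both paths are placeholders):
# import subprocess
# subprocess.run([r"C:\path\to\faster-whisper-xxl.exe", r"C:\audio\track.ac3"],
#                env=debug_env())
```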
Old 27th January 2026, 08:52   #73  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
I noticed were a lot of "speech marks" (these things ")
I don't know what that means.
Anyway, remove --ff_vocal_extract: what "SDH" do you expect when everything except the voice is removed from the audio?

And I think maybe you need to use one of these too: -prompt None or --reprompt false or --prompt_reset_on_no_end 0 [an earlier option disables the later ones]
EDIT: Or maybe there is no harm in leaving prompt_reset_on_no_end enabled.


Quote:
Originally Posted by TR-9970X View Post
And another question: is there a way to transcribe foreign-language parts into English, or into the language of the part? (Would you need to know what the language was to start with?)
Try --task translate
The model is not meant to transcribe multi-language audio; such measures are workarounds.

Last edited by VoodooFX; 27th January 2026 at 09:20.
Old 27th January 2026, 09:01   #74  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
long
Let's not operate with abstractions. What does "long" mean? What does "speech marks" mean?

I guess you ran out of RAM/VRAM. Try --roformer_vram 6 or another value.
Old 27th January 2026, 09:06   #75  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
I don't know what that means.
Anyway, remove --ff_vocal_extract: what "SDH" do you expect when everything except the voice is removed from the audio?
And I think you need to use one of these too: -prompt None or --reprompt false [the former will disable the latter]

OK, will give it a try

I don't know what to expect, I was just asking.



Try --task translate
The model is not meant to transcribe multi-language audio, those measures are workarounds.
Lots of questions, sorry.
Old 27th January 2026, 09:10   #76  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Let's not operate with abstractions. What does "long" mean? What does "speech marks" mean?

I guess you ran out of RAM/VRAM. Try --roformer_vram 6 or another value.
By long, I mean nearly 5 hours, 1.3 GB!

I had never heard of "speech marks" until I heard a guy on YouTube saying it.

I used to always call them inverted commas: " "

I'm running a 4080 Super, so surely there won't be a VRAM issue.
Old 27th January 2026, 09:41   #77  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
By long, nearly 5 hours

I'm running a 4080 Super, surely there won't be a VRAM issue.
For example, when running mb-roformer on the CPU with 3:26:00 of audio, it eats ~28 GB of RAM.
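
For a rough feel of what that figure implies for the nearly 5-hour track, here is a back-of-envelope sketch. It assumes memory grows linearly with duration, which this thread does not confirm, and the 28 GB reference is for a CPU run, so GPU memory use may behave differently.

```python
def hms_to_hours(hms: str) -> float:
    """Convert an "H:MM:SS" string to hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


def estimate_ram_gb(duration_hours: float, ref_hours: float, ref_gb: float) -> float:
    """Scale a reference memory figure linearly with duration (an assumption)."""
    return ref_gb * duration_hours / ref_hours


if __name__ == "__main__":
    ref_hours = hms_to_hours("3:26:00")            # reference run quoted above
    estimate = estimate_ram_gb(5.0, ref_hours, 28.0)
    print(f"~{28.0 / ref_hours:.1f} GB per hour, ~{estimate:.1f} GB for 5 hours")
```

Under that linear assumption, a 5-hour file would need noticeably more than the 28 GB reference, which fits the guess that the failure was a memory problem.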



Quote:
Originally Posted by TR-9970X View Post
I had never heard of "speech marks" until I heard a guy on YouTube saying it.

I used to always call them inverted commas: " "
I don't know what the " " abstraction means either. Please spare me any abstractions.
If you have an issue, please post the exact problem, preferably with an audio example to reproduce it.

Last edited by VoodooFX; 27th January 2026 at 09:45.
Old 27th January 2026, 11:37   #78  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
For example, when running mb-roformer on the CPU with 3:26:00 of audio, it eats ~28 GB of RAM.

OK, so can you please provide the command needed to use the CPU instead of the GPU?

I don't know what the " " abstraction means either. Please spare me any abstractions.
If you have an issue, please post the exact problem, preferably with an audio example to reproduce it.
I don't think I've heard this word "abstraction" before.

What I was trying to explain is that when I used one of your commands, it appeared to produce a lot of extra " " throughout the subtitles.

Here's a Google explanation:-

Quote:
Speech marks, also known as quotation marks or inverted commas, are punctuation marks used to indicate direct speech or quotations in writing.
Old 27th January 2026, 13:18   #79  |  Link
VoodooFX
Video damager
 
VoodooFX's Avatar
 
Join Date: Sep 2008
Posts: 1,270
Quote:
Originally Posted by TR-9970X View Post
I don't think I've heard this word "abstraction" before.
Aren't you a native English speaker?
https://en.wikipedia.org/wiki/Abstraction


Quote:
Originally Posted by TR-9970X View Post
What I was trying to explain is that when I used one of your commands, it appeared to produce a lot of extra " " throughout the subtitles.
Here's a Google explanation:
I know what quotes are; still, I've no idea what the issue is. Or is there no issue?

Quote:
Originally Posted by TR-9970X View Post
OK, so can you please provide the command needed to use the CPU instead of the GPU?
--voc_device cpu
Why do you need it when you have a CUDA GPU?

Last edited by VoodooFX; 27th January 2026 at 13:22.
Old 28th January 2026, 00:46   #80  |  Link
TR-9970X
Registered User
 
TR-9970X's Avatar
 
Join Date: Jan 2025
Posts: 233
Quote:
Originally Posted by VoodooFX View Post
Aren't you a native English speaker?
https://en.wikipedia.org/wiki/Abstraction

Yes, Australian, but in all my years in the workplace, and my small circle of friends, I can safely say that "abstraction" has NEVER been in any conversation or discussion.



I know what quotes are; still, I've no idea what the issue is. Or is there no issue?

Again, I thought I noticed a lot of extra "s when I used one of your commands to attempt SDH, that's all.

--voc_device cpu
Why do you need it when you have a CUDA GPU?

You just told me that 3:26:00 of audio can use 28 GB of RAM, so that would require the CPU, as the 4080 "only" has 16 GB.
Anyway, another day of discovery & learning.