#61 | Link |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Hi Voodoo,
Well, I have been wasting my time trying to set up various Whisper versions, and even after following the YouTube instructions as best I can, the end result is pretty much the same: they don't work. And it's hard to get support when a lot of the clips are 12 months old or older. You have proven to me what your build can do, so I should just "bite the bullet" and donate to you to get the Pro version. I'm pretty sure you will provide good support if I have any issues or need some experienced assistance. So I just want to confirm that the minimum "donation" to be eligible for Pro is £50, is that correct? Regards
__________________
Main Systems:- 9970X on Gigabyte TRX50 AERO D 7970X on Asus Pro WS TRX50-Sage WiFi 9950X3D on MSI Carbon X670E 7950X on Gigabyte Aorus Elite B650 i9-13900KF on MSI Tomahawk B660 |
#62 | Link | ||
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Quote:
Quote:
__________________
InpaintDelogo, DoomDelogo, JerkyWEB Fixer, Standalone Faster-Whisper - AI subtitling |
#64 | Link |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
#65 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
OK, successfully downloaded & unpacked.
Turned out to be AU$103.21. So you have already mentioned how to add this to SE, Quote:
Be aware that I could end up asking quite a few "stupid" questions until I get the hang of it, so sorry in advance.
#66 | Link | |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Quote:
There are no stupid questions, as it's a pretty technically sophisticated app.
#67 | Link | ||
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
I've just run it for the first time at your defaults, dragging the audio file to a shortcut on the Desktop. It downloaded the medium model, so now I know where the models go and what the directory looks like.
How do I copy the models I downloaded for your app, within SE? (sorted) Hopefully I don't need to download them again, although that would ensure they were the correct ones. (don't)
To change the transcription commands, is that here? (confirm this, tho) Quote:
https://imgur.com/eyLGzwt
Does it go in Users\Appdata\Roaming, or Program Files? (also sorted)
That's all for now.
PS: It would be nice to have SE display that it's now the Pro version.
EDIT: OK, I have done several passes on a Das Boot audio track, and it's doing a pretty good job, but there might be some more advanced commands that could improve it even more. It's mispronouncing a few words, but nothing a little manual editing won't fix. Can a form of SDH subs be generated? For example, if I use const-me large, it does, to a degree.
Last edited by TR-9970X; 26th January 2026 at 03:58.
#68 | Link | |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Quote:
It can, but to get "SDH" you need to disable some quality settings/safety measures; basically, you are asking for hallucinations.
Use --suppress_tokens="" or --suppress_tokens=None; the two should behave differently at the low level. [I don't remember the differences; an empty list [""] is not intended behaviour, but there were reports that it was useful for something.]
Then you may want to disable the VAD audio preprocessing with --vad_filter=false, and don't enable any other audio preprocessing. Then you probably want to enable -hst=2 to combat hallucinations.
You can set --model_dir to the path of a models folder if you don't want it to look in the default location.
That site doesn't work for me. Anyway, about SE, ask at the SE thread.
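For anyone scripting this, here is a rough sketch of how the options named above could be assembled into one argument list. It only uses the switches mentioned in this post and the executable name from this thread; the helper name `build_sdh_args` and the file name `audio.ac3` are made up for illustration, and nothing here is confirmed to be the recommended way to run the tool.

```python
def build_sdh_args(exe, audio, suppress_none=False):
    """Assemble an argument list from the options named in the post above.

    Assumption: on a Windows command line, --suppress_tokens="" reaches the
    program as --suppress_tokens= (an empty value), so that form is used here.
    """
    if suppress_none:
        suppress = "--suppress_tokens=None"
    else:
        suppress = "--suppress_tokens="
    return [
        exe,
        audio,
        suppress,               # stop suppressing tokens (invites hallucinations)
        "--vad_filter=false",   # disable the VAD audio preprocessing
        "-hst=2",               # the setting suggested above to combat hallucinations
    ]


if __name__ == "__main__":
    # Hypothetical file name; the list is only printed, not executed.
    args = build_sdh_args("faster-whisper-xxl.exe", "audio.ac3")
    print(" ".join(args))
```

The list form can be handed to `subprocess.run` unchanged, which avoids the quoting questions that come up when the empty value is typed by hand.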
Last edited by VoodooFX; 26th January 2026 at 13:01.
#69 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
I am watching Pt 5 of Das Boot, for which I transcribed the audio track with 5 different models (Medium, Large v1, v2 & v3). Each one gives slightly different results, so I will be able to go through and use the best or most correct subtitle for each line. Tedious, but accurate. I'm just being VERY particular with this movie/series.
With the drag & drop onto the desktop shortcut, can the default command line be changed, like the Advanced command line used in SE? Although I will probably prefer to use SE, it would be nice to customise the drag & drop process.
Now that I have Pro, I can run several models for the transcription and get the subs spot on, and it's interesting to see how much the GPU is used during the process. Still got a lot of testing to do, but so far, pretty happy. Cheers
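One possible way to line the different transcripts up for that line-by-line pass, as a rough sketch only. It assumes each model's output was saved as a standard .srt file, and it matches cues on identical start times, which will often not hold across models since each one segments the audio a bit differently. The function names and the sample text are made up for illustration.

```python
import re


def parse_srt(text):
    """Return a list of (start, end, text) tuples from SRT-formatted text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # A usable cue has an index line, a timing line and at least one text line.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = (part.strip() for part in lines[1].split("-->"))
        cues.append((start, end, " ".join(lines[2:])))
    return cues


def side_by_side(transcripts):
    """Group cue texts by start time: {start: {model_name: text}}.

    Simplification: only cues with exactly the same start timestamp
    end up in the same row.
    """
    table = {}
    for model, srt_text in transcripts.items():
        for start, _end, text in parse_srt(srt_text):
            table.setdefault(start, {})[model] = text
    # SRT timestamps are zero-padded, so sorting the strings sorts by time.
    return dict(sorted(table.items()))


if __name__ == "__main__":
    # Made-up sample cues standing in for two models' output.
    medium = "1\n00:00:01,000 --> 00:00:02,000\nAlarm!\n\n2\n00:00:03,000 --> 00:00:04,000\nDive, dive.\n"
    large_v2 = "1\n00:00:01,000 --> 00:00:02,000\nAlarm.\n"
    for start, texts in side_by_side({"medium": medium, "large-v2": large_v2}).items():
        print(start, texts)
```

Rows where one model is missing show where the segmentation differs, and those are the places that still need checking by ear.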
#70 | Link | |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Quote:
Like I said before, be aware that on some files SE doesn't work as expected, and the results can then be worse.
#71 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
Hi, so I've tried to get some SDH stuff working, but I don't think it's producing what I thought it should. The only thing I think I noticed were a lot of "speech marks" (these things: "). Here's the command I used, it's probably quite wrong: Code:
"%dp%faster-whisper-xxl.exe" %file_list% -pp -o source --batch_recursive --check_files --standard --vad_method pyannote_v3 -o source --standard --max_gap 1 -hst 2 -ct float16 --ff_vocal_extract mb-roformer --realign -f srt -m large-v2 --language en -suppress_tokens"" --vad_filter=false
OR, if you transcribed the video to catch the English parts, then ran a different script to capture the foreign parts? Regards.
#72 | Link |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Got an error whilst trying to transcribe a long ac3.
Code:
Audio filtering is in progress...
Estimating duration from bitrate, this may be inaccurate
Estimating duration from bitrate, this may be inaccurate
MB-RoFormer model running on CUDA: 1% | 31/2927 | 11:34<<18:00:48
Traceback (most recent call last):
  File "__main__.py", line 212, in ffmpeg_audio
  File "faster_whisper\roformer_infer.py", line 234, in RoFormer_separator
  File "faster_whisper\roformer_infer.py", line 83, in demix_track
torch.AcceleratorError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Press any key to continue . . .
#73 | Link | |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
I don't know what that means.
Anyway, remove ff_vocal_extract: what "SDH" do you expect when everything non-voice is removed from the audio?
And I think maybe you need to use one of these too: -prompt None or --reprompt false or --prompt_reset_on_no_end 0 [the former will disable the latter args].
EDIT: Or maybe there will be no harm in not disabling prompt_reset_on_no_end. Quote:
The model is not meant to transcribe multi-language audio, those measures are workarounds.
Last edited by VoodooFX; 27th January 2026 at 09:20.
#74 | Link |
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Let's not operate with abstractions. What does "long" mean? What does "speech marks" mean?
I guess you ran out of RAM/VRAM. Try --roformer_vram 6 or another value.
#75 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
#76 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
I had never heard of "speech marks" until I heard a guy on YouTube saying it. I used to always call them inverted commas: " "
I'm running a 4080 Super, surely there won't be a VRAM issue.
#77 | Link | ||
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Quote:
Quote:
If you have an issue, please post showing the exact problem, preferably with an audio example to reproduce it.
Last edited by VoodooFX; 27th January 2026 at 09:45.
#78 | Link | ||
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
What I was trying to explain is that when I used one of your commands, it appeared to produce a lot of extra " " throughout the subtitles. Here's a Google explanation: Quote:
#79 | Link | ||
|
Video damager
Join Date: Sep 2008
Posts: 1,270
|
Aren't you a native English speaker?
https://en.wikipedia.org/wiki/Abstraction Quote:
Quote:
Why do you need it when you have a CUDA GPU?
Last edited by VoodooFX; 27th January 2026 at 13:22.
#80 | Link | |
|
Registered User
Join Date: Jan 2025
Posts: 233
|
Quote:
| Tags |
| audio, openai, speech, subtitles, text |