#21 | Link | |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
Why did you remove it? Maybe it's present in another folder?
Quote:
Did you check the transcription differences? By default, r117 runs int8 quantization on GPU, while r103 runs float16 [on CPU both use int8]. I changed that because a few users reported that int8 is more accurate than float16 and that the speed is the same. Quantization can be set with "--compute_type". EDIT: @Emulgator Could you test these 2 short files with "medium": https://we.tl/t-S5gnRvMuQB , with "--compute_type=float16" & "--compute_type=int8" on CUDA, and share the 4 srt files?
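The requested test matrix (2 files x 2 compute types) could be generated rather than typed by hand. This is only a sketch: the helper name `build_commands` is made up for illustration, the file names are the ones that appear in the later test posts, and only the flags already mentioned in this thread ("--language", "--model", "--compute_type") are used.

```python
from itertools import product
import subprocess


def build_commands(exe, audio_files, compute_types, model="medium", language="en"):
    """Return one argument list per (audio file, compute type) pair."""
    return [
        [exe, audio, "--language", language, "--model", model,
         f"--compute_type={ctype}"]
        for audio, ctype in product(audio_files, compute_types)
    ]


if __name__ == "__main__":
    # File names are assumptions taken from the later test posts.
    commands = build_commands(
        "whisper.exe",
        ["test_original.aac", "test_ffmpeg6.wav"],
        ["float16", "int8"],
    )
    for cmd in commands:
        # Only print the command lines; swap in subprocess.run(cmd, check=True)
        # to actually launch each run.
        print(subprocess.list2cmdline(cmd))
```

Printing first makes it easy to check the four command lines before spending GPU time on them.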
__________________
InpaintDelogo, DoomDelogo, JerkyWEB Fixer, Standalone Faster-Whisper - AI subtitling Last edited by VoodooFX; 18th May 2023 at 05:03. |
#22 | Link | ||
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Quote:
Quote:
Ah, well, 8bit vs. 16bit can make all the difference! I feed it a .wav (32bit float) decoded from the DVD .ac3 track and use the large multilingual model only.
I compared 3 runs on the English soundtrack of a 1961 musical movie (25fps speed-up; quick talking, Cockney and other slang, interleaved with songs) using WinMerge triple comparison:
r103 GPU from 04.05.2023
r103 GPU from 17.05.2023
r117 GPU from 17.05.2023
All versions have their uses and guess differently, which is good for me: a wealth to choose from. Now it is up to the subtitler (me) just to merge the best parts. Will have to talk Nikse into having 3 editor tabs in SubtitleEdit, muhahaha ;-)
Downloaded your sample, testing soon.
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain) "Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..." Last edited by Emulgator; 18th May 2023 at 12:35. |
#23 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Code:
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>whisper.exe "C:\_PROG\! Subtitle Tools\! Testfile VoodooFX 2023 05 18\test_original.aac" --language en --model "large" --compute_type=float16
Standalone Faster-Whisper r117 running on: CUDA
Estimating duration from bitrate, this may be inaccurate
2023-05-18 15:05:00.1132781 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:1671 onnxruntime::python::CreateInferencePybindStateModule] Init provider bridge failed.
[00:00.760 --> 00:02.760] Feeling inspired yet?
[00:02.760 --> 00:03.760] No.
[00:03.760 --> 00:08.890] Thank you.
[00:08.890 --> 00:10.890] I thought you said you were hungry.
[00:12.890 --> 00:17.890] There's a boat tour of the Trakla Island formations this afternoon.
[00:17.890 --> 00:22.890] I was thinking we could go on that and make reservations in town for dinner.
[00:23.890 --> 00:25.890] You could try the Chinese place.
[00:25.890 --> 00:28.890] I don't think I'd survive another dinner in town.
[00:29.890 --> 00:31.890] Even the idea that...
[00:31.890 --> 00:33.890] Does anyone think it's a real town?
[00:36.890 --> 00:38.890] Why would they have a Chinese place?
[00:43.280 --> 00:46.280] Is it okay if I go? I'll meet you on the beach.
[00:47.280 --> 00:49.280] Yeah, sure.
[01:22.270 --> 01:24.270] Someone's making a statement.
[01:24.270 --> 01:26.270] One of the locals, I guess.
[01:29.270 --> 01:31.270] What do you think he's trying to say?
[01:31.270 --> 01:35.270] He's saying that he wants to put a long knife right through her.
[01:35.270 --> 01:40.270] And after you die, he'll hang your body at the airport to scare off the other tourists.
[01:42.270 --> 01:44.270] Seems a bit extreme.
[01:46.270 --> 01:49.270] The Latokans are a melodramatic people.
[01:54.740 --> 01:56.740] I loved your book.
[01:58.740 --> 01:59.740] Sorry?
[02:00.740 --> 02:03.740] You're James Foster. I loved your book.
[02:06.740 --> 02:09.740] Sorry, is that good? I don't mean to put you in the spot.
[02:09.740 --> 02:11.740] No, thank you.
[02:11.740 --> 02:13.740] It's just, um...
[02:13.740 --> 02:15.740] Not a lot of people read my book.
[02:15.740 --> 02:17.740] I'm Gabby Bauer.
[02:17.740 --> 02:19.740] I'm James Foster.
[02:21.740 --> 02:22.740] Alvin!
[02:25.520 --> 02:27.520] This is James Foster.
[02:27.520 --> 02:29.520] Hi, nice to meet you. Albon Bauer.
[02:29.520 --> 02:30.520] Pleasure.
[02:30.520 --> 02:32.520] He wrote your book that I love, The Variable Sheath.
[02:32.520 --> 02:34.520] Oh, yeah, I remember.
[02:34.520 --> 02:36.520] I thought it was brilliant.
[02:36.520 --> 02:37.520] Yes.
[02:37.520 --> 02:41.520] James, do you think I could convince you to join us for dinner this evening?
[02:42.520 --> 02:46.520] I've been seeing you around the resort for a few days now and I would love to get to know you.
[02:46.520 --> 02:49.520] We have a reservation tonight at Yang's.
[03:00.450 --> 03:02.450] Yeah, it was a good...
[03:04.450 --> 03:06.450] ...learning experience.
[03:06.450 --> 03:07.450] All right.
[03:07.450 --> 03:10.450] Is there anything else I can get you?
[03:10.450 --> 03:12.450] Um, that's all I think.
[03:12.450 --> 03:14.450] All right, everyone, please have a great meal.
[03:14.450 --> 03:15.450] Thank you.
[03:15.450 --> 03:19.450] And let me know any time if I can make your experience even more enjoyable.
[03:22.450 --> 03:24.450] He's an interesting guy.
[03:24.450 --> 03:25.450] Yes.
[03:25.450 --> 03:29.450] This resort is labelled in the resort guide as a multicultural dining experience.
[03:30.450 --> 03:32.450] Well, it certainly is an experience.
[03:33.450 --> 03:36.450] So, Albon, what is it you do for a living?
[03:36.450 --> 03:39.450] Oh, architecture. But I'm mostly retired.
[03:39.450 --> 03:42.450] Now I run a journal out of Los Angeles called Glass Pane.
[03:42.450 --> 03:43.450] You're French?
[03:43.450 --> 03:46.450] Oh, no. Swiss first, from Geneva.
[03:46.450 --> 03:48.450] Then Paris, then LA.
[03:49.450 --> 03:52.450] I'm from London first. Then Paris.
[03:52.450 --> 03:53.450] We met there.
[03:53.450 --> 03:54.450] That's how we met.
[03:54.450 --> 03:57.450] But I couldn't get work there, so I made Albon move with me.
[03:58.450 --> 04:00.450] And what do you do?
[04:00.450 --> 04:03.450] Well, I'm an actress, of course.
[04:03.450 --> 04:05.450] Oh, really? She's great.
[04:06.450 --> 04:07.450] For commercials.
[04:07.450 --> 04:09.450] I have a contract with an LA company.
[04:09.450 --> 04:11.450] They've been grooming me.
[04:11.450 --> 04:13.450] I specialize in failing naturally.
[04:14.450 --> 04:17.450] What does that mean? Failing naturally?
[04:18.450 --> 04:22.450] Finding a natural-seeming way to fail at any given task.
[04:22.450 --> 04:24.450] In each of the commercials that I'm in,
[04:24.450 --> 04:27.450] I'm the one who simply can't go on without the product.
[04:27.450 --> 04:29.450] It's ridiculous for me not to have the product.
[04:30.450 --> 04:31.450] Okay.
[04:31.450 --> 04:32.450] Show them.
[04:32.450 --> 04:33.450] No.
[04:33.450 --> 04:34.450] No, you should.
[04:34.450 --> 04:35.450] Yeah.
[04:35.450 --> 04:36.450] Please.
[04:36.450 --> 04:37.450] Do you want to see?
[04:37.450 --> 04:38.450] I want to see.
[04:38.450 --> 04:39.450] Here.
[04:42.450 --> 04:43.450] She's amazing.
[04:56.660 --> 05:02.450] I just...
[05:04.450 --> 05:05.450] I...
Standalone Faster-Whisper operation finished in: 25 seconds
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>pause
Press any key to continue . . .
#24 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Code:
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>whisper.exe "C:\_PROG\! Subtitle Tools\! Testfile VoodooFX 2023 05 18\test_original.aac" --language en --model "large" --compute_type=int8
Standalone Faster-Whisper r117 running on: CUDA
Estimating duration from bitrate, this may be inaccurate
2023-05-18 15:08:01.9382416 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:1671 onnxruntime::python::CreateInferencePybindStateModule] Init provider bridge failed.
[00:00.760 --> 00:02.760] Feeling inspired yet?
[00:02.760 --> 00:08.890] No, thank you.
[00:08.890 --> 00:10.890] I thought you said you were hungry.
[00:12.890 --> 00:17.890] There's a boat tour of the Trakla Island formations this afternoon.
[00:17.890 --> 00:22.890] I was thinking we could go on that and make reservations in town for dinner.
[00:23.890 --> 00:25.890] You could try the Chinese place.
[00:25.890 --> 00:28.890] I don't think I'd survive another dinner in town.
[00:29.890 --> 00:31.890] Even the idea that...
[00:31.890 --> 00:33.890] Does anyone think it's a real town?
[00:35.890 --> 00:38.890] Why would they have a Chinese place?
[00:43.280 --> 00:46.280] Is it okay if I go? I'll meet you on the beach.
[00:47.280 --> 00:49.280] Yeah, sure.
[01:22.270 --> 01:24.270] Someone's making a statement.
[01:24.270 --> 01:27.270] One of the locals, I guess.
[01:29.270 --> 01:31.270] What do you think he's trying to say?
[01:31.270 --> 01:35.270] He's saying that he wants to put a long knife right through her.
[01:35.270 --> 01:40.270] And after you die, he'll hang your body at the airport to scare off the other tourists.
[01:42.270 --> 01:44.270] Seems a bit extreme.
[01:46.270 --> 01:49.270] The Latokans are a melodramatic people.
[01:54.740 --> 01:56.740] I loved your book.
[01:58.740 --> 01:59.740] Sorry?
[02:00.740 --> 02:03.740] You're James Foster. I loved your book.
[02:06.740 --> 02:09.740] Sorry, is that good? I don't mean to put you in the spot.
[02:09.740 --> 02:11.740] No, thank you.
[02:11.740 --> 02:13.740] It's just, um...
[02:13.740 --> 02:15.740] Not a lot of people read my book.
[02:15.740 --> 02:17.740] I'm Gabby Bauer.
[02:17.740 --> 02:19.740] I'm James Foster.
[02:21.740 --> 02:22.740] Alvin!
[02:25.520 --> 02:27.520] This is James Foster.
[02:27.520 --> 02:29.520] Hi, nice to meet you. Albon Bauer.
[02:29.520 --> 02:30.520] Pleasure.
[02:30.520 --> 02:32.520] He wrote your book that I love, The Variable Sheath.
[02:32.520 --> 02:34.520] Oh, yeah, I remember.
[02:34.520 --> 02:36.520] I thought it was brilliant.
[02:36.520 --> 02:37.520] Yes.
[02:37.520 --> 02:41.520] James, do you think I could convince you to join us for dinner this evening?
[02:42.520 --> 02:46.520] I've been seeing you around the resort for a few days now and I would love to get to know you.
[02:46.520 --> 02:49.520] We have a reservation tonight at Yang's.
[03:00.450 --> 03:02.450] Yeah, it was a good...
[03:04.450 --> 03:06.450] ...learning experience.
[03:06.450 --> 03:07.450] All right.
[03:07.450 --> 03:10.450] Is there anything else I can get you?
[03:10.450 --> 03:12.450] Um, that's all I think.
[03:12.450 --> 03:14.450] All right, everyone, please have a great meal.
[03:14.450 --> 03:15.450] Thank you.
[03:15.450 --> 03:19.450] And let me know any time if I can make your experience even more enjoyable.
[03:22.450 --> 03:24.450] He's an interesting guy.
[03:24.450 --> 03:25.450] Yes.
[03:25.450 --> 03:29.450] This resort is labelled in the resort guide as a multicultural dining experience.
[03:30.450 --> 03:32.450] Well, it certainly is an experience.
[03:33.450 --> 03:36.450] So, Albon, what is it you do for a living?
[03:36.450 --> 03:39.450] Oh, architecture. But I'm mostly retired.
[03:39.450 --> 03:42.450] Now I run a journal out of Los Angeles called Glass Pane.
[03:42.450 --> 03:43.450] You're French?
[03:43.450 --> 03:46.450] Oh, no. Swiss first, from Geneva.
[03:46.450 --> 03:48.450] Then Paris, then L.A.
[03:49.450 --> 03:52.450] I'm from London first. Then Paris.
[03:52.450 --> 03:53.450] We met there.
[03:53.450 --> 03:54.450] That's how we met.
[03:54.450 --> 03:57.450] But I couldn't get work there, so I made Albon move with me.
[03:58.450 --> 04:00.450] And what do you do?
[04:00.450 --> 04:03.450] Well, I'm an actress, of course.
[04:03.450 --> 04:05.450] Oh, really? She's great.
[04:06.450 --> 04:07.450] For commercials.
[04:07.450 --> 04:09.450] I have a contract with an L.A. company.
[04:09.450 --> 04:11.450] They've been grooming me.
[04:11.450 --> 04:13.450] I specialize in failing naturally.
[04:14.450 --> 04:17.450] What does that mean? Failing naturally?
[04:18.450 --> 04:22.450] Finding a natural-seeming way to fail at any given task.
[04:22.450 --> 04:24.450] In each of the commercials that I'm in,
[04:24.450 --> 04:27.450] I'm the one who simply can't go on without the product.
[04:27.450 --> 04:29.450] It's ridiculous for me not to have the product.
[04:30.450 --> 04:31.450] Okay.
[04:31.450 --> 04:32.450] Show them.
[04:32.450 --> 04:33.450] No.
[04:33.450 --> 04:34.450] No, you should.
[04:34.450 --> 04:35.450] Yeah.
[04:35.450 --> 04:36.450] Please.
[04:36.450 --> 04:37.450] Do you want to see?
[04:37.450 --> 04:38.450] I want to see.
[04:38.450 --> 04:39.450] Here.
[04:42.450 --> 04:43.450] She's amazing.
[04:56.660 --> 05:02.450] I just...
[05:04.450 --> 05:05.450] I...
Standalone Faster-Whisper operation finished in: 38 seconds
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>pause
Press any key to continue . . .
#25 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Code:
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>whisper.exe "C:\_PROG\! Subtitle Tools\! Testfile VoodooFX 2023 05 18\test_ffmpeg6.wav" --language en --model "large" --compute_type=float16
Standalone Faster-Whisper r117 running on: CUDA
2023-05-18 15:09:37.3092070 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:1671 onnxruntime::python::CreateInferencePybindStateModule] Init provider bridge failed.
[00:00.760 --> 00:02.760] Feeling inspired yet?
[00:02.760 --> 00:03.760] No.
[00:03.760 --> 00:08.890] Thank you.
[00:08.890 --> 00:10.890] I thought you said you were hungry.
[00:12.890 --> 00:17.890] There's a boat tour of the Trakla Island formations this afternoon.
[00:17.890 --> 00:22.890] I was thinking we could go on that and make reservations in town for dinner.
[00:23.890 --> 00:25.890] You could try the Chinese place.
[00:25.890 --> 00:28.890] I don't think I'd survive another dinner in town.
[00:29.890 --> 00:31.890] Even the idea that...
[00:31.890 --> 00:33.890] Does anyone think it's a real town?
[00:36.890 --> 00:38.890] Why would they have a Chinese place?
[00:43.280 --> 00:46.280] Is it okay if I go? I'll meet you on the beach.
[00:47.280 --> 00:49.280] Yeah, sure.
[01:22.270 --> 01:24.270] Someone's making a statement.
[01:24.270 --> 01:26.270] One of the locals, I guess.
[01:29.270 --> 01:31.270] What do you think he's trying to say?
[01:31.270 --> 01:35.270] He's saying that he wants to put a long knife right through her.
[01:35.270 --> 01:40.270] And after you die, he'll hang your body at the airport to scare off the other tourists.
[01:42.270 --> 01:44.270] Seems a bit extreme.
[01:46.270 --> 01:49.270] The Latokans are a melodramatic people.
[01:54.740 --> 01:56.740] I loved your book.
[01:58.740 --> 01:59.740] Sorry?
[02:00.740 --> 02:03.740] You're James Foster. I loved your book.
[02:06.740 --> 02:09.740] Sorry, is that good? I don't mean to put you in the spot.
[02:09.740 --> 02:11.740] No, thank you.
[02:11.740 --> 02:13.740] It's just, um...
[02:13.740 --> 02:15.740] Not a lot of people read my book.
[02:15.740 --> 02:17.740] I'm Gabby Bauer.
[02:17.740 --> 02:19.740] I'm James Foster.
[02:21.740 --> 02:22.740] Alvin!
[02:25.520 --> 02:27.520] This is James Foster.
[02:27.520 --> 02:29.520] Hi, nice to meet you. Albon Bauer.
[02:29.520 --> 02:30.520] Pleasure.
[02:30.520 --> 02:32.520] He wrote your book that I love, The Variable Sheath.
[02:32.520 --> 02:34.520] Oh, yeah, I remember.
[02:34.520 --> 02:36.520] I thought it was brilliant.
[02:36.520 --> 02:37.520] Yes.
[02:37.520 --> 02:41.520] James, do you think I could convince you to join us for dinner this evening?
[02:42.520 --> 02:46.520] I've been seeing you around the resort for a few days now and I would love to get to know you.
[02:46.520 --> 02:49.520] We have a reservation tonight at Yang's.
[03:00.450 --> 03:02.450] Yeah, it was a good...
[03:04.450 --> 03:06.450] ...learning experience.
[03:06.450 --> 03:07.450] All right.
[03:07.450 --> 03:10.450] Is there anything else I can get you?
[03:10.450 --> 03:12.450] Um, that's all I think.
[03:12.450 --> 03:14.450] All right, everyone, please have a great meal.
[03:14.450 --> 03:15.450] Thank you.
[03:15.450 --> 03:19.450] And let me know any time if I can make your experience even more enjoyable.
[03:22.450 --> 03:24.450] He's an interesting guy.
[03:24.450 --> 03:25.450] Yes.
[03:25.450 --> 03:29.450] This resort is labelled in the resort guide as a multicultural dining experience.
[03:30.450 --> 03:32.450] Well, it certainly is an experience.
[03:33.450 --> 03:36.450] So, Albon, what is it you do for a living?
[03:36.450 --> 03:39.450] Oh, architecture. But I'm mostly retired.
[03:39.450 --> 03:42.450] Now I run a journal out of Los Angeles called Glass Pane.
[03:42.450 --> 03:43.450] You're French?
[03:43.450 --> 03:46.450] Oh, no. Swiss first, from Geneva.
[03:46.450 --> 03:48.450] Then Paris, then LA.
[03:49.450 --> 03:52.450] I'm from London first. Then Paris.
[03:52.450 --> 03:53.450] We met there.
[03:53.450 --> 03:54.450] That's how we met.
[03:54.450 --> 03:57.450] But I couldn't get work there, so I made Albon move with me.
[03:58.450 --> 04:00.450] And what do you do?
[04:00.450 --> 04:03.450] Well, I'm an actress, of course.
[04:03.450 --> 04:05.450] Oh, really? She's great.
[04:06.450 --> 04:07.450] For commercials.
[04:07.450 --> 04:09.450] I have a contract with an LA company.
[04:09.450 --> 04:11.450] They've been grooming me.
[04:11.450 --> 04:13.450] I specialize in failing naturally.
[04:14.450 --> 04:17.450] What does that mean? Failing naturally?
[04:18.450 --> 04:22.450] Finding a natural-seeming way to fail at any given task.
[04:22.450 --> 04:24.450] In each of the commercials that I'm in,
[04:24.450 --> 04:27.450] I'm the one who simply can't go on without the product.
[04:27.450 --> 04:29.450] It's ridiculous for me not to have the product.
[04:30.450 --> 04:31.450] Okay.
[04:31.450 --> 04:32.450] Show them.
[04:32.450 --> 04:33.450] No.
[04:33.450 --> 04:34.450] No, you should.
[04:34.450 --> 04:35.450] Yeah.
[04:35.450 --> 04:36.450] Please.
[04:36.450 --> 04:37.450] Do you want to see?
[04:37.450 --> 04:38.450] I want to see.
[04:38.450 --> 04:39.450] Here.
[04:42.450 --> 04:43.450] She's amazing.
[04:56.660 --> 05:02.450] I just...
[05:04.450 --> 05:05.450] I...
Standalone Faster-Whisper operation finished in: 21 seconds
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>pause
Press any key to continue . . .
#26 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Code:
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>whisper.exe "C:\_PROG\! Subtitle Tools\! Testfile VoodooFX 2023 05 18\test_ffmpeg6.wav" --language en --model "large" --compute_type=int8
Standalone Faster-Whisper r117 running on: CUDA
2023-05-18 15:11:27.5628509 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:1671 onnxruntime::python::CreateInferencePybindStateModule] Init provider bridge failed.
[00:00.760 --> 00:02.760] Feeling inspired yet?
[00:02.760 --> 00:08.890] No, thank you.
[00:08.890 --> 00:10.890] I thought you said you were hungry.
[00:12.890 --> 00:17.890] There's a boat tour of the Trakla Island formations this afternoon.
[00:17.890 --> 00:22.890] I was thinking we could go on that and make reservations in town for dinner.
[00:23.890 --> 00:25.890] You could try the Chinese place.
[00:25.890 --> 00:28.890] I don't think I'd survive another dinner in town.
[00:29.890 --> 00:31.890] Even the idea that...
[00:31.890 --> 00:33.890] Does anyone think it's a real town?
[00:36.890 --> 00:38.890] Why would they have a Chinese place?
[00:43.280 --> 00:46.280] Is it okay if I go? I'll meet you on the beach.
[00:47.280 --> 00:49.280] Yeah, sure.
[01:22.270 --> 01:24.270] Someone's making a statement.
[01:24.270 --> 01:26.270] One of the locals, I guess.
[01:29.270 --> 01:31.270] What do you think he's trying to say?
[01:31.270 --> 01:35.270] He's saying that he wants to put a long knife right through her.
[01:35.270 --> 01:40.270] And after you die, he'll hang your body at the airport to scare off the other tourists.
[01:42.270 --> 01:44.270] Seems a bit extreme.
[01:46.270 --> 01:49.270] The Latokans are a melodramatic people.
[01:54.740 --> 01:56.740] I loved your book.
[01:58.740 --> 01:59.740] Sorry?
[02:00.740 --> 02:03.740] You're James Foster. I loved your book.
[02:06.740 --> 02:09.740] Sorry, is that good? I don't mean to put you in the spot.
[02:09.740 --> 02:11.740] No, thank you.
[02:11.740 --> 02:13.740] It's just, um...
[02:13.740 --> 02:15.740] Not a lot of people read my book.
[02:15.740 --> 02:17.740] I'm Gabby Bauer.
[02:17.740 --> 02:19.740] I'm James Foster.
[02:21.740 --> 02:22.740] Alvin!
[02:25.520 --> 02:27.520] This is James Foster.
[02:27.520 --> 02:29.520] Hi, nice to meet you. Albon Bauer.
[02:29.520 --> 02:30.520] Pleasure.
[02:30.520 --> 02:32.520] He wrote your book that I love, The Variable Sheath.
[02:32.520 --> 02:34.520] Oh, yeah, I remember.
[02:34.520 --> 02:36.520] I thought it was brilliant.
[02:36.520 --> 02:37.520] Yes.
[02:37.520 --> 02:42.520] James, do you think I could convince you to join us for dinner this evening?
[02:42.520 --> 02:46.520] I've been seeing you around the resort for a few days now and I would love to get to know you.
[02:46.520 --> 02:49.520] We have a reservation tonight at Yang's.
[03:00.450 --> 03:02.450] Yeah, it was a good...
[03:04.450 --> 03:06.450] learning experience.
[03:06.450 --> 03:07.450] All right.
[03:07.450 --> 03:10.450] Is there anything else I can get you?
[03:10.450 --> 03:12.450] Um, that's all I think.
[03:12.450 --> 03:14.450] All right, everyone, please have a great meal.
[03:14.450 --> 03:15.450] Thank you.
[03:15.450 --> 03:20.450] And let me know any time if I can make your experience even more enjoyable.
[03:22.450 --> 03:24.450] He's an interesting guy.
[03:24.450 --> 03:25.450] Yes.
[03:25.450 --> 03:30.450] This resort is labeled in the resort guide as a multicultural dining experience.
[03:30.450 --> 03:33.450] Well, it certainly is an experience.
[03:33.450 --> 03:36.450] So, Albon, what is it you do for a living?
[03:36.450 --> 03:39.450] Oh, architecture. But I'm mostly retired.
[03:39.450 --> 03:42.450] Now I run a journal out of Los Angeles called Glass Pane.
[03:42.450 --> 03:43.450] You're French?
[03:43.450 --> 03:46.450] Oh, no. Swiss first, from Geneva.
[03:46.450 --> 03:48.450] Then Paris, then LA.
[03:49.450 --> 03:52.450] I'm from London first. Then Paris.
[03:52.450 --> 03:53.450] We met there.
[03:53.450 --> 03:54.450] That's how we met.
[03:54.450 --> 03:57.450] But I couldn't get work there, so I made Albon move with me.
[03:58.450 --> 04:00.450] And what do you do?
[04:00.450 --> 04:03.450] Well, I'm an actress, of course.
[04:03.450 --> 04:05.450] Oh, really? She's great.
[04:06.450 --> 04:07.450] For commercials.
[04:07.450 --> 04:09.450] I have a contract with an LA company.
[04:09.450 --> 04:11.450] They've been grooming me.
[04:11.450 --> 04:13.450] I specialize in failing naturally.
[04:14.450 --> 04:17.450] What does that mean? Failing naturally?
[04:17.450 --> 04:22.450] Finding a natural-seeming way to fail at any given task.
[04:22.450 --> 04:24.450] In each of the commercials that I'm in,
[04:24.450 --> 04:27.450] I'm the one who simply can't go on without the product.
[04:27.450 --> 04:29.450] It's ridiculous for me not to have the product.
[04:30.450 --> 04:31.450] Okay.
[04:31.450 --> 04:32.450] Show them.
[04:32.450 --> 04:33.450] No.
[04:33.450 --> 04:34.450] No, you should.
[04:34.450 --> 04:35.450] Yeah.
[04:35.450 --> 04:36.450] Please.
[04:36.450 --> 04:37.450] Do you want to see?
[04:37.450 --> 04:38.450] I want to see.
[04:38.450 --> 04:39.450] Here.
[04:42.450 --> 04:43.450] She's amazing.
[04:56.660 --> 05:02.450] I just...
[05:04.450 --> 05:05.450] I...
Standalone Faster-Whisper operation finished in: 38 seconds
C:\_PROG\! Subtitle Tools\Whisper-Faster_Win.x64_2023.05.13.b117_GPU>pause
Press any key to continue . . .
#27 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Tiny differences. float16 was quicker than int8, and float16 on ffmpeg6.wav was quickest; .aac was slower. As I thought: precision pays off? After all, it is about cross-comparing spectrograms, and tiny losses in density differences can lead to costlier, because more exhaustive, searches.
All of these were run on model large; only this was available on that system for now, and I did not want to let that system onto the internet again after being bluescreened twice by the last 2 forced M$ Win10 updates, the last time leaving me with an unrepairable system. I was not aware that M$ had decided the unspeakable from W10 r1803 on:
NOT to perform any registry backups anymore by default... To save HDD space. WTF?
https://learn.microsoft.com/en-us/tr...regback-folder
NOT to perform any system restore points anymore by default... Even deleting manually made ones. WTF?
https://answers.microsoft.com/en-us/...e-b605ea095ab1
https://answers.microsoft.com/en-us/...9-f6fd51184185
https://learn.microsoft.com/en-us/tr...oints-disabled
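The "tiny differences" between the float16 and int8 logs posted above can be listed mechanically instead of by eye. The following is a minimal sketch, assuming the console output has been saved as text and the trailing "operation finished" status line has been trimmed off; `parse_segments` and `diff_runs` are made-up helper names, and the two sample strings are just the first few segments copied from the posted logs.

```python
import difflib
import re

# One "[mm:ss.mmm --> mm:ss.mmm] text" entry; the text runs until the next
# timestamp marker or the end of the input, so flattened logs work too.
SEGMENT = re.compile(
    r"\[(\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}\.\d{3})\]\s*(.*?)\s*"
    r"(?=\[\d{2}:\d{2}\.\d{3} --> |\Z)",
    re.DOTALL,
)


def parse_segments(log_text):
    """Return a list of (start, end, text) tuples found in the log text."""
    return SEGMENT.findall(log_text)


def diff_runs(log_a, log_b):
    """Return only the segment lines that differ between two runs."""
    lines_a = [f"[{s} --> {e}] {t}" for s, e, t in parse_segments(log_a)]
    lines_b = [f"[{s} --> {e}] {t}" for s, e, t in parse_segments(log_b)]
    # ndiff marks removed lines with "- " and added lines with "+ ";
    # unchanged lines and "? " hint lines are dropped.
    return [line for line in difflib.ndiff(lines_a, lines_b)
            if line.startswith(("- ", "+ "))]


FLOAT16_LOG = (
    "[00:00.760 --> 00:02.760] Feeling inspired yet? "
    "[00:02.760 --> 00:03.760] No. "
    "[00:03.760 --> 00:08.890] Thank you. "
    "[00:08.890 --> 00:10.890] I thought you said you were hungry."
)
INT8_LOG = (
    "[00:00.760 --> 00:02.760] Feeling inspired yet? "
    "[00:02.760 --> 00:08.890] No, thank you. "
    "[00:08.890 --> 00:10.890] I thought you said you were hungry."
)

if __name__ == "__main__":
    for line in diff_runs(FLOAT16_LOG, INT8_LOG):
        print(line)
```

Comparing whole segment lines, timestamps included, also catches cases where only the timing changed and the text stayed the same.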
Last edited by Emulgator; 18th May 2023 at 14:51.
#28 | Link | ||
Banana User
Join Date: Sep 2008
Posts: 1,121
|
I asked for, and need, the medium model and srt files [uploaded somewhere like WeTransfer].
But I'll check these large tests too. [saved, so those posts are not needed anymore]
There are many other quantization types; run --verbose to see all that are supported on your device.
Don't. Use the original audio.
Quote:
Quote:
Benchmarks on short files don't mean much. Did you mean compute types? I'm not sure how they correlate with accuracy or speed. So far, for me int8 looks best, while float32 is fastest. Some users reported the opposite effects. EDIT: Or did you mean something about the audio? That "wav" test file is only there to check some quirks with FFmpeg v6. For some reason, results from v6 can be worse or different; it affects int types.
Last edited by VoodooFX; 18th May 2023 at 15:35.
#29 | Link | ||
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Quote:
Sorry for the ambiguity.
Quote:
and I want to be in control of the decoding precision.
Last edited by Emulgator; 18th May 2023 at 15:42.
#30 | Link |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
@Emulgator Could you do a few more tests on aac with CUDA: "--language en --model=large --compute_type=float32" and "--language en --model=medium --compute_type=float16"?
[Results in the same form as your previous tests.] Btw, for your own tests you can try "--beam_size=5"; it's slower but should produce better results.
Last edited by VoodooFX; 20th May 2023 at 14:05.
#31 | Link |
Big Bit Savings Now !
Join Date: Feb 2007
Location: close to the wall
Posts: 1,894
|
Soon (...still trying to get my main system up and running as before)
#32 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 11,111
|
These are basically the two that I've tried [only on a few occasions, maybe 5 or 6],
Code:
Whisper-Faster\whisper.exe --model_dir ".\_models" --language en --model "large-v2" ".\audio.wav"
Whisper-Faster\whisper.exe --model_dir ".\_models" --language en --model "large-v2" ".\audio.dts"
It's weird how some subs are flagged <during non-talkative periods> maybe up to a minute ahead of the actual start of speech, and stop pretty much at the end of speech. Also, Quote:
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 13th July 2023 at 23:08. |
#33 | Link | |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
Quote:
The model_dir parameter is redundant in your example, at least in the latest version.
You probably won't see that in the latest version.
Not odd: Whisper models don't support transcription of multilingual audio. You can try processing it twice, first with the English parameter, then with the Spanish one.
#35 | Link |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
Yes, it's enabled by default. It includes all those things.
#36 | Link |
Registered User
Join Date: Feb 2017
Posts: 149
|
But the examples shown in the OP screenshot and by Emulgator do not show this. They're all specific second-based intervals that are seemingly locked into a particular fraction-of-a-second start point.
Last edited by SaurusX; 14th July 2023 at 16:03. |
#37 | Link | |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
Quote:
Post your screenshot of what your "original Whisper" shows.
#38 | Link | |
Registered User
Join Date: Feb 2017
Posts: 149
|
Quote:
https://github.com/openai/whisper
When using their CLI I add "--word_timestamps True", and the timing of each sentence or segment is more precise, usually to the fraction of a second, though it can hiccup. I'll add some screenshots later today when I get to my computer.
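To illustrate why word-level timing can tighten a segment, here is a small sketch that snaps each segment to the start of its first word and the end of its last word. It assumes segments shaped like the dictionaries openai/whisper returns when word timestamps are enabled (a "words" list whose items have "start" and "end"); the helper name `tighten_segments` and the sample numbers are made up, and this is not claimed to be what either tool does internally.

```python
def tighten_segments(segments):
    """Snap each segment's start/end to its first and last word, if present."""
    tightened = []
    for seg in segments:
        words = seg.get("words") or []
        if words:
            start, end = words[0]["start"], words[-1]["end"]
        else:
            # No word-level data: keep the coarse segment timing.
            start, end = seg["start"], seg["end"]
        tightened.append({"start": start, "end": end, "text": seg["text"].strip()})
    return tightened


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real run.
    sample = [
        {
            "start": 0.0, "end": 2.0, "text": " Feeling inspired yet?",
            "words": [
                {"word": " Feeling", "start": 0.76, "end": 1.10},
                {"word": " inspired", "start": 1.10, "end": 1.60},
                {"word": " yet?", "start": 1.60, "end": 1.92},
            ],
        },
        {"start": 2.0, "end": 3.0, "text": " No."},
    ]
    for seg in tighten_segments(sample):
        print(f'{seg["start"]:.2f} --> {seg["end"]:.2f}  {seg["text"]}')
```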
#39 | Link |
Banana User
Join Date: Sep 2008
Posts: 1,121
|
Yep, it includes that. The screenshot in the first post is an old one.
Last edited by VoodooFX; 14th July 2023 at 19:58.
#40 | Link |
Registered User
Join Date: Feb 2017
Posts: 149
|
I was getting a DLL error saying that "cudnn_ops_infer64_8.dll" was missing and that I should put it into my system path. I downloaded it from this zip and dropped it into my CUDA bin folder.
https://developer.download.nvidia.co.../cudnn/v8.3.0/
The word_timestamps option is working as you said it would. Doing other tests now with the different model sizes.
OK, that's fast. Using the large-v2 model!
Using the medium.en model.
Last edited by SaurusX; 15th July 2023 at 00:38.
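Before hunting for a download, it can help to check whether the DLL is already sitting in some folder on the search path. A minimal sketch, assuming a plain walk over the PATH entries is good enough (Windows also looks in other places, such as the program's own folder); `find_on_path` is a made-up helper name.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the first PATH entry that contains filename, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(os.pathsep):
        directory = directory.strip('"')
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    dll = "cudnn_ops_infer64_8.dll"
    location = find_on_path(dll)
    print(location if location else f"{dll} not found on PATH")
```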