20th March 2014, 02:25   #1
Smite
Registered User
 
Join Date: Mar 2014
Posts: 1
Add a "Who's talking" display to video

I've been attempting this on my own but haven't gotten anything to work, and I haven't found a discussion that points me in the right direction. I'm looking for a way to show visually when sound is coming from a particular audio clip. I'd appreciate any advice anyone might have. If you've ever seen a Mumble or Skype HUD displaying this sort of info during a large call, that is essentially what I'm looking to achieve.

Background: I put together a lot of video game footage of multiple players in a grid format; the audio is their commentary. (example 4-person layout) The audio is all individual files until I mix them together, which I think is necessary for what I want to achieve here.

So I've been trying to find a way with avisynth to do a simple effect for when an individual is talking, such as temporarily changing the color of their name or displaying a speaker icon. This would be beneficial because I generally create groups of 4-8 and it would be useful to see who is speaking visually.

I figure that if I can find a way to check an audio clip for dramatic changes in volume, I can possibly apply these periodic effects. Is there a way? I've tooled around with avisynth for a while now but I think what I'm looking to do is beyond me to work out.
20th March 2014, 02:45   #2
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,901
That sounds hard to do in Avisynth. If I had to accomplish this to save my life, I would first preprocess the separate audio files to create a "talking map". Then I would write an avisynth filter to use the talking map to overlay a sprite(s) on the video in the appropriate position(s).
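The "talking map" preprocessing step could look roughly like the following. This is a minimal Python sketch and not something posted in this thread: the function names (rms_db, talking_map, read_mono_16bit), the -50 dB default threshold and the restriction to mono 16-bit PCM WAV files are all assumptions made for illustration.

```python
import math
import struct
import wave


def rms_db(samples):
    """RMS level of 16-bit samples in dB relative to 16-bit full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))


def talking_map(samples, sample_rate, fps, threshold_db=-50.0):
    """Return one flag per video frame: True where the RMS level of the
    audio belonging to that frame is above threshold_db."""
    n_frames = math.ceil(len(samples) * fps / sample_rate)
    flags = []
    for n in range(n_frames):
        # Compute the sample range from the frame index so that a
        # non-integer samples-per-frame ratio does not drift over time.
        start = int(n * sample_rate / fps)
        end = int((n + 1) * sample_rate / fps)
        flags.append(rms_db(samples[start:end]) > threshold_db)
    return flags


def read_mono_16bit(path):
    """Read a mono 16-bit PCM WAV file; returns (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
        rate = wav.getframerate()
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return samples, rate
```

For one speaker this would be used as `samples, rate = read_mono_16bit("a1.wav")` followed by `flags = talking_map(samples, rate, fps=30)`, where the file name and frame rate are placeholders. The resulting per-frame flags could then be written to a text file for the video side to read back, for example with AviSynth's ConditionalReader, though that part is not sketched here.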
20th March 2014, 03:24   #3
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,345
Is it noise/speech vs silence ?

One way to do it is with conditionalfilter() and minmaxaudio.dll using AudioRMS()

You need 4 individual sets of audio & video, and you can combine them afterwards with stackhorizontal/stackvertical

The replacement "layer" can be anything, you can overlay a logo, make a pointer, overlay a border like skype, change the color whenever audio is detected above a threshold value

In this example, I used the alternating audio on/off from colorbars() channel 2 as the audio source, and the "replacement" whenever audio is detected is just a darkened version

Code:
colorbars(pixel_type="yv12")
trim(0,300)
getchannel(2)
orig=last

replace=orig.levels(0,0.2,255,0,255,false)

ConditionalFilter(orig,replace,orig,"AudioRMS(0)", ">", "-50", show=true)
20th March 2014, 08:43   #4
raffriff42
Retried Guesser
 
Join Date: Jun 2012
Posts: 1,373
@poisondeathray, good idea, but I have some refinements: variable transparency and a little decay time to cut down on flickering.
Code:
LoadPlugin("MinMaxAudio\Release\MinMaxAudio.dll")

A1=WavSource("a1.wav") ## uncompressed audio is a lot faster due to runtime analysis!
A2=WavSource("a2.wav")
#A3=...

AviSource("v.avi")

debug=true ## set to true for adjusting the mask windows

overlay_1 = Subtitle("VOICE ONE", x=24, y=32, size=56)
AudioLevelOverlay(A1, overlay_1, 
\  16, 26, 512, 80, showmask=debug)

overlay_2 = Subtitle("VOICE TWO", x=Width-524, y=32, size=56)
AudioLevelOverlay(A2, overlay_2, 
\  Width-532, 26, 512, 80, showmask=debug)

#overlay_3 = ...

#AudioDub(final_audio_mix)
return Last

##################################
### show overlay clip only when there is audio
### http://forum.doom9.org/showthread.php?p=1674312#post1674312
##
## @ C - base clip
## @ A - audio 
## @ O - overlay 
## @ x, y, wid, hgt - position & size of mask window
## @ boost - overall level boost (fudge factor) (default 18)
## @ gate - ignore audio under (-gate+boost) dB; (default 20)
##   NOTE "boost" and "gate" are shared among all instances of
##   this function (thanks Gavino). 
##   * if the overlay does not get fully opaque, increase boost;
##   * if the overlay shows up when it shouldn't, increase gate.
##   For example, I used boost=24, gate=12 on a muddy source.
## @ showmask - for setting window size & position
##
function AudioLevelOverlay(clip C, clip A, clip O, 
\            int x, int y, int wid, int hgt,
\            int "boost", int "gate", bool "showmask", string "mode")
{
    Assert(O.Width==C.Width && O.Height==C.Height, 
    \  "AudioLevelOverlay: overlay must be same size as base clip")
    global boost = Min(Max( 0, Default(boost, 18)), 24)
    global gate  = Min(Max(0, Default(gate,  20)), 60)
    showmask = Default(showmask, false)
    mode = Default(mode, "blend")

    AudioDub(C, A.AmplifyDB(-6).AudioEcho.Normalize(1))

    S = ScriptClip(Last.Crop(0, 0, wid, hgt), """
        x = Min(Max(0, Round(AudioRMS(0))+gate+boost), 255)*255/gate 
        return Last.BlankClip(color=to_rgb(x))""")
    M = Overlay(C.BlankClip, S, x=x, y=y).ConvertToY8

    return (showmask) 
    \ ? C.Overlay(M, opacity=0.5, mode="add")
    \ : C.Overlay(O, mask=M, mode=mode)
}

function AudioEcho(clip A, float "delay", float "mix") {
    delay = Min(Max(0.01, Float(Default(delay, 0.33))), 5.0)
    mix = Min(Max(0.0, Float(Default(mix, 0.33))), 1.0)
    return A.MixAudio(A.AudioTrim(0, delay)+A, (1.0-mix), mix)
}

function to_rgb(int r, int "g", int "b") {
    r = Min(Max(0, r), 255) ## thanks Gavino
    g = Min(Max(0, Default(g, r)), 255)
    b = Min(Max(0, Default(b, r)), 255)
    return (r*65536) + (g*256) + b 
}
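
How the echo produces a decay time may be easier to see on a toy signal. The sketch below is Python, not AviSynth, and only imitates what AudioEcho above does, using plain lists of sample values. The names audio_echo and delay_samples are illustrative assumptions; the real filter chain works on audio clips, with the delay given in seconds.

```python
def audio_echo(samples, delay_samples, mix=0.33):
    """Mix a signal with a copy of itself delayed by delay_samples.

    The delayed copy is built the same way as in the script: the first
    delay_samples of the signal are prepended to the signal itself.
    The output is truncated to the length of the input.
    """
    delayed = samples[:delay_samples] + samples
    return [(1.0 - mix) * dry + mix * wet for dry, wet in zip(samples, delayed)]


# A burst of "speech" followed by silence.
burst = [1.0] * 4 + [0.0] * 4
echoed = audio_echo(burst, delay_samples=2, mix=0.5)
# echoed == [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.0, 0.0]
```

The two 0.5 values after the burst are the echo: the level stays above zero for the length of the delay after the speaker stops, so a mask derived from it fades out one step later instead of switching off at once.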

Last edited by raffriff42; 22nd March 2014 at 05:11. Reason: changes in blue
20th March 2014, 11:37   #5
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Neat idea, raffriff42 (and poisondeathray for the basic method).

Note that the global variables will cause problems if you ever want to call AudioLevelOverlay() more than once in a script with different values of 'gate' and/or 'boost'. A better way to pass arguments into a run-time script is to use the 'args' parameter of the GRunT run-time filters.

Also, the setting of the mask levels seems to be incorrect.
Quote:
Originally Posted by raffriff42
Code:
        x = Max(0, Round(AudioRMS(0))+gate+boost)*255/gate
If I understand, the intention is to make the overlay start to become visible for audio levels above -gate, and fully opaque when it reaches -boost. However, with this code it becomes visible at -(gate+boost) and the opacity overshoots 255 (and hence wraps around to zero) at -boost.
(Perhaps function to_rgb() should limit its arguments to 255).
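
To make the numbers concrete, here is the arithmetic of the quoted line worked through in Python rather than AviSynth, as a rough sketch. It assumes the script's defaults of gate=20 and boost=18 and integer-valued RMS readings. The function names are made up for the illustration.

```python
def mask_level_original(rms_db, gate=20, boost=18):
    """Mask value as computed by the quoted line: integer arithmetic,
    clamped at zero from below but with no upper limit."""
    return max(0, round(rms_db) + gate + boost) * 255 // gate


def mask_level_clamped(rms_db, gate=20, boost=18):
    """The same ramp, limited to the 0..255 range of an 8-bit mask."""
    return min(mask_level_original(rms_db, gate, boost), 255)


# With gate=20 and boost=18:
#   -38 dB = -(gate+boost) -> 0    (overlay starts to become visible here)
#   -28 dB                 -> 127
#   -18 dB = -boost        -> 255  (fully opaque)
#   -10 dB                 -> 357  (past 255 unless clamped)
levels = [mask_level_original(db) for db in (-38, -28, -18, -10)]
```

So the ramp runs from -(gate+boost) up to -boost, and anything louder than -boost needs the upper clamp to stay a valid mask value.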
__________________
GScript and GRunT - complex Avisynth scripting made easier
20th March 2014, 12:44   #6
ajk
Registered User
 
Join Date: Jan 2006
Location: Finland
Posts: 134
An alternate suggestion - You could use a plugin such as AudioGraph() to add a visual representation of the audio on each of the clips. With a bit of other AviSynth scripting you could make it as visible or unobtrusive as you prefer.
20th March 2014, 23:43   #7
raffriff42
Retried Guesser
 
Join Date: Jun 2012
Posts: 1,373
Thanks for the help, Gavino. I've fixed (in blue) some of the issues you mention, but not this one:
Quote:
Originally Posted by Gavino
A better way to pass arguments into a run-time script is to use the 'args' parameter of the GRunT run-time filters.
I confess I have not tried GRunT. How do I get started with it?
For now, I have left the globals as they are, with a caveat.

Re: gate & boost, I messed around trying to meet your specification, but then I realized I was originally thinking of a microphone "boost" switch on an audio mixer, active *before* the noise gate. So your second wording describes what the settings do, except that the wraparound problem is fixed.
21st March 2014, 00:32   #8
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by raffriff42
I confess I have not tried GRunT. How do I get started with it?
Basically, loading (or auto-loading) GRunT.dll replaces the built-in run-time filters (ScriptClip, etc) with extended versions that have a couple of extra arguments, providing (among other things) a simple, natural and robust way to pass variables into a run-time script from 'outside'.

In your function, you could use:
Code:
    S = ScriptClip(Last.Crop(0, 0, wid, hgt), """
        x = Min(Max(0, Round(AudioRMS(0))+gate+boost)*255/gate, 255)
        return Last.BlankClip(color=to_rgb(x))""", args="gate,boost")
For more details, see the description in the GRunT thread (and the supplied doc), which should hopefully tell you all you need to know.
__________________
GScript and GRunT - complex Avisynth scripting made easier
21st March 2014, 17:37   #9
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,493
Quote:
Originally Posted by ajk
An alternate suggestion - You could use a plugin such as AudioGraph()
Or waveform, which supports more (and looks nicer with some) colour spaces and has a few extra features

David
__________________
My AviSynth filters / I'm the Doctor
21st March 2014, 19:38   #10
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
+1 on Waveform, much nicer.
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???
22nd March 2014, 04:56   #11
raffriff42
Retried Guesser
 
Join Date: Jun 2012
Posts: 1,373
Waveform works nicely - for example you could size & move each audio waveform under the matching speaker's name.

Here's a demo video for the script above: I'm no animator, but I managed to prepare a still "base" image and 2 "glow" versions, one for each speaker.

"Who's talking" effect - avisynth (youtu.be)
This task required Overlay mode="lighten" because the glow effects overlap one another, with no mask rectangle used (ie, rectangle = full screen), so the script has a new "mode" option.

Another idea - modulate *position* instead of opacity; make an overlay "sprite" jiggle when someone talks.

Last edited by raffriff42; 18th March 2017 at 00:54. Reason: (fixed image link)
All times are GMT +1.