Old 30th May 2023, 04:55   #1  |  Link
SilSinn9801
Chiptuner & VapourSynther
 
 
Join Date: Mar 2019
Location: Scarlet Devil Mansion, Gensōkyō
Posts: 52
How to do nearest-neighbor audio upsampling in FFmpeg?

Hello,

I have hundreds of one-minute-long MP4 files ripped from a WYZE security camera, & their monaural audio is encoded in A-Law ADPCM at 16000 Hz. I want to batch-decode the audio from all those 100s of MP4s to PCM (16-bit signed, little endian) & then batch-upsample from 16000 to 48000 Hz WITHOUT interpolation of any kind – that is, nearest-neighbor upsampling (instead of linear, cubic, or any other form of sample interpolation), & because 48000 is an integer multiple of 16000 (3× to be exact), the output should have no rounding errors. Effectively, I want each input sample value to repeat three times at the output before moving from one audio sample point to the next. The question is, how do I batch-do this in FFmpeg?

For context: trying FFmpeg’s -ar 48000 option always results in linear interpolation (or something close to it) between sample points, & trying instead
Code:
-af aresample=isr=16000:linear_interp=0:osr=48000
also always results in linear upsampling (in spite of the linear_interp=0 part, unless that part means a different thing).
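
To pin down the output being asked for, here is a minimal Python sketch of the sample-repetition step on its own, outside FFmpeg. The function name `hold_upsample` is made up for illustration, and the samples are assumed to be already decoded to plain integers.

```python
def hold_upsample(samples, factor=3):
    """Repeat every sample `factor` times (nearest-neighbor / zero-order hold)."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out


# Three samples at 16000 Hz become nine samples at 48000 Hz.
print(hold_upsample([100, -50, 25]))
# prints [100, 100, 100, -50, -50, -50, 25, 25, 25]
```

Because 48000 / 16000 is exactly 3, every output value is a copy of an input value, so nothing is rounded.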
__________________
SilSinn9801 a.k.a. Silent Sinner in Scarlet
Discord: silsinn9801
Matrix: silsinn9821:matrix.org
YouTube: https://youtube.com/SilentSinnerInScarlet
ニコニコ動画: https://nicovideo.jp/user/68029427

Last edited by SilSinn9801; 30th May 2023 at 05:13. Reason: adding for context the behavior of both "-ar 48000" & "-af aresample" options in FFmpeg; fixing formatting from [CODE/] to [FONT/]
Old 30th May 2023, 13:12   #2  |  Link
richardpl
Registered User
 
Join Date: Jan 2012
Posts: 272
Upsampled that way, the audio would sound worse than the original, non-upsampled version.
The correct minimal upsample is to insert two zeros after every original sample (so the 2nd and 3rd sample of each output triplet are zero), which mirrors the original low frequencies into the new higher band as spectral images.
Then, optionally, apply a lowpass filter, which can be done in many ways depending on the speed/quality trade-off you want.
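
A rough Python sketch of the zero-insertion step described above, with a deliberately crude lowpass after it. The names `zero_stuff` and `boxcar` are hypothetical, and the 3-tap moving sum is only the simplest possible stand-in for a lowpass, not a recommended filter.

```python
def zero_stuff(samples, factor=3):
    """Insert factor-1 zeros after every input sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0] * (factor - 1))
    return out


def boxcar(samples, length=3):
    """Causal moving sum: y[n] = x[n] + x[n-1] + ... + x[n-length+1]."""
    out = []
    for n in range(len(samples)):
        start = max(0, n - length + 1)
        out.append(sum(samples[start:n + 1]))
    return out


stuffed = zero_stuff([5, -2, 7])
print(stuffed)          # prints [5, 0, 0, -2, 0, 0, 7, 0, 0]
print(boxcar(stuffed))  # prints [5, 5, 5, -2, -2, -2, 7, 7, 7]
```

With this particular 3-tap moving sum the result is the same as repeating each sample three times, so sample repetition can be read as zero insertion followed by a very weak lowpass. A longer, better filter after `zero_stuff` would suppress the mirrored frequencies much more strongly.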
Old 30th May 2023, 16:41   #3  |  Link
Kisa_AG
Registered User
 
Join Date: Sep 2005
Location: Moscow, Russia
Posts: 65
Quote:
Originally Posted by SilSinn9801 View Post
Effectively, I want each input sample value to repeat three times at the output before moving from one audio sample point to the next. The question is, how do I batch-do this in FFmpeg?

Code:
ffmpeg.exe -i Input.wav -af aresample=resampler=swr:filter_size=1:phase_shift=0:linear_interp=0 -ar 48000 -c:a pcm_s16le Out.wav

It seems to me that this does exactly what you described.
But I don't think you'll like the result :)
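
For the batch part of the question, one possible approach is a small Python wrapper that runs the same command once per file. This is only a sketch: the filter string is copied from the command in this post, while the `-vn` flag (to skip the video stream of the MP4s), the folder layout, and the names `build_command` and `convert_folder` are assumptions added for illustration. It assumes `ffmpeg` is on the PATH.

```python
import subprocess
from pathlib import Path

# Filter string copied from the command in this post.
HOLD_FILTER = "aresample=resampler=swr:filter_size=1:phase_shift=0:linear_interp=0"


def build_command(src, ffmpeg="ffmpeg"):
    """Return the ffmpeg argument list for one input file."""
    src = Path(src)
    dst = src.with_suffix(".wav")  # clip.mp4 -> clip.wav, same folder
    return [
        ffmpeg, "-i", str(src),
        "-vn",                     # assumption: ignore the video stream
        "-af", HOLD_FILTER,
        "-ar", "48000",
        "-c:a", "pcm_s16le",
        str(dst),
    ]


def convert_folder(folder):
    """Run the command once for every .mp4 found directly in `folder`."""
    for src in sorted(Path(folder).glob("*.mp4")):
        subprocess.run(build_command(src), check=True)
```

Calling `convert_folder("recordings")` (a hypothetical folder name) would write one WAV next to each MP4. It is worth checking the first output in an audio editor to confirm that each sample really is repeated three times before converting everything.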

[Attached image: screenshot]
 

Last edited by Kisa_AG; 30th May 2023 at 16:57.
Old 30th May 2023, 19:17   #4  |  Link
SilSinn9801
Chiptuner & VapourSynther
 
 
Join Date: Mar 2019
Location: Scarlet Devil Mansion, Gensōkyō
Posts: 52
Thankee.

If the result you’re talking about is a crunchy sound (rather than a smooth sound), yes I know that, but in the chiptune oscilloscope-view community, in many cases such crunchiness is a desired trait, especially for oscilloscope visualization.

To make an analogy or two:
Crunchy (nearest-neighbor-upsampled) sound
is to
Smooth (linear- or sinc-interpolation-upsampled) sound
what
Pixelated (nearest-neighbor-upscaled) image
is to
High-def (bilinear-/bicubic-/Lanczos-/etc.-upscaled) image
& also what
Repeated frames (in film-to-50/60fps upconversion)
are to
Motion-interpolated (or motion-compensated) frames.

Which one of them is desired depends on the target application as well as the user’s preferential tastes.

Last edited by SilSinn9801; 30th May 2023 at 19:31. Reason: Making some analogies
Old 1st June 2023, 07:30   #5  |  Link
junh1024
Registered User
 
Join Date: Mar 2011
Posts: 59
Silsinn, your analogies aren't accurate. Video is a sampled discrete signal, but audio is meant to be a continuously changing signal. So, if you hold the sample value like NN, you'll actually change the signal in ways you probably don't want, as Kisa's picture shows.

If you want it to sound the same, the defaults of most sample-rate converters (SRC) should do. If you want to add harmonics, try linear.
Old 1st June 2023, 19:50   #6  |  Link
pandy
Registered User
 
Join Date: Mar 2006
Posts: 1,049
Both methods are fine; zero-order hold can be used for upsampling.
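
To put a rough number on what zero-order hold does to the spectrum, the sketch below evaluates the magnitude response of "repeat each sample 3 times" at the 48000 Hz output rate. The function name `hold_gain` is made up for illustration; the formula is just the response of a 3-tap filter with all taps equal to 1.

```python
import cmath
import math


def hold_gain(freq_hz, factor=3, out_rate=48000):
    """Magnitude response of repeating each sample `factor` times."""
    w = 2 * math.pi * freq_hz / out_rate
    return abs(sum(cmath.exp(-1j * w * k) for k in range(factor)))


for f in (0, 8000, 16000):
    print(f, round(hold_gain(f), 6))
```

The gain is 3 at 0 Hz, falls to 2 at 8000 Hz (the top of the original band), and reaches zero only at exactly 16000 Hz, so the mirrored copies of the original spectrum on either side of 16000 Hz are attenuated but not removed. That leftover content is the "crunchy" character discussed earlier in the thread.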

Tags
ffmpeg audio, interpolation, nearest neighbor, nearest neighbour, upsampling
