Reputation: 111
I am trying to do a naive volume adjustment of a sound file. I am using Python 2.7 and the following libraries:
import numpy as np
import scipy.io.wavfile as wv
import matplotlib.pyplot as plt
import pyaudio
import wave
I have tried two approaches to amplify the sound by a factor of 2, i.e. n=2. The first is an altered dynamic range limiter, adapted from here (http://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html):
def limiter(self, n):
    #best version so far
    signal=self.snd_array
    attack_coeff = 0.01
    framemax=2**15-1
    threshold=framemax
    for i in np.arange(len(signal)):
        #if amplitude value * amplitude gain factor is > threshold set an interval to decrease the amplitude
        if signal[i]*n > threshold:
            gain=1
            jmin=0
            jmax=0
            if i-100>0:
                jmin=i-100
            else:
                jmin=0
            if i+100<len(signal):
                jmax=i+100
            else:
                jmax=len(signal)
            for j in range(jmin,jmax):
                #target gain is amplitude factor times exponential to smoothly decrease the amp factor (n)
                target_gain = n*np.exp(-10*(j-jmin))
                gain = (gain*attack_coeff + target_gain*(1-attack_coeff))
                signal[j]=signal[j]*gain
        else:
            signal[i] = signal[i]*n
    print max(signal),min(signal)
    plt.figure(3)
    plt.plot(signal)
    return signal
The second is a method where I apply hard-knee compression to reduce the amplitude of samples above a threshold, and then amplify the whole signal by an amplitude gain factor.
def compress(self,n):
    print 'start compress'
    threshold=2**15/n+1000
    #compress all values above the threshold, therefore limiting the audio amplitude range
    for i in np.arange(len(self.snd_array)):
        if abs(self.snd_array[i])>threshold:
            factor=1+(threshold-abs(self.snd_array[i]))/threshold
        else:
            factor=1.0
        #apply compression factor and amp gain factor (n)
        self.snd_array[i] = self.snd_array[i]*factor*n
    print np.min(self.snd_array),np.max(self.snd_array)
    plt.figure(2)
    plt.plot(self.snd_array,'k')
    return self.snd_array
In both methods the output sounds distorted. Where the amplitude is near the threshold, the music sounds clipped and crackly. I think this is because the waveform "flattens" out near the threshold value. I tried applying an exponential in the limiter function, but it does not remove the crackling completely, even when I make it decay very quickly. If I change n to 1.5, the sound is not distorted. If anyone could give me pointers on how to remove the crackling distortion, or links to other volume modulation code, that would be much appreciated.
Upvotes: 2
Views: 10978
Reputation: 3930
It might not be 100% on topic, but maybe this is interesting for you anyway. If you do not need real-time processing, things become much easier. Limiting and dynamic compression can be seen as applying a dynamic transfer function, which simply maps input sample values to output sample values. A linear function returns the original audio, while a "curved" function performs compression or expansion. Applying a transfer function is as simple as
import numpy as np
from scipy.interpolate import interp1d
from scipy.io import wavfile
def apply_transfer(signal, transfer, interpolation='linear'):
    # x-axis grid on which the transfer curve is sampled (inputs must lie in [-1, 1])
    constant = np.linspace(-1, 1, len(transfer))
    interpolator = interp1d(constant, transfer, interpolation)
    return interpolator(signal)
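As a quick sanity check of the idea, here is a minimal sketch (not part of the code above) that swaps interp1d for NumPy's np.interp, which only does linear interpolation. The name apply_transfer_interp and the tanh-shaped curve are assumptions made for illustration. It shows a linear transfer leaving the samples unchanged and a curved one lifting quiet samples relative to loud ones:

```python
import numpy as np

def apply_transfer_interp(signal, transfer):
    # Hypothetical variant of apply_transfer using np.interp (linear only).
    grid = np.linspace(-1, 1, len(transfer))
    return np.interp(signal, grid, transfer)

# A few float samples, already scaled to the range [-1, 1].
signal = np.array([-1.0, -0.5, 0.0, 0.25, 1.0])
grid = np.linspace(-1, 1, 1001)

# Linear (identity) transfer curve: output equals input.
unchanged = apply_transfer_interp(signal, grid)

# Curved transfer (assumed tanh shape, normalized so that 1 maps to 1):
# quiet samples are raised, full-scale samples stay at full scale.
curved = np.tanh(2 * grid) / np.tanh(2)
compressed = apply_transfer_interp(signal, curved)

print(unchanged)
print(compressed)
```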
Limiting or compression is then just a matter of choosing a different transfer function:
# hard limiting
def limiter(x, threshold=0.8):
    transfer_len = 1000
    # flat at -1, linear ramp from -1 to 1, flat at +1
    transfer = np.concatenate([
        np.repeat(-1, int(((1-threshold)/2)*transfer_len)),
        np.linspace(-1, 1, int(threshold*transfer_len)),
        np.repeat(1, int(((1-threshold)/2)*transfer_len))
    ])
    return apply_transfer(x, transfer)
# smooth compression: if factor is small, it's nearly linear; the bigger it is,
# the stronger the compression
def arctan_compressor(x, factor=2):
    constant = np.linspace(-1, 1, 1000)
    transfer = np.arctan(factor * constant)
    transfer /= np.abs(transfer).max()  # normalize so the curve spans [-1, 1]
    return apply_transfer(x, transfer)
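To get a feel for the factor parameter, here is a small sketch that measures how far the normalized arctan curve departs from a straight line. The names arctan_curve and deviations are hypothetical and only used for this illustration:

```python
import numpy as np

def arctan_curve(factor, num_points=1001):
    # Hypothetical helper: the normalized arctan transfer curve on [-1, 1].
    grid = np.linspace(-1, 1, num_points)
    curve = np.arctan(factor * grid)
    return grid, curve / np.abs(curve).max()

# Largest distance between the curve and the identity line, per factor.
deviations = {}
for factor in (0.1, 2, 10):
    grid, curve = arctan_curve(factor)
    deviations[factor] = np.abs(curve - grid).max()
    print("factor=%g  max deviation from linear=%.3f" % (factor, deviations[factor]))
```

A small factor should give a deviation close to zero (almost no compression), and the deviation grows as the factor increases. The curve stays monotonic and bounded by -1 and 1, so it cannot produce the hard "flat top" of a clipped waveform.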
The following example assumes 16-bit mono WAV files as input:
sr, x = wavfile.read("input.wav")
x = x.astype(np.float64)  # convert first: avoids integer division on int16 data under Python 2
x = x / np.abs(x).max()  # scale x to between -1 and 1
x2 = limiter(x)
x2 = np.int16(x2 * 32767)
wavfile.write("output_limit.wav", sr, x2)
x3 = arctan_compressor(x)
x3 = np.int16(x3 * 32767)
wavfile.write("output_comp.wav", sr, x3)
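One reason to do the processing on floats in [-1, 1] rather than on the raw int16 samples: NumPy integer arrays wrap around silently on overflow, which may be one source of crackle when raw samples are multiplied by a gain. The sketch below illustrates this and shows a clipping conversion back to int16. The helper name to_int16 and the sample values are assumptions for illustration only:

```python
import numpy as np

raw = np.array([20000, -20000, 1000], dtype=np.int16)

# Multiplying int16 by int16 stays in int16, so 40000 does not fit
# and wraps around to -25536 (and -40000 wraps to 25536).
wrapped = raw * np.int16(2)

def to_int16(x):
    # Hypothetical helper: clip float samples to [-1, 1], then scale to int16.
    clipped = np.clip(x, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)

# Apply the gain in floating point instead, then convert with clipping.
scaled = raw.astype(np.float64) / 32768.0 * 2
safe = to_int16(scaled)

print(wrapped)
print(safe)
```

Clipping still distorts loud peaks, so in practice a soft curve such as the arctan compressor would be applied before this conversion; the helper only guarantees that nothing wraps around.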
Maybe this clean offline code helps you benchmark your real-time code.
Upvotes: 12