The Dark Bug Returns

Reputation: 111

How to amplify sounds without distortion in Python

I am trying to do naive volume adjustment of a sound file. I am using Python 2.7 and the following libraries:

import numpy as np
import scipy.io.wavfile as wv
import matplotlib.pyplot as plt
import pyaudio
import wave

I have tried two approaches to amplify the sound by a factor of 2, i.e. n=2. The first is an altered dynamic range limiter approach from here (http://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html):

def limiter(self, n):
    #best version so far
    signal=self.snd_array
    attack_coeff = 0.01
    framemax=2**15-1
    threshold=framemax
    for i in np.arange(len(signal)):
        #if amplitude value * amplitude gain factor is > threshold set an interval to decrease the amplitude
        if signal[i]*n > threshold:
            gain=1
            jmin=0
            jmax=0
            if i-100>0:
                jmin=i-100
            else:
                jmin=0
            if i+100<len(signal):
                jmax=i+100
            else:
                jmax=len(signal)
            for j in range(jmin,jmax):
                #target gain is amplitude factor times exponential to smoothly decrease the amp factor (n)
                target_gain = n*np.exp(-10*(j-jmin))
                gain = (gain*attack_coeff + target_gain*(1-attack_coeff))
                signal[j]=signal[j]*gain
        else:
            signal[i] = signal[i]*n
    print max(signal),min(signal)
    plt.figure(3)
    plt.plot(signal)
    return signal

The second is a method where I do hard knee compression to decrease the amplitude of sound values above a threshold and then I amplify the whole signal by an amplitude gain factor.

def compress(self,n):
    print 'start compress'
    threshold=2**15/n+1000
    #compress all values above the threshold, therefore limiting the audio amplitude range
    for i in np.arange(len(self.snd_array)):
        if abs(self.snd_array[i])>threshold:
            factor=1+(threshold-abs(self.snd_array[i]))/threshold
        else:
            factor=1.0
        #apply compression factor and amp gain factor (n)
        self.snd_array[i] = self.snd_array[i]*factor*n
    print np.min(self.snd_array),np.max(self.snd_array)
    plt.figure(2)
    plt.plot(self.snd_array,'k')
    return self.snd_array

In both methods the file sounds distorted. At points where the amplitude is near the threshold, the music sounds clipped and crackly. I think this is because the signal "flattens" out near the threshold value. I tried applying an exponential in the limiter function, but it does not remove the crackling completely, even when I make it decrease very quickly. If I change to n=1.5, the sound is not distorted. If anyone could give me pointers on how to remove the crackling distortion, or links to other volume modulation code, that would be much appreciated.

Upvotes: 2

Views: 10978

Answers (1)

Frank Zalkow

Reputation: 3930

It might not be 100% on topic, but maybe this is interesting for you anyway. If you do not need real-time processing, things become easier. Limiting and dynamic compression can be seen as applying a dynamic transfer function. This function just maps input values to output values. A linear function returns the original audio, while a "curved" function does compression or expansion. Applying a transfer function is as simple as:

import numpy as np
from scipy.interpolate import interp1d
from scipy.io import wavfile

def apply_transfer(signal, transfer, interpolation='linear'):
    constant = np.linspace(-1, 1, len(transfer))
    interpolator = interp1d(constant, transfer, interpolation)
    return interpolator(signal)
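
As a quick sanity check of the claim that a linear transfer function returns the original audio, here is a minimal sketch. The test signal (a single sine cycle scaled to 0.9) and the variable names are assumptions made up for illustration; the input must stay within [-1, 1], because that is the range the transfer table covers.

```python
import numpy as np
from scipy.interpolate import interp1d

def apply_transfer(signal, transfer, interpolation='linear'):
    # Same function as above, repeated so this snippet runs on its own.
    constant = np.linspace(-1, 1, len(transfer))
    interpolator = interp1d(constant, transfer, interpolation)
    return interpolator(signal)

# A straight line from -1 to 1 is the identity transfer function.
identity = np.linspace(-1, 1, 1000)

# Hypothetical test signal: one sine cycle with peaks at +/-0.9.
t = np.linspace(0, 1, 200)
signal = 0.9 * np.sin(2 * np.pi * t)

unchanged = apply_transfer(signal, identity)
print(np.allclose(unchanged, signal))  # the output matches the input
```

Any other table of values from -1 to 1 can be passed in the same way; only the shape of the curve changes.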

Limiting or compression then is just a case of choosing a different transfer function:

# hard limiting: inputs within roughly [-threshold, threshold] are stretched
# linearly to [-1, 1]; anything beyond is clamped to -1 or 1
def limiter(x, threshold=0.8):
    transfer_len = 1000
    flat_len = int(((1 - threshold) / 2) * transfer_len)
    transfer = np.concatenate([np.repeat(-1.0, flat_len),
                               np.linspace(-1, 1, int(threshold * transfer_len)),
                               np.repeat(1.0, flat_len)])
    return apply_transfer(x, transfer)

# smooth compression: if factor is small, the curve is nearly linear; the
# bigger it is, the stronger the compression
def arctan_compressor(x, factor=2):
    constant = np.linspace(-1, 1, 1000)
    transfer = np.arctan(factor * constant)
    transfer /= np.abs(transfer).max()
    return apply_transfer(x, transfer)
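
To make the effect of the factor parameter concrete, the sketch below evaluates the normalized arctan curve in closed form rather than through the lookup table. The helper arctan_curve and the sample values 0.01 and 1.0 are assumptions for illustration only. It compares the effective gain applied to a quiet sample with the gain applied to a full-scale sample.

```python
import numpy as np

def arctan_curve(x, factor=2.0):
    # Closed form of the normalized arctan transfer curve (hypothetical helper);
    # x is expected to lie in [-1, 1].
    return np.arctan(factor * x) / np.arctan(factor)

quiet, loud = 0.01, 1.0  # assumed sample amplitudes
gains = {}
for factor in (0.5, 2.0, 8.0):
    gain_quiet = arctan_curve(quiet, factor) / quiet
    gain_loud = arctan_curve(loud, factor) / loud
    gains[factor] = (gain_quiet, gain_loud)
    print("factor=%s quiet gain=%.3f loud gain=%.3f" % (factor, gain_quiet, gain_loud))
```

Full-scale samples always map to full scale, while quiet samples are boosted by roughly factor / arctan(factor), which is about 1.8 for factor=2. In other words, the curve raises the overall level without letting any sample exceed the valid range.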

This example assumes a 16-bit mono WAV file as input:

sr, x = wavfile.read("input.wav")
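x = x.astype(np.float64)  # convert first: int16 can overflow in abs() and divides as integers on Python 2
x = x / np.abs(x).max()   # scale x to between -1 and 1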

x2 = limiter(x)
x2 = np.int16(x2 * 32767)
wavfile.write("output_limit.wav", sr, x2)

x3 = arctan_compressor(x)
x3 = np.int16(x3 * 32767)
wavfile.write("output_comp.wav", sr, x3)
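
To relate this to the "flattening" described in the question, here is a rough sketch on a synthetic tone. The 5-cycle sine, the gain of 2, and all variable names are assumptions, not part of the original code. It measures what fraction of samples ends up pinned at full scale after amplifying and hard clipping, compared with the normalized arctan curve.

```python
import numpy as np

# Hypothetical full-scale test tone: 5 sine cycles over 1000 samples.
t = np.linspace(0, 1, 1000, endpoint=False)
sine = np.sin(2 * np.pi * 5 * t)

gain = 2.0
hard = np.clip(gain * sine, -1.0, 1.0)            # amplify, then hard clip
soft = np.arctan(gain * sine) / np.arctan(gain)   # normalized arctan curve

# Fraction of samples sitting at full scale (the flat tops).
pinned_hard = np.mean(np.abs(hard) >= 1.0)
pinned_soft = np.mean(np.abs(soft) >= 1.0)
print("hard clip: %.2f, arctan: %.2f" % (pinned_hard, pinned_soft))
```

With a gain of 2, about two thirds of the hard-clipped samples sit on a flat top, while the arctan curve only touches full scale at the very peaks. The smooth curve still changes the waveform, so it is a trade-off rather than distortion-free amplification, but it avoids the abrupt corners that hard clipping introduces.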

Maybe this clean offline code helps you to benchmark your realtime code.

Upvotes: 12
