Reputation: 321
I am trying to use pitch_shift from librosa. I recorded my voice and used this code:
import librosa
sampling_rate = 44100
y, sr = librosa.load(directory, sr=sampling_rate)  # y: NumPy array of audio samples, sr: sample rate
# With bins_per_octave=24 each step is a quarter tone, so n_steps=4 shifts by 2 semitones
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=4, bins_per_octave=24)
librosa.output.write_wav(directory, y_shifted, sr=sampling_rate, norm=False)  # librosa < 0.8 only; removed in later versions
It works fine, almost.
I hear some noise in my voice after pitch shifting.
Is there something else I need to use?
Without shift:
https://vocaroo.com/i/s1qEEDvzcUHN
With shift (n_steps = 4):
https://vocaroo.com/i/s0cOiC0cFJSB
Upvotes: 1
Views: 7230
Reputation: 5310
Pitch shifting typically involves an STFT, a shift (usually of the magnitude spectrum along the frequency axis), and then signal reconstruction via the Griffin-Lim algorithm (Quora explanation on how Griffin-Lim works).
The problem is that when we shift the magnitude spectrum, we do just that and ignore the phase! Griffin-Lim tries to estimate a suitable phase when reconstructing the time-domain signal, but the result is often just that: a reasonable estimate, not a perfect one. That is why you hear this metallic twang: the phases of your signal are not quite right (an artifact also called "phasiness").
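To see why the magnitude alone is not enough, here is a toy sketch in plain Python. It is not Griffin-Lim and not what librosa does internally; it only shows that throwing away the phase of a single frame changes the signal. The frame and the helper names `dft` and `idft` are made up for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# A made-up 8-sample frame: one sine period, delayed by a quarter period.
frame = [math.sin(2 * math.pi * (t - 2) / 8) for t in range(8)]

spectrum = dft(frame)
with_phase = idft(spectrum)                         # magnitude and phase kept
magnitude_only = idft([abs(c) for c in spectrum])   # phase discarded (set to zero)

# Largest sample-wise deviation from the original frame in each case.
error_with_phase = max(abs(a - b) for a, b in zip(frame, with_phase))
error_magnitude_only = max(abs(a - b) for a, b in zip(frame, magnitude_only))

print(error_with_phase, error_magnitude_only)
```

With the phase kept, the frame comes back essentially unchanged; with the phase zeroed, the same magnitudes produce a differently aligned wave. A phase estimator has to guess that alignment for every frame, and imperfect guesses are what you hear.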
I believe your function call to librosa is perfectly alright. It may just not be the greatest implementation on earth. Give PyRubberband a try. It's based on Rubberband (a C++ library) and has a good reputation.
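A minimal sketch of what that could look like, assuming `pyrubberband` and `soundfile` are installed along with the `rubberband` command-line tool that PyRubberband calls. The helper names `steps_to_semitones` and `shift_file` and the file names are hypothetical. PyRubberband's `n_steps` is in semitones, so the librosa-style steps from the question are converted first: 4 steps at 24 bins per octave come to 2 semitones.

```python
def steps_to_semitones(n_steps, bins_per_octave=12):
    """Convert librosa-style steps to semitones (12 per octave)."""
    return 12.0 * n_steps / bins_per_octave


def shift_file(in_path, out_path, n_steps, bins_per_octave=12):
    """Pitch-shift an audio file with Rubber Band and write the result."""
    # Imported here because both are optional third-party packages;
    # pyrubberband also needs the rubberband command-line tool on the PATH.
    import pyrubberband as pyrb
    import soundfile as sf

    y, sr = sf.read(in_path)  # samples as a float array, file's own sample rate
    semitones = steps_to_semitones(n_steps, bins_per_octave)
    y_shifted = pyrb.pitch_shift(y, sr, n_steps=semitones)
    sf.write(out_path, y_shifted, sr)
    return semitones


# The question's settings expressed in semitones.
print(steps_to_semitones(4, bins_per_octave=24))  # 2.0

# Hypothetical usage:
# shift_file("voice.wav", "voice_shifted.wav", n_steps=4, bins_per_octave=24)
```

Writing to a separate output file, rather than overwriting the input as in the question, makes it easier to compare the two recordings.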
Upvotes: 5