user5723263

How would I pitch-shift a WAV file using Python?

I am trying to change the pitch of a WAV file without speeding it up. I am using SciPy and NumPy to read the original data from the file, and I tried to shift the pitch by simply adding 2 or so to each sample. What I get after writing the new file is a completely messed-up WAV file.

Here is what I've got for now:

from scipy.io.wavfile import read as wavr
from scipy.io.wavfile import write as wavw
import numpy as np




rate, data = wavr('hit.wav')    

idk = data[:]

R = []
L = []

for thing in idk:
    old1 = thing[0]  + 2.0
    old2 = thing[1]  + 2.0
    R.append(old1)
    L.append(old2)
    print(len(L))
    print(len(R))

Right= np.array(R)
Left= np.array(L) 
help = np.column_stack((Right,Left))    

print(rate)


wavw("copied.wav",44100,help)

Maybe my whole approach is wrong. If so, could you tell me what I should do to accomplish my goal?

Upvotes: 0

Views: 2520

Answers (1)

Adrien Luxey

Reputation: 562

My dear, you are in big trouble (I've been there).

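First of all, the samples in a WAV file have to stay within a fixed range. Most files store 16-bit integers (scipy hands those back as an int16 array, with values between -32768 and 32767), while float WAV files are expected to stay between -1 and 1. So, if hit.wav is a 16-bit file, adding 2.0 to each sample turns your integers into floats, and scipy then writes a float file whose values are thousands of times larger than the -1 to 1 range a player expects. When you add "just +2" to your file, you're in fact saturating everything.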
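To see what is going on with the sample types, here is a small sketch. It assumes a 16-bit PCM file, which I replace with a generated tone written to an in-memory buffer (the tone, the buffer and the variable names are mine, not from the question). It only prints the dtypes and the largest values, so you can compare them with the -1 to 1 range that float WAV data is supposed to stay in.

```python
import io

import numpy as np
from scipy.io import wavfile

rate = 44100
t = np.arange(rate) / rate
# A one-second 440 Hz tone stored as 16-bit PCM (stand-in for a real file).
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

buffer = io.BytesIO()          # in-memory file, so nothing touches the disk
wavfile.write(buffer, rate, tone)
buffer.seek(0)
read_rate, data = wavfile.read(buffer)

shifted = data + 2.0           # what the question's loop effectively does
normalized = data / 32768.0    # int16 range mapped to roughly -1..1

print(data.dtype, shifted.dtype)
print(np.abs(shifted).max(), np.abs(normalized).max())
```

If you do want float samples, dividing by 32768 first (as in `normalized`) keeps them in the expected range; otherwise convert back to int16 before writing.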

Then, there is something you got wrong about how sound works. You may know that sound is a wave. More precisely, when you record a pure 440 Hz tone (the A4 note that musicians use to tune their instruments), what you get is a sine wave (like this one) oscillating 440 times per second. What's more, it can be centered around 0 or not, but the sound won't be any different (which means adding or subtracting a constant is pointless; the only thing you might do is saturate the signal).
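You can check the "adding a constant changes nothing" part yourself. This is a minimal sketch with a generated tone; the helper `dominant_frequency` is a name I made up for the illustration. It looks for the strongest frequency in the signal, ignoring the constant part, for both the plain tone and the same tone with an offset.

```python
import numpy as np

sample_rate = 44100
t = np.arange(sample_rate) / sample_rate       # one second of timestamps
tone = 0.5 * np.sin(2 * np.pi * 440 * t)       # pure A4
offset_tone = tone + 0.25                      # same wave, shifted upward


def dominant_frequency(signal, sample_rate):
    """Return the strongest non-zero frequency in the signal, in Hz."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    # Skip bin 0: it only holds the constant (DC) part of the signal.
    return freqs[1 + np.argmax(spectrum[1:])]


print(dominant_frequency(tone, sample_rate))
print(dominant_frequency(offset_tone, sample_rate))
```

Both calls should report the same frequency: the offset only ends up in bin 0 of the spectrum, which carries no pitch.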

So, to change the pitch of a sound, you have to modify its frequency: the number of times it oscillates per second. In music, the tool that does this without changing the duration is called a pitch shifter, and a classic way to build one is a phase vocoder. The theory behind it is quite hardcore already; it comes from a field called signal processing. More precisely, a phase vocoder uses:

  • Fourier Transforms (this tutorial seems quite cool, otherwise Wikipedia, but it's math-oriented). Given a sample of sound, the Fourier transform gives the frequencies that it contains (a little bit of A4, a lot of C3, etc.). (More precisely, you need a Short Time Fourier Transform for this particular problem);
  • The overlap-add algorithm, which is used to take Fourier transforms of chunks of your file, modify the pitch on each chunk, and paste all the chunks back together. This is where I totally failed (and I've tried to code a vocoder twice).
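To give an idea of how the two pieces above fit together, here is a rough sketch of a textbook-style phase vocoder. It is not a production implementation: it assumes a mono float signal longer than a couple of windows, the function names and the `n_fft`/`hop` defaults are my own choices, and real music will show artifacts that a pure tone does not. The idea is to stretch the sound in time with the vocoder (pitch unchanged), then resample it back to the original length (which changes the pitch).

```python
import numpy as np
from scipy.signal import resample


def time_stretch(x, rate, n_fft=2048, hop=512):
    """Change duration by 1/rate without changing pitch (mono float input)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    # Short-time Fourier transform: one spectrum per overlapping chunk.
    frames = np.array([np.fft.rfft(window * x[i * hop:i * hop + n_fft])
                       for i in range(n_frames)])
    # Phase advance each bin would show over one hop if the sound sat
    # exactly on that bin's center frequency.
    omega = 2 * np.pi * hop * np.arange(n_fft // 2 + 1) / n_fft
    steps = np.arange(0, n_frames - 1, rate)   # fractional read positions
    phase = np.angle(frames[0])
    out = np.zeros(len(steps) * hop + n_fft)
    for k, step in enumerate(steps):
        i = int(step)
        frac = step - i
        # Interpolate magnitudes between the two neighbouring frames.
        mag = (1 - frac) * np.abs(frames[i]) + frac * np.abs(frames[i + 1])
        chunk = np.fft.irfft(mag * np.exp(1j * phase))
        out[k * hop:k * hop + n_fft] += window * chunk   # overlap-add
        # Measured phase advance, wrapped to [-pi, pi] around omega.
        dphi = np.angle(frames[i + 1]) - np.angle(frames[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi
    # Undo the gain from summing overlapping squared windows.
    return out * hop / np.sum(window ** 2)


def pitch_shift(x, semitones, n_fft=2048, hop=512):
    """Shift pitch by a number of semitones, keeping the original length."""
    factor = 2 ** (semitones / 12)
    stretched = time_stretch(x, 1 / factor, n_fft, hop)
    # Resampling shortens (or lengthens) the sound and moves its pitch.
    shifted = resample(stretched, int(round(len(stretched) / factor)))
    if len(shifted) < len(x):
        shifted = np.pad(shifted, (0, len(x) - len(shifted)))
    return shifted[:len(x)]


sr = 44100
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)   # one second of A4
higher = pitch_shift(tone, 2)                          # two semitones up
```

For a stereo file you would run each channel separately, and convert the result back to the file's sample type before writing it.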

I could go on longer, but I guess you got the point: anything that touches sound frequency (like the pitch) is filled with heavy math. Signal processing is a very interesting field, but it's quite hard to dive into. If you have good enough math skills, you can have a look at some courses online, for instance Stanford's. If you just want to play around with music and code, use existing tools, not raw data (I'm thinking of Processing, or existing Python libraries).

If you want an easier exercise to do stuff with wav files, try modifying the volume or the speed (be aware that changing the speed will change the pitch, though). For the volume, you will have to multiply every sample of the file by a constant (not add one). For the speed, you can remove every other sample to make the audio twice as fast, or increase the size of your wave array and find a good way to put a relevant sample between each existing pair (zero? the previous sample? Try out stuff!).
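Here is a minimal sketch of those exercises, assuming mono int16 samples. The function names are mine, and linear interpolation is just one of the possible answers to the "what goes in between" question.

```python
import numpy as np


def change_volume(samples, gain):
    """Scale int16 samples by gain, clipping instead of wrapping around."""
    limits = np.iinfo(np.int16)
    scaled = samples.astype(np.float64) * gain
    return np.clip(scaled, limits.min, limits.max).astype(np.int16)


def double_speed(samples):
    """Keep every other sample: half the duration, one octave higher."""
    return samples[::2]


def half_speed(samples):
    """Insert a linearly interpolated sample between neighbours (mono)."""
    old_positions = np.arange(len(samples))
    new_positions = np.linspace(0, len(samples) - 1, 2 * len(samples) - 1)
    return np.interp(new_positions, old_positions, samples)


samples = np.array([0, 1000, -1000, 30000], dtype=np.int16)
print(change_volume(samples, 2))
print(double_speed(samples))
print(half_speed(samples))
```

Note that `half_speed` returns floats, so round and cast back to int16 before writing the result, and keep the same sample rate when you save it, otherwise the speed change cancels out.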

Good luck!

Upvotes: 3
