IAM

Reputation: 955

Python: write a wav file into numpy float array

import wave
ifile = wave.open("input.wav")

How can I read this file into a NumPy float array now?

Upvotes: 24

Views: 69229

Answers (4)

Joran Beasley

Reputation: 113930

>>> import numpy
>>> from scipy.io.wavfile import read
>>> a = read("adios.wav")  # a is a (sample_rate, data) tuple
>>> numpy.array(a[1], dtype=float)
array([ 128.,  128.,  128., ...,  128.,  128.,  128.])

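The samples are typically stored as raw bytes that read() returns as an integer array; here we just convert that array to float. Note that this changes the type only and does not rescale the values.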

You can read about read here: https://docs.scipy.org/doc/scipy/reference/tutorial/io.html#module-scipy.io.wavfile
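The 128. values in the output above are consistent with 8-bit audio, which WAV stores as unsigned bytes with 128 as silence. If values in the -1.0 to 1.0 range are wanted, a minimal sketch along these lines could rescale the integer array by its dtype. The helper name pcm_to_float is hypothetical and not part of SciPy or NumPy:

```python
import numpy


def pcm_to_float(samples):
    """Scale integer PCM samples to float32 values in [-1.0, 1.0)."""
    samples = numpy.asarray(samples)
    if samples.dtype == numpy.uint8:
        # 8-bit WAV data is unsigned, with 128 representing silence.
        return (samples.astype(numpy.float32) - 128.0) / 128.0
    if numpy.issubdtype(samples.dtype, numpy.signedinteger):
        # Divide by the full-scale magnitude, e.g. 32768 for int16.
        full_scale = float(numpy.iinfo(samples.dtype).max) + 1.0
        return samples.astype(numpy.float32) / full_scale
    # Assume the data is already floating point and only change the type.
    return samples.astype(numpy.float32)
```

With the answer's variables this would be called as pcm_to_float(a[1]).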

Upvotes: 45

Andreas Prokopiou

Reputation: 11

I don't have enough reputation to comment on @Matthew Walker's answer, so I am posting a new answer to add one observation: max_int16 should be 2**15-1, not 2**15.

Better yet, I think the normalization line should be replaced with:

audio_normalised = audio_as_np_float32 / numpy.iinfo(numpy.int16).max
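For comparison, a small sketch of what the two divisors do at the extremes of the int16 range. Dividing by the maximum (32767) maps the largest sample to exactly 1.0 but lets -32768 fall slightly below -1.0, while dividing by 2**15 keeps everything in [-1.0, 1.0) at the cost of never reaching +1.0. Which trade-off is preferable depends on the application; the variable names are only illustrative:

```python
import numpy

# The two extreme int16 sample values, converted to float32.
extremes = numpy.array([-32768, 32767], dtype=numpy.int16).astype(numpy.float32)

# Option 1: divide by the largest positive int16 value (32767).
by_max = extremes / numpy.iinfo(numpy.int16).max

# Option 2: divide by 2**15 (32768), the magnitude of the most negative value.
by_pow2 = extremes / 2 ** 15

print(by_max)   # the minimum is slightly below -1.0, the maximum is 1.0
print(by_pow2)  # the minimum is -1.0, the maximum is slightly below 1.0
```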

If the audio is stereo (i.e. two channels), the left and right samples are interleaved, so the following can be used to get the stereo array:

channels = ifile.getnchannels()
audio_stereo = numpy.empty((len(audio_normalised) // channels, channels))
audio_stereo[:, 0] = audio_normalised[0::channels]  # left channel
audio_stereo[:, 1] = audio_normalised[1::channels]  # right channel

I believe this answers @Trees question in the comments section.
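The interleaving described above can also be undone with a single reshape, which works for any channel count. A minimal sketch, assuming a 1-D array of interleaved samples; the function name deinterleave is hypothetical:

```python
import numpy


def deinterleave(samples, channels):
    """Turn a 1-D interleaved array into shape (frames, channels)."""
    samples = numpy.asarray(samples)
    # Drop any trailing samples that do not form a complete frame.
    usable = len(samples) - len(samples) % channels
    return samples[:usable].reshape(-1, channels)
```

Column 0 of the result is then the first (left) channel and column 1 the second (right), e.g. deinterleave(audio_normalised, ifile.getnchannels()).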

Upvotes: 0

Esterlinkof

Reputation: 1524

Use the librosa package and simply load the wav file into a NumPy array with:

import librosa
y, sr = librosa.load(filename)

This loads and decodes the audio as a time series y, represented as a one-dimensional NumPy floating point array. The variable sr contains the sampling rate of y, that is, the number of samples per second of audio. By default, all audio is mixed to mono and resampled to 22050 Hz at load time. This behavior can be overridden by supplying additional arguments to librosa.load().

More information is available in the librosa library documentation.
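As an illustration of overriding those defaults, a minimal sketch using the sr and mono parameters of librosa.load(). The wrapper name load_without_resampling is hypothetical:

```python
def load_without_resampling(filename):
    """Load audio at its native sampling rate, keeping all channels."""
    # Imported lazily so librosa is only required when this is called.
    import librosa

    # sr=None keeps the file's own sampling rate instead of 22050 Hz;
    # mono=False keeps the channels separate instead of mixing to mono.
    y, sr = librosa.load(filename, sr=None, mono=False)
    return y, sr
```

For a stereo file, y then has shape (2, n_samples) rather than being one-dimensional.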

Upvotes: 13

Matthew Walker

Reputation: 2757

Seven years after the question was asked...

import wave
import numpy

# Read file to get buffer
ifile = wave.open("input.wav")
samples = ifile.getnframes()
audio = ifile.readframes(samples)

# Convert buffer to float32 using NumPy
# (assumes 16-bit samples; multi-channel audio stays interleaved)
audio_as_np_int16 = numpy.frombuffer(audio, dtype=numpy.int16)
audio_as_np_float32 = audio_as_np_int16.astype(numpy.float32)

# Normalise float32 array so that values are between -1.0 and +1.0
max_int16 = 2**15
audio_normalised = audio_as_np_float32 / max_int16
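The same steps can be wrapped in a function that also closes the file and checks the sample width before assuming int16. This is only a sketch, not a general WAV reader: the name read_wav_as_float is hypothetical, and anything other than 16-bit PCM is rejected rather than handled:

```python
import wave

import numpy


def read_wav_as_float(path):
    """Return (sample_rate, float32 array of shape (frames, channels))."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()  # bytes per sample
        sample_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    # "<i2" is little-endian int16, the byte order WAV files use.
    samples = numpy.frombuffer(raw, dtype="<i2").astype(numpy.float32)
    samples /= 2 ** 15  # scale to [-1.0, 1.0)
    # One row per frame, one column per channel.
    return sample_rate, samples.reshape(-1, channels)
```

Usage would be along the lines of rate, audio = read_wav_as_float("input.wav").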

Upvotes: 24
