Reputation: 135

read `wav` file with `tf.audio.decode_wav`

I am following the TensorFlow audio recognition tutorial (simple_audio). The notebook works very well.

As a next step, I wanted to record my own voice and run it through the model trained in TensorFlow. I first generated a recording:

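import sounddevice as sd
from scipy.io import wavfile

seconds = 1
sr = 16000
nchannels = 1
myrecording = sd.rec(int(seconds * sr), samplerate=sr, channels=nchannels)
sd.wait()  # block until the recording has finished
wavfile.write(filename, sr, myrecording)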

So far so good, I can play my recording. But when I try to load the file with tf.audio.decode_wav similar to this:

import tensorflow as tf

audio_binary = tf.io.read_file(filename)
audio, _ = tf.audio.decode_wav(audio_binary)

I get the following error:

InvalidArgumentError: Bad audio format for WAV: Expected 1 (PCM), but got3 [Op:DecodeWav]

Any pointers on what might be going wrong are greatly appreciated.

Upvotes: 1

Answers (3)

Angelo Cardellicchio

Reputation: 416

The scipy.io.wavfile.write function, which it appears you are using, does not automatically save WAV files in 16-bit format; the output format follows the dtype of the array you pass in. Hence, following the example in the reference, you should do something like this:

import numpy as np
from scipy.io.wavfile import write

# your other code here
# scale float samples in [-1.0, 1.0] to the int16 range before casting
write(filename, sr, (myrecording * 32767).astype(np.int16))
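
Note that float samples need to be scaled before the cast, since casting values in [-1.0, 1.0] directly to int16 truncates them to zero. As a minimal sketch, assuming `myrecording` holds float samples in that range (float32 is sounddevice's default dtype), the conversion could look like the following. The helper names `float_to_int16` and `write_pcm16` are hypothetical, and the standard-library `wave` module is used here in place of SciPy only to keep the example self-contained.

```python
import wave

import numpy as np


def float_to_int16(samples: np.ndarray) -> np.ndarray:
    """Scale float samples in [-1.0, 1.0] to 16-bit integers (hypothetical helper)."""
    clipped = np.clip(samples, -1.0, 1.0)      # guard against values outside the range
    return (clipped * 32767).astype(np.int16)  # 32767 is the int16 maximum


def write_pcm16(target, samples: np.ndarray, sample_rate: int, channels: int = 1) -> None:
    """Write samples as a 16-bit PCM WAV to a path string or file-like object."""
    pcm = float_to_int16(samples)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.astype("<i2").tobytes())  # WAV sample data is little-endian
```

With the variables from the question, `write_pcm16(filename, myrecording, sr)` should produce a file that `tf.audio.decode_wav` accepts. Another option is to record integers in the first place by passing `dtype='int16'` to `sd.rec`.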

Upvotes: 0

Yonatan Alon

Reputation: 140

(Would have written this as a comment, but I don't have enough reputation yet)

The most common encoding for WAV files is "16-bit PCM", which means the recorded sound is represented as 16-bit integers before it is written to the WAV file.

The documentation for tf.audio.decode_wav() says: "Decode a 16-bit PCM WAV file to a float tensor". Passing a WAV file with any other encoding therefore fails with an error like the one you received. In your case, the format code 3 in the message indicates IEEE floating-point samples rather than integer PCM.
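
One way to check which encoding a file actually uses is to read the format code from its `fmt ` chunk. The sketch below uses only the standard library and assumes a well-formed RIFF/WAVE file; `wav_format_info` is a hypothetical helper name, not part of any library. A code of 1 means integer PCM and 3 means IEEE floating point, which corresponds to the "Expected 1 (PCM), but got3" in the error.

```python
import struct

# Two common WAVE format codes found in the 'fmt ' chunk.
FORMAT_NAMES = {1: "PCM (integer)", 3: "IEEE float"}


def wav_format_info(data: bytes):
    """Return (format_code, channels, sample_rate, bits_per_sample).

    Hypothetical helper: walks the RIFF chunks until it finds 'fmt '.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            fmt_code, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return fmt_code, channels, rate, bits
        # Skip this chunk; chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no 'fmt ' chunk found")
```

For example, reading the recording with `open(filename, "rb").read()` and passing the bytes to `wav_format_info` should report format code 3 if the file was written from a float array, and 1 once it has been saved as 16-bit PCM.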

Upvotes: 5

vbfh

Reputation: 135

Finally resolved it. It has to do with the bit representation. I was not creating a 16-bit PCM file (the format code 3 in the error indicates floating-point samples), and tf.audio.decode_wav only takes 16-bit PCM files.

It is not clear to me why, but marking this as solved for now.

Upvotes: 0
