Reputation: 135
I am following the TensorFlow tutorial for audio recognition at simple_audio. The notebook works very well.
As a next step, I wanted to record my own voice and then run it through the model trained in TensorFlow. I first generated a recording:
import sounddevice as sd
from scipy.io import wavfile

seconds=1
sr=16000
nchannels=1
myrecording = sd.rec(int(seconds * sr), samplerate=sr, channels=nchannels)
sd.wait()
wavfile.write(filename, sr, myrecording)
So far so good, I can play my recording. But when I try to load the file with tf.audio.decode_wav, like this:
audio_binary = tf.io.read_file(filename)
audio, _ = tf.audio.decode_wav(audio_binary)
I get the following error:
InvalidArgumentError: Bad audio format for WAV: Expected 1 (PCM), but got3 [Op:DecodeWav]
Any pointers on what might be going wrong are greatly appreciated.
Upvotes: 1
Views: 5209
Reputation: 416
The scipy.io.wavfile.write function, which you appear to be using, does not automatically save WAV files in 16-bit format; the encoding follows the data type of the array it is given. Hence, following the example in the reference, you should do something like this:
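import numpy as np
from scipy.io.wavfile import write
# your other code here
# sd.rec returns float32 samples in [-1.0, 1.0] by default, so scale them
# to the int16 range before casting (a bare astype would truncate them to ~0)
write(filename, sr, (np.clip(myrecording, -1.0, 1.0) * 32767).astype(np.int16))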
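To make the conversion step explicit, here is a minimal sketch that does the same thing with only the standard library (the `wave` module instead of scipy, so this is a swapped-in technique, not what the answer above uses). The helper name `write_pcm16_wav` is hypothetical, and it assumes mono float samples in the range [-1.0, 1.0]. Each sample is clipped, scaled to the int16 range, and written as 16-bit PCM.

```python
import struct
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples (assumed in [-1.0, 1.0]) as a 16-bit PCM WAV file."""
    ints = []
    for s in samples:
        # Clip to the valid range, then scale to the signed 16-bit range.
        s = max(-1.0, min(1.0, float(s)))
        ints.append(int(round(s * 32767)))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)    # mono
        w.setsampwidth(2)    # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        # Little-endian signed 16-bit integers, one per sample.
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

With the recording from the question (an array of shape (N, 1)), this could presumably be called as write_pcm16_wav(filename, myrecording[:, 0], sr).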
Upvotes: 0
Reputation: 140
(Would have written this as a comment, but I don't have enough reputation yet)
The most common encoding for WAV files is "16-bit PCM", which means each sample is stored as a 16-bit integer in the WAV file.
The documentation for tf.audio.decode_wav() states: "Decode a 16-bit PCM WAV file to a float tensor". Thus passing a WAV file that uses any other encoding results in an error like the one you received. In your case the file is most likely 32-bit floating point rather than 24-bit: format code 3 in the error message denotes IEEE float, and sd.rec returns float32 samples by default, which scipy writes out unchanged.
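If you want to check what encoding a file actually uses, one option is to read the format tag from its header. The sketch below is only an illustration: the helper name `wav_format` is hypothetical, and it assumes a plain RIFF/WAVE file with a standard 'fmt ' chunk. In the WAVE format, tag 1 means integer PCM and tag 3 means IEEE floating point, which matches the "got3" in the error message.

```python
import struct


def wav_format(path):
    """Return (format_tag, bits_per_sample) from the 'fmt ' chunk of a RIFF/WAVE file."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                fmt = f.read(chunk_size)
                # Layout: format tag, channels, sample rate, byte rate,
                # block align, bits per sample.
                tag, _ch, _rate, _byte_rate, _align, bits = struct.unpack(
                    "<HHIIHH", fmt[:16]
                )
                return tag, bits
            # Skip any other chunk; chunks are padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

A file that tf.audio.decode_wav accepts should report (1, 16); a float32 file written by scipy would be expected to report (3, 32).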
Upvotes: 5
Reputation: 135
Finally resolved it. It has to do with the bit representation. I was creating a file that was not 16-bit PCM (format code 3 in the error indicates 32-bit float), while for some reason tf.audio.decode_wav only takes 16-bit PCM files.
It is not clear to me why, but marking this as solved for now.
Upvotes: 0