IamRichter

Reputation: 15

Convert PCM WAV to normal WAV in python

I was using speech_recognition with a WAV file that pjsua recorded, and it always fails with this error message when I try to send the contents of the file:

Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format

The file plays normally in MPV, and inspecting it with the file command shows that it is PCM:

test2.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 16000 Hz

Searching around, I found someone with a similar problem, but the proposed solution (changing a few parameters using the wave library) did not work for me. After I called wav.setparams((2, 2, 44100, 0, 'NONE', 'NONE')), the audio became complete garbage, like ants talking.

I really don't know enough about sound files to understand what "channels", "sampwidth", "framerate", "nframes", "comptype" and "compname" mean...

Upvotes: 0

Views: 2179

Answers (1)

Paul M.

Reputation: 10819

You've misunderstood the error message. There is no "PCM" version and then a "normal" version of a wave file. WAV is a container format that, in practice, almost always holds uncompressed Pulse Code Modulation (PCM) data, which really just means that the samples that make up your signal are quantized digitally and stored contiguously. Your file is already an ordinary PCM wave file (the output of the file command confirms it), so if your speech_recognition function can't parse it, the cause is most likely not anything related to PCM.

I don't know anything about the SpeechRecognition module (I'm assuming that's what you're using?). I also don't know anything about pjsua. My guess is that pjsua is baking some additional chunks into the header metadata, which the SpeechRecognition API isn't expecting. Is there any chance you can share the wave file via Dropbox or similar?
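One way to test that guess without sharing the file is to list the top-level chunks yourself. The following is only a rough sketch using the standard struct module; the function name inspect_wav is made up for illustration, and it assumes a plain little-endian RIFF file like the one the file command reported. A format tag of 1 means PCM, and anything other than "fmt " and "data" in the chunk list is extra metadata that a strict parser might trip over.

```python
import struct

def inspect_wav(path):
    """Return (format_tag, chunks) for a RIFF/WAVE file.

    format_tag is the wFormatTag field of the 'fmt ' chunk (1 means PCM),
    chunks is a list of (chunk_id, size_in_bytes) in file order.
    """
    chunks = []
    format_tag = None
    with open(path, "rb") as f:
        riff, _total_size, form = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or form != b"WAVE":
            raise ValueError("not a little-endian RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # end of file
            chunk_id, size = struct.unpack("<4sI", header)
            body_start = f.tell()
            if chunk_id == b"fmt " and size >= 2:
                # The first two bytes of the 'fmt ' chunk hold the format tag.
                (format_tag,) = struct.unpack("<H", f.read(2))
            chunks.append((chunk_id.decode("ascii", "replace"), size))
            # Chunk bodies are padded to an even number of bytes.
            f.seek(body_start + size + (size & 1))
    return format_tag, chunks
```

Calling inspect_wav("test2.wav") on the recording and comparing the result with a file that SpeechRecognition does accept should show whether pjsua writes anything unusual.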

Also, the reason your audio sounded like "ants talking" is the discrepancy between the metadata your wave file contains and the metadata you wrote to your new wave file. Your wave file is mono, which means one channel, but you wrote two. Your file also has a sample rate of 16 kHz, but you wrote 44.1 kHz. The same sample bytes were therefore consumed much faster on playback than they were recorded.
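To make the parameters concrete: "channels" is the number of channels (1 for mono), "sampwidth" is the number of bytes per sample (2 for 16 bit), "framerate" is the number of frames per second, "nframes" is the number of frames in the file, and "comptype"/"compname" describe compression (the wave module only supports 'NONE'). Here is a minimal sketch, using only the standard wave module, of rewriting a file while keeping its own parameters instead of hard-coded ones. The helper names copy_wav and playback_speedup are made up for illustration.

```python
import wave

def copy_wav(src_path, dst_path):
    """Rewrite a WAV file, reusing the source's own parameters."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()  # (nchannels, sampwidth, framerate, nframes, comptype, compname)
        frames = src.readframes(params.nframes)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # same channels, sample width and frame rate as the source
        dst.writeframes(frames)
    return params

def playback_speedup(src_params, nchannels, sampwidth, framerate):
    """How much faster the same bytes play back if mislabelled with new parameters."""
    src_bytes_per_second = src_params.nchannels * src_params.sampwidth * src_params.framerate
    new_bytes_per_second = nchannels * sampwidth * framerate
    return new_bytes_per_second / src_bytes_per_second
```

For a mono, 16 bit, 16000 Hz source, playback_speedup(params, 2, 2, 44100) gives 5.5125, so labelling the data as stereo at 44.1 kHz plays it roughly five and a half times too fast, on top of splitting consecutive samples across the left and right channels.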

Upvotes: 2
