Reputation: 41
I have a .wav file with a sample rate of 44.1 kHz that I want to resample to 16 kHz using librosa.resample. The output file sounds great and is at 16 kHz, but I get an error when I try to read it with wave.open.
This problem is quite similar to mine: Opening a wave file in python: unknown format: 49. What's going wrong?
This is my code:
import wave
import librosa

if __name__ == "__main__":
    input_wav = '1d13eeb2febdb5fc41d3aa7db311fa33.wav'
    output_wav = 'result.wav'
    # sr=None keeps the file's native sample rate (44.1 kHz here)
    y, sr = librosa.load(input_wav, sr=None)
    print(sr)
    y = librosa.resample(y, orig_sr=sr, target_sr=16000)
    librosa.output.write_wav(output_wav, y, sr=16000)
    # This call raises wave.Error: unknown format: 3
    wave.open(output_wav)
I get an error at the last step, wave.open(output_wav).
The exception is as follows:
Traceback (most recent call last):
  File "/Users/range/Code/PycharmProjects/Speaker/test.py", line 204, in <module>
    wave.open(output_wav)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/wave.py", line 499, in open
    return Wave_read(f)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/wave.py", line 163, in __init__
    self.initfp(f)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/wave.py", line 143, in initfp
    self._read_fmt_chunk(chunk)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/wave.py", line 260, in _read_fmt_chunk
    raise Error('unknown format: %r' % (wFormatTag,))
wave.Error: unknown format: 3
I don't understand why wave.open can't read the WAV file, and I need to resample it for my further work.
I wonder whether librosa.output.write_wav changed the format of the WAV file.
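One way to check that suspicion is to read the format tag straight from the file header. The following is a minimal sketch using only the standard library; the helper name wav_format_tag is hypothetical, and it assumes a plain RIFF/WAVE file. A tag of 1 means integer PCM and 3 means IEEE floating point.

```python
import struct

def wav_format_tag(path):
    """Return the wFormatTag stored in the 'fmt ' chunk of a RIFF/WAVE file."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                # The format tag is the first 16-bit field of the chunk.
                (format_tag,) = struct.unpack("<H", f.read(2))
                return format_tag
            # Skip any other chunk; chunks are padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling wav_format_tag('result.wav') on the file written above should report the same number that appears in the wave.Error message.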
So I had to write the resample function myself. Fortunately, it works. This is my code:
import wave
import numpy as np

def resample(input_wav, output_wav, tar_fs=16000):
    # Assumes a mono, 16-bit PCM input file.
    with wave.open(input_wav, 'rb') as audio_file:
        src_fs = audio_file.getframerate()
        audio_data = audio_file.readframes(audio_file.getnframes())
    # np.frombuffer replaces the deprecated np.fromstring
    audio_data_short = np.frombuffer(audio_data, dtype=np.short)
    dtype = audio_data_short.dtype
    audio_len = len(audio_data_short)
    audio_time_max = 1.0 * (audio_len - 1) / src_fs
    # Time stamps (in seconds) of the source and target samples
    src_time = np.arange(audio_len) / src_fs
    tar_len = int(audio_time_max * tar_fs)  # np.int no longer exists in NumPy
    tar_time = np.arange(tar_len) / tar_fs
    # Plain linear interpolation; no anti-aliasing filter is applied
    output_signal = np.interp(tar_time, src_time, audio_data_short).astype(dtype)
    with wave.open(output_wav, 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(tar_fs)
        f.writeframes(output_signal.tobytes())
I hope you can help me understand what goes wrong when resampling the WAV file with librosa, and I'd be glad if my code helps other people who have the same problem. :)
Upvotes: 4
Views: 2786
Reputation: 482
(For Linux users): This is an alternative to SoX, which I couldn't use. I successfully converted the file with ffmpeg in the terminal using this command:
ffmpeg -i input_wav.wav -ar 44100 -ac 1 -acodec pcm_s16le output_wav.wav
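where "-ar" is the audio sample rate, "-ac" is the number of audio channels, and "-acodec pcm_s16le" selects 16-bit signed little-endian PCM, which wave.open can read. Use "-ar 16000" instead if you also want ffmpeg to resample to 16 kHz.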
Upvotes: 2
Reputation: 83
I was working on a project and hit the same error, so I dug in a bit and found that the issue is due to the default way librosa writes the WAV file with write_wav() in the output module.
The problem is that the file is written as "Floating Point PCM" (the quantization showed up as 24 bit for me). That corresponds to WAVE format tag 3, which is the 3 in the error message, and Python 3.6's wave module only reads integer PCM (format tag 1). You can change the encoding easily with SoX, a cross-platform command-line utility that lets you control specifics like the encoding format.
For example, you would do something like this to go from the floating-point encoding to 16-bit signed-integer encoding:
sox audio.wav -b 16 -e signed-integer modified_audio.wav
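If an external tool is not an option, the same conversion can be sketched in Python before the file is written. This is only an illustration under stated assumptions: the helper name write_pcm16 is hypothetical, the signal is assumed to be mono, and the samples are assumed to be floats in the range [-1.0, 1.0], as librosa.load returns them.

```python
import wave
import numpy as np

def write_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as 16-bit integer PCM."""
    samples = np.asarray(samples, dtype=np.float64)
    # Clip first so out-of-range values cannot wrap around in int16.
    clipped = np.clip(samples, -1.0, 1.0)
    # Scale to the int16 range and force little-endian byte order.
    pcm = np.round(clipped * 32767.0).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
```

With the resampled array y from the question, write_pcm16('result.wav', y, 16000) would take the place of the write_wav() call, and the resulting file should then open with wave.open.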
Upvotes: 1