Python : How to use speech_recognition or other modules to convert base64 audio string to text?

Question

I have a base64 audio string like data:audio/mpeg;base64,//OAxAAAAANIAAAAABhqZ3f4StN3gOAaB4NAUBYZLv......, and I tried to convert it to a WAV file using Python's base64 module:

    decode_bytes = base64.b64decode(encoding_str)
    with open(file_name + '.wav', "wb") as wav_file:
        wav_file.write(decode_bytes)

Then I tried to convert the audio to text using the speech_recognition module, but it raises the following error:

    ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format

Is there a solution for this problem?

Mayur Deshmukh · Accepted Answer

Judging from the MIME type (audio/mpeg), your audio is MP3, not WAV. Writing the decoded bytes to a file with a .wav extension does not change their format, so save the file as .mp3 instead:

    import base64

    decode_bytes = base64.b64decode(encoding_str)
    with open(file_name + '.mp3', "wb") as mp3_file:
        mp3_file.write(decode_bytes)
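If encoding_str still carries the data:audio/mpeg;base64, prefix shown in the question, that prefix probably needs to be stripped before decoding, since only the part after the comma is base64. A minimal sketch, assuming the string always has the form data:<mime>;base64,<payload>; the helper name decode_data_uri is hypothetical and not part of any library:

```python
import base64


def decode_data_uri(data_uri):
    """Split 'data:<mime>;base64,<payload>' into (mime_type, raw_bytes).

    Hypothetical helper for illustration; it only handles base64 data URIs.
    """
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a string like 'data:<mime>;base64,<payload>'")
    # Keep only the MIME type between 'data:' and ';base64'.
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

With the variables from the question this would be used as mime_type, decode_bytes = decode_data_uri(encoding_str), and mime_type can then be checked against audio/mpeg before choosing the .mp3 extension.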

Then convert the MP3 to WAV using either pydub or FFmpeg, and pass the resulting WAV file to the speech_recognition module.
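The conversion and recognition steps could look roughly like the sketch below. It assumes pydub and SpeechRecognition are installed and that FFmpeg is available on the PATH (pydub relies on it to read MP3). The function names mp3_to_wav and transcribe_wav are hypothetical, and recognize_google is just one of the recognizers the library offers; it sends the audio to Google's web speech service, so it needs network access.

```python
from pathlib import Path


def mp3_to_wav(mp3_path):
    """Convert an MP3 file to a WAV file next to it and return the WAV path."""
    # Imported here so the module loads even if pydub is not installed.
    from pydub import AudioSegment

    wav_path = Path(mp3_path).with_suffix(".wav")
    AudioSegment.from_mp3(str(mp3_path)).export(str(wav_path), format="wav")
    return wav_path


def transcribe_wav(wav_path):
    """Read a WAV file and return the text recognized in it."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    # Assumption: the Google Web Speech recognizer is acceptable here.
    return recognizer.recognize_google(audio)
```

For the file saved above, the call would be text = transcribe_wav(mp3_to_wav(file_name + '.mp3')).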
