Adi

Reputation: 1

OpenAI's Whisper Error ("TypeError: transcribe() missing 1 required positional argument: 'audio'")

I am trying out OpenAI's whisper library in Python, but I am getting an error when I try to run it.

This was the first error I got:

Traceback (most recent call last):
  File "/Users/---/PycharmProjects/Projects/whisper-test.py", line 7, in <module>
    result = whisper.transcribe(audio_data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: transcribe() missing 1 required positional argument: 'audio'

Update: After specifying the parameters in the whisper.transcribe() function, the error message changed slightly:

/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
  File "/Users/---/PycharmProjects/Projects/whisper-test.py", line 7, in <module>
    result = whisper.transcribe(model=model, audio=audio_data, verbose=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/transcribe.py", line 122, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/audio.py", line 141, in log_mel_spectrogram
    audio = torch.from_numpy(audio)
            ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected np.ndarray (got bytes)

This is my updated code:

import whisper

with open("hello.mp3", "rb") as audio_file:
    audio_data = audio_file.read()

model = whisper.load_model("base")
result = whisper.transcribe(model=model, audio=audio_data, verbose=True)

print(result["text"])

I have also tried a .m4a file, and passing the file URL directly to whisper.transcribe() as the audio argument, but they all return the same error.

An additional question: does the Whisper library require an OpenAI API key?

Upvotes: 0

Views: 1244

Answers (2)

Adi

Reputation: 1

Pass the fp16=False parameter to the transcribe() function to suppress the FP16 warning. Note that the demo below also passes the audio file's path rather than raw bytes, which is what resolves the TypeError.

Demo code:

import whisper

model = whisper.load_model('base')
result = model.transcribe('audio.wav', fp16=False)

print(result['text'])

Thank you to this YouTube video for showing me this!

Upvotes: 0

Bad Gamer

Reputation: 15

Pass the path of the .mp3 file to transcribe() directly, without reading it into bytes first.
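The traceback in the question shows why: Whisper eventually hands the audio argument to torch.from_numpy, which accepts an np.ndarray of samples, not the raw bytes returned by open(...).read(). A minimal sketch of that distinction, using numpy only (the actual decoding Whisper performs goes through ffmpeg via whisper.load_audio; the byte values here are illustrative):

```python
import numpy as np

# What the question's code produced: raw bytes from open(...).read().
# Passing these to transcribe() triggers "expected np.ndarray (got bytes)".
raw = b"\x00\x00\xff\x7f"

# When you pass a file *path* instead, Whisper decodes the audio itself
# and produces float32 PCM samples in [-1, 1], roughly like this:
samples = np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0

print(type(raw))       # <class 'bytes'>  -- rejected by torch.from_numpy
print(samples.dtype)   # float32          -- what Whisper actually works with
```

So the fix is simply result = model.transcribe("hello.mp3") with the path string, letting Whisper load the file.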

Upvotes: -1
