Reputation: 1
I am trying out OpenAI's whisper library in Python, but I am getting an error when I try to run it.
This was the first error I was getting:
Traceback (most recent call last):
File "/Users/---/PycharmProjects/Projects/whisper-test.py", line 7, in <module>
result = whisper.transcribe(audio_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: transcribe() missing 1 required positional argument: 'audio'
Update: After specifying the parameters in the whisper.transcribe()
function, the error message changed slightly:
/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
File "/Users/---/PycharmProjects/Projects/whisper-test.py", line 7, in <module>
result = whisper.transcribe(model=model, audio=audio_data, verbose=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/transcribe.py", line 122, in transcribe
mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/---/PycharmProjects/Projects/venv/lib/python3.11/site-packages/whisper/audio.py", line 141, in log_mel_spectrogram
audio = torch.from_numpy(audio)
^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected np.ndarray (got bytes)
This is my updated code:
import whisper

with open("hello.mp3", "rb") as audio_file:
    audio_data = audio_file.read()

model = whisper.load_model("base")
result = whisper.transcribe(model=model, audio=audio_data, verbose=True)
print(result["text"])
I have also tried a .m4a file, and passing the file path directly into whisper.transcribe(), but every attempt returns the same error.
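For reference, the path-based attempt looked roughly like this (an illustrative reconstruction, using the same hello.mp3 as above):

# illustrative: passing the file path instead of the bytes object
result = whisper.transcribe(model=model, audio="hello.mp3", verbose=True)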
Also, an additional question: does the Whisper library require an OpenAI API key?
Upvotes: 0
Views: 1244
Reputation: 1
Pass in the fp16=False parameter to the transcribe() function; that forces FP32 on the CPU and gets rid of the warning. The TypeError itself comes from passing raw bytes: transcribe() expects a file path, a NumPy array, or a torch tensor as its audio argument, so hand it the filename directly, as in the demo below.
Demo code:
import whisper

model = whisper.load_model('base')
# pass the file path rather than raw bytes; fp16=False forces FP32 on CPU
# and silences the "FP16 is not supported on CPU" warning
result = model.transcribe('audio.wav', fp16=False)
print(result['text'])
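If you'd rather load the audio yourself first, here is a minimal sketch using whisper.load_audio() (it relies on ffmpeg being installed to decode the file, and I'm assuming the hello.mp3 from the question):

import whisper

model = whisper.load_model('base')

# load_audio() decodes the file with ffmpeg into a float32 NumPy array
# resampled to 16 kHz, which transcribe() accepts directly
audio = whisper.load_audio('hello.mp3')

result = model.transcribe(audio, fp16=False)
print(result['text'])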
Thank you to this YouTube video for showing me this!
Upvotes: 0