hama Mk

Reputation: 127

How do I create a spectrogram image from an audio file in Python, just like FFmpeg does?

My code:

import matplotlib.pyplot as plt
import librosa
import librosa.display
import numpy as np
import io
from PIL import Image

# Load audio (librosa resamples to 22050 Hz by default)
samples, sample_rate = librosa.load('thabo.wav')

# Figure with all axes, ticks, and frame hidden
fig = plt.figure(figsize=[4, 4])
ax = fig.add_subplot(111)
ax.axes.get_xaxis().set_visible(False)
ax.axes.get_yaxis().set_visible(False)
ax.set_frame_on(False)

# Mel spectrogram, converted to dB for display
S = librosa.feature.melspectrogram(y=samples, sr=sample_rate)
librosa.display.specshow(librosa.power_to_db(S, ref=np.max), ax=ax)

# Render the figure into an in-memory PNG buffer
buf = io.BytesIO()
plt.savefig(buf, bbox_inches='tight', pad_inches=0)

# plt.close('all')
buf.seek(0)
im = Image.open(buf)
# im = Image.open(buf).convert('L')
im.show()
buf.close()

Spectrogram produced

(librosa spectrogram image)

Using FFmpeg

ffmpeg -i thabo.wav -lavfi showspectrumpic=s=224x224:mode=separate:legend=disabled spectrogram.png

Spectrogram produced

(FFmpeg spectrogram image)

Please help: I want a spectrogram that is exactly the same as the one produced by FFmpeg, for use with a speech recognition model exported from Google's Teachable Machine (offline recognition).
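If the FFmpeg rendering itself is the requirement, one option is to invoke ffmpeg from Python instead of trying to reproduce its output with librosa. A minimal sketch, assuming the `ffmpeg` binary is on PATH; the helper function name is mine:

```python
import subprocess

def ffmpeg_spectrogram_cmd(wav_path, png_path, size="224x224"):
    """Build the ffmpeg command matching the showspectrumpic invocation above."""
    return [
        "ffmpeg", "-y", "-i", wav_path,
        "-lavfi", f"showspectrumpic=s={size}:mode=separate:legend=disabled",
        png_path,
    ]

cmd = ffmpeg_spectrogram_cmd("thabo.wav", "spectrogram.png")
# subprocess.run(cmd, check=True)  # uncomment once ffmpeg is installed
```

Because it is ffmpeg doing the drawing, the resulting PNG is pixel-identical to the command-line output.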

Upvotes: 1

Views: 2326

Answers (1)

llogan

Reputation: 133723

You can pipe the audio directly into ffmpeg, which avoids the intermediate file, and ffmpeg can write to a pipe as well if you want to avoid an image file on disk.

Demonstration using three instances of ffmpeg:

ffmpeg -i input.wav -f wav - | ffmpeg -i - -filter_complex "showspectrumpic=s=224x224:mode=separate:legend=disabled" -c:v png -f image2pipe - | ffmpeg -y -i - output.png

The first and last ffmpeg instances would, of course, be replaced by whatever processes produce the audio and consume the image in your workflow.
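The middle instance of that pipeline can be driven from Python by feeding WAV bytes to ffmpeg's stdin and reading the PNG from its stdout. A sketch, assuming ffmpeg is on PATH; both function names are mine:

```python
import subprocess

def showspectrumpic_cmd(size="224x224"):
    """ffmpeg command reading WAV from stdin and writing a PNG spectrogram to stdout."""
    return [
        "ffmpeg", "-i", "-",
        "-filter_complex", f"showspectrumpic=s={size}:mode=separate:legend=disabled",
        "-c:v", "png", "-f", "image2pipe", "-",
    ]

def spectrogram_png_bytes(wav_bytes, size="224x224"):
    # Pipe the audio through ffmpeg; stdout holds the encoded PNG
    proc = subprocess.run(
        showspectrumpic_cmd(size),
        input=wav_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout

cmd = showspectrumpic_cmd()
# png = spectrogram_png_bytes(open("thabo.wav", "rb").read())
```

This keeps the whole round trip in memory: no intermediate WAV or PNG files are written.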

Upvotes: 1
