Reputation: 51
I'm trying to figure out why my mel-scale spectrogram seems to have the wrong frequency scale. I generate a 4096 Hz tone and plot it using librosa's display module, but the tone does not line up with 4096 Hz on the frequency axis. I'm obviously doing something wrong; can someone help? Thanks!
import numpy as np
import librosa.display
import matplotlib.pyplot as plt
sr = 44100
t = np.linspace(0, 1, sr)
y = 0.1 * np.sin(2 * np.pi * 4096 * t)
M = librosa.feature.melspectrogram(y=y, sr=sr)
M_db = librosa.power_to_db(M, ref=np.max)
librosa.display.specshow(M_db, y_axis='mel', x_axis='time')
plt.show()
Upvotes: 0
Views: 1760
Reputation: 5310
When you compute the mel spectrogram using librosa.feature.melspectrogram(y=y, sr=sr), you implicitly create a mel filter bank with the parameters fmin=0 and fmax=sr/2 (see the librosa docs). To plot the spectrogram correctly, librosa.display.specshow needs to know how it was created: which sample rate sr was used (to get the time axis right) and which frequency range was used (to get the frequency axis right). While librosa.feature.melspectrogram defaults to 0 to sr/2, librosa.display.specshow unfortunately defaults to 0 to 11025 Hz, i.e. half of its own default sr of 22050 (see the specshow docs). This describes librosa 0.8; I could imagine this changes in the future.
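To see how far off the axis labels end up, here is a rough NumPy-only sketch of the mismatch. It assumes the HTK-style mel formula for simplicity (librosa's default is a Slaney-style scale, so the exact numbers will differ), and the helper names hz_to_mel, mel_to_hz and displayed_frequency are made up for this illustration; they are not librosa functions.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula, used here only for illustration
    # (assumption: librosa's default scale is Slaney-style and gives other values)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel above
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def displayed_frequency(true_hz, fmax_computed, fmax_assumed):
    """Label a tone receives when the plot assumes a different fmax (fmin = 0)."""
    # Relative height of the tone on the mel axis the data was computed on
    position = hz_to_mel(true_hz) / hz_to_mel(fmax_computed)
    # The same relative height, read off an axis that ends at fmax_assumed
    return float(mel_to_hz(position * hz_to_mel(fmax_assumed)))

sr = 44100
mismatched = displayed_frequency(4096, fmax_computed=sr / 2, fmax_assumed=11025)
matched = displayed_frequency(4096, fmax_computed=sr / 2, fmax_assumed=sr / 2)

print(f"axis assumed to end at 11025 Hz: tone drawn near {mismatched:.0f} Hz")
print(f"axis assumed to end at sr/2:     tone drawn near {matched:.0f} Hz")
```

Under these assumptions the tone is drawn well below 4096 Hz when the axis range does not match the range used to compute the spectrogram, and exactly at 4096 Hz when both agree.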
To get this to work correctly, pass fmax explicitly to both calls. To also get the time axis right, pass the sr parameter to librosa.display.specshow:
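import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
sr = 44100
# endpoint=False keeps the sample spacing at exactly 1/sr
t = np.linspace(0, 1, sr, endpoint=False)
y = 0.1 * np.sin(2 * np.pi * 4096 * t)
# use the same fmax for computing and for plotting
M = librosa.feature.melspectrogram(y=y, sr=sr, fmax=sr/2)
M_db = librosa.power_to_db(M, ref=np.max)
# sr fixes the time axis, fmax fixes the frequency axis
librosa.display.specshow(M_db, sr=sr, y_axis='mel', x_axis='time', fmax=sr/2)
plt.show()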
Upvotes: 1