log mel spectrogram using librosa

Question

I have come across two different ways of generating log-mel spectrograms for audio files using librosa. I don't know why their final outputs differ, which one is "correct", or how different one is from the other.

#1

import librosa
import librosa.display

path = "path/to/my/file"
scale, sr = librosa.load(path)
# y and sr are passed by keyword; recent librosa versions reject them as positional arguments
mel_spectrogram = librosa.feature.melspectrogram(y=scale, sr=sr, n_fft=2048, hop_length=512, n_mels=10, fmax=8000)
log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
librosa.display.specshow(log_mel_spectrogram, x_axis="time", y_axis="mel", sr=sr)

#2

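import librosa
import librosa.display
import numpy as np

path = "path/to/my/file"
scale, sr = librosa.load(path)
X = librosa.stft(scale)  # complex STFT on a linear frequency axis
Xdb = librosa.amplitude_to_db(np.abs(X))  # magnitude converted to dB
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')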
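
One thing that may help when comparing the two snippets: #1 converts a power spectrogram to dB, while #2 converts a magnitude spectrogram to dB. The sketch below uses plain Python and two hypothetical helpers, `power_db` and `amplitude_db` (not librosa functions, just the textbook formulas with a small floor to avoid log of zero), to show that the two conversions agree when the power is the squared magnitude. If that holds for the real data too, the dB step is probably not why the pictures differ; the more likely cause is that #1 pools the spectrum into 10 mel bands up to 8000 Hz, whereas #2 keeps every linear-frequency STFT bin.

```python
import math

def power_db(power, ref=1.0, amin=1e-10):
    """Textbook power-to-dB: 10 * log10(power / ref), floored at amin."""
    return 10.0 * math.log10(max(amin, power)) - 10.0 * math.log10(max(amin, ref))

def amplitude_db(amplitude, ref=1.0, amin=1e-5):
    """Textbook amplitude-to-dB: 20 * log10(amplitude / ref), floored at amin."""
    return 20.0 * math.log10(max(amin, amplitude)) - 20.0 * math.log10(max(amin, ref))

amplitude = 0.25        # hypothetical STFT magnitude |X| for one bin
power = amplitude ** 2  # matching power value |X|^2

db_from_amplitude = amplitude_db(amplitude)
db_from_power = power_db(power)

# Both routes give the same dB value for the same underlying bin
print(db_from_amplitude, db_from_power)
```

So neither snippet is "wrong" on the dB side: each uses the conversion that matches its input, and they simply show different frequency representations.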

The respective images are:

EDIT: After setting the number of mel bins to 64 (n_mels=64), I obtain the spectrogram below:

If I want to process many such spectrograms, should I trim off the dark blue region above, since it is common to all of them? What does that dark region represent? Is it advisable to use the fmax parameter to trim it?
