Muhammad Umer Javaid

Reputation: 47

How to convert a WAV file into an RGB image with a mel spectrogram?

I am working on sound classification with WAV files ranging from 1 to 4 seconds in length. I want to convert each WAV file to a 224x224x3 image that I can feed into ResNet for classification. The conversion should use a mel spectrogram. Thanks for your help.

Upvotes: 2

Views: 2776

Answers (1)

Lukasz Tracewski

Reputation: 11377

You can use librosa to produce a mel spectrogram like this:

import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

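# Load the audio. librosa.util.example_audio_file() is gone from recent librosa
# releases; librosa.ex('trumpet') fetches an example clip instead.
y, sr = librosa.load(librosa.ex('trumpet'))  # or the path to your own WAV file
# Mel-scaled power spectrogram with 128 mel bands, capped at 8 kHz.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
# Convert power to decibels relative to the peak and draw it with a colormap.
librosa.display.specshow(librosa.power_to_db(S, ref=np.max), fmax=8000)
plt.savefig('mel.png')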

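Bear in mind, though, that the colours in the saved image are false colours produced by the colormap. A mel spectrogram is a single 2-D array of values, so RGB, or any other multi-channel representation, does not make sense here. Use an architecture that works with a single channel.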
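If you would rather skip the plot and work with the array directly, here is a minimal sketch of one way to do it. It assumes the input is the 2-D decibel array returned by librosa.power_to_db. The helper names spectrogram_to_image and to_three_channels are made up for illustration, and the nearest-neighbour resize is just the simplest option, not something librosa provides.

```python
import numpy as np

def spectrogram_to_image(S_db, size=(224, 224)):
    """Scale a 2-D dB spectrogram to uint8 and resize it to `size` (hypothetical helper)."""
    S_db = np.asarray(S_db, dtype=np.float64)
    lo, hi = S_db.min(), S_db.max()
    if hi == lo:
        # Constant input (e.g. silence): avoid dividing by zero.
        scaled = np.zeros_like(S_db)
    else:
        scaled = (S_db - lo) / (hi - lo)
    img = np.round(scaled * 255).astype(np.uint8)
    # Flip vertically so that low frequencies end up at the bottom of the image.
    img = img[::-1]
    # Nearest-neighbour resize: pick evenly spaced source rows and columns.
    rows = np.linspace(0, img.shape[0] - 1, size[0]).round().astype(int)
    cols = np.linspace(0, img.shape[1] - 1, size[1]).round().astype(int)
    return img[np.ix_(rows, cols)]

def to_three_channels(img):
    """Repeat the single channel three times (hypothetical helper).

    This adds no information; it only matches the input shape of a network
    that expects three channels.
    """
    return np.repeat(img[:, :, np.newaxis], 3, axis=2)
```

With the snippet above, spectrogram_to_image(librosa.power_to_db(S, ref=np.max)) gives a 224x224 single-channel array. The three-channel copy is only worth using if you cannot change the first layer of the network; all three channels are identical.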

Upvotes: 1
