Alexander Endresen

How do I visualise the Gini index of a mel spectrogram?

I have plotted a mel spectrogram from an audio file loaded with librosa and converted it to a logarithmic (decibel) scale:

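import librosa
import numpy as np
import pandas as pd

# 2. Compute the Mel spectrogram (audio_data, sr, n_fft, hop_length and n_mels are set earlier)
mel_spectrogram = librosa.feature.melspectrogram(y=audio_data, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)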

# 3. Convert to decibel scale (logarithmic scale)
mel_spectrogram_db = librosa.power_to_db(mel_spectrogram, ref=np.max)

# 4. Create a DataFrame from the Mel spectrogram
df = pd.DataFrame(mel_spectrogram_db)

I then found the minimum value, subtracted it from every value in the DataFrame, and squared the result so that all values are non-negative. I then tried computing a Gini index using:

def gini_index(data):
    n = len(data)
    if n == 0:
        return 0
    mean = np.mean(data)
    if np.sum(data) == 0:  # Check if the sum is zero to avoid division by zero
        return 0
    sorted_data = np.sort(data)  # Sort the data
    cumulative_values = np.cumsum(sorted_data)  # Cumulative sum of the sorted data
    gini = (2 * np.sum(cumulative_values) / np.sum(data) - (n + 1)) / n  # Gini calculation
    return gini
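
A note on the formula, offered as a sketch rather than a definitive fix: with the data sorted in ascending order, the sum of the cumulative sums weights the smallest values most heavily, so the expression `(2 * sum(cumsum) / sum - (n + 1)) / n` evaluates to the negative of the conventional Gini coefficient. The version below flips the sign and assumes non-negative input. The name `gini_coefficient` is hypothetical and not part of the original code.

```python
import numpy as np

def gini_coefficient(data):
    """Gini coefficient of non-negative values: 0 = perfectly even, near 1 = concentrated."""
    values = np.sort(np.asarray(data, dtype=float).ravel())  # ascending order
    n = values.size
    total = values.sum()
    if n == 0 or total == 0:
        return 0.0  # empty or all-zero input: treat as perfectly even
    cumulative = np.cumsum(values)
    # Same terms as the original expression, with the sign flipped
    return (n + 1 - 2 * cumulative.sum() / total) / n

print(gini_coefficient([1, 1, 1, 1]))  # 0.0  (all values equal)
print(gini_coefficient([0, 0, 0, 1]))  # 0.75 (everything in one value)
```

With the original sign, the same two inputs give 0.0 and -0.75, which would make a heatmap look inverted.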

I then tried to split the spectrogram into blocks and apply the Gini index to each block:

# Parameters for blocks
block_rows = 10  # Number of rows in each block (frequency bands)
block_cols = 20 # Number of columns in each block (time frames)

# Calculate the number of blocks in each dimension
num_blocks_row = mel_spectrogram_db.shape[0] // block_rows  # Number of vertical blocks
num_blocks_col = mel_spectrogram_db.shape[1] // block_cols  # Number of horizontal blocks

# Initialize a 2D array to hold the Gini indices for each block
gini_indices = np.zeros((num_blocks_row, num_blocks_col))

# Calculate Gini index for each block
for i in range(num_blocks_row):
    for j in range(num_blocks_col):
        # Extract the current block
        block = mel_spectrogram_db[i * block_rows:(i + 1) * block_rows, j * block_cols:(j + 1) * block_cols]
        # Flatten the block and calculate the Gini index
        gini_indices[i, j] = gini_index(block.flatten())

However, the Gini index heatmap appears to be inverted. On the mel spectrogram I can clearly see brighter areas and patterns for the sounds I am trying to isolate, but the Gini heatmap does not highlight them.
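
One possible contributor, stated as a guess: the block loop above reads from `mel_spectrogram_db`, whose values are at or below 0 dB because of `ref=np.max`, rather than from the shifted and squared data, and the Gini index is only meaningful for non-negative values. The sketch below applies the shift first and then computes a block-wise Gini map. It uses a synthetic array in place of the real spectrogram. `blockwise_gini`, `gini_coefficient` and the synthetic data are hypothetical and not part of the original code.

```python
import numpy as np

def gini_coefficient(data):
    """Gini coefficient of non-negative values (0 = even, near 1 = concentrated)."""
    values = np.sort(np.asarray(data, dtype=float).ravel())
    n, total = values.size, values.sum()
    if n == 0 or total == 0:
        return 0.0
    return (n + 1 - 2 * np.cumsum(values).sum() / total) / n

def blockwise_gini(spec, block_rows, block_cols):
    """Gini coefficient of each (block_rows x block_cols) tile; leftover edges are dropped."""
    n_row = spec.shape[0] // block_rows
    n_col = spec.shape[1] // block_cols
    out = np.zeros((n_row, n_col))
    for i in range(n_row):
        for j in range(n_col):
            block = spec[i * block_rows:(i + 1) * block_rows,
                         j * block_cols:(j + 1) * block_cols]
            out[i, j] = gini_coefficient(block)
    return out

# Synthetic stand-in for mel_spectrogram_db: a -80 dB floor with one loud cell
spec_db = np.full((20, 40), -80.0)
spec_db[3, 25] = 0.0

# Shift so the minimum is 0, then square, as described in the question
shifted = (spec_db - spec_db.min()) ** 2

gini_map = blockwise_gini(shifted, block_rows=10, block_cols=20)
print(gini_map)  # only the tile containing the loud cell is non-zero
```

Keep in mind that the Gini index measures how unevenly energy is spread within a block, not how loud the block is, so a uniformly bright block scores near 0 just like a silent one. When plotting `gini_map` with matplotlib's `imshow`, passing `origin="lower"` puts row 0 (the lowest mel band) at the bottom, which matches the orientation used by `librosa.display.specshow`; with the default origin the heatmap is flipped vertically.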
