user2417339

Normalize each individual frequency after FFT

I am writing a program that visually displays the frequencies that make up an audio signal. To do this, I run an FFT on a rolling window (code is at the bottom). What caught my attention is that all my plots show high power near the 0 Hz end and low power near the 10 kHz end:

[Two screenshots of the power-vs-frequency plot; images not available]

I want to change the graph so that, instead of showing the actual power, it shows the power at each frequency relative to the power at that same frequency over the rest of the song. For example, the maximum value at 0 Hz should be the same as the maximum value at 10 kHz. This would mean reducing the y values at low frequencies and raising them at high frequencies. How would I go about doing this, so that the graph no longer looks like it is on a downward slope?

As a side note, once the algorithm works I am going to stream the audio rather than read it from an audio file, which may rule out keeping an average of each frequency over the entire song.

% Prototype to graph moving window of FFT through an audio file in real time
clear all;
warning('off','MATLAB:colon:nonIntegerIndex'); % Suppress non-integer colon operand warnings because it's ok to round for the window size

% Read audio
[full, fs] = audioread('O.mp3');   % samples and sample frequency (Hz) taken from the file
full = full(:,1);                  % use the first channel if the file is stereo

% Remove leading 0's and select range
for i = 1:fs
    if full(i) ~= 0
        crop = i;
        break
    end
end
full = full(crop:end);

startTime = 0;
endTime = length(full)/fs;

% Play song
tic
player = audioplayer(full(fs*startTime+1:fs*endTime), fs);
player.play();
initialTime = toc;

% Perform fft and get frequencies (hopefully in realish time with audio)
windowSize = fs/8;
for i = windowSize/2+1+fs*startTime : fs/16 : fs*endTime
    beginningChunk = round(i-windowSize/2);
    endChunk = round(i+windowSize/2);

    x = full(beginningChunk:endChunk);
    y = fft(x);
    n = length(x);          % number of samples in chunk
    power = abs(y).^2/n;    % power of the DFT
    power = power(1:end/2);
    f = (0:n-1)*(fs/n);     % frequency range
    f = f(1:end/2);
    while initialTime+i/fs > toc
        pause(.0001);
    end
    figure(1);
    plot(f,power);
    axis([0 10000 0 5]);
    xlabel('Frequency');
    ylabel('Power');
end

Upvotes: 1

Views: 970

Answers (1)

Brendan Frick

Reputation: 1025

Represent power as a z-score

Prepare [t,f] samples for baseline

winrange = windowSize/2+1+fs*startTime : fs/16 : fs*endTime;
for i = winrange
   % ...
   % base_pow is [# of time bins, # of freq bins]
   base_pow(i == winrange,:) = power(1:end/2);
   % ...
end
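
A minimal sketch of the baseline pass, assuming a mono signal and an integer window length so that every row of base_pow has the same number of frequency bins. (The question's fs/8 window is not a whole number of samples, so chunk lengths can differ by one after rounding.) The synthetic signal and the names sig, hop, nBins, and starts are illustrative assumptions, not part of the original code.

```matlab
% Sketch: collect a [time bins x frequency bins] baseline from fixed-length windows.
% Assumptions (not from the original post): synthetic mono signal, integer
% window/hop sizes, and the names sig, hop, nBins, starts.
rng(0);                              % reproducible random data
fs  = 44100;                         % sample rate (Hz)
sig = randn(2*fs, 1);                % stand-in for the audio: 2 s of noise
windowSize = 2*floor(fs/16);         % even, integer window length (5512 samples)
hop    = windowSize/2;               % 50% overlap, as in the question
nBins  = windowSize/2;               % keep the non-negative-frequency half
starts = 1 : hop : numel(sig) - windowSize + 1;

base_pow = zeros(numel(starts), nBins);   % preallocate the baseline matrix
for k = 1:numel(starts)
    x = sig(starts(k) : starts(k) + windowSize - 1);
    y = fft(x);
    power = abs(y).^2 / windowSize;       % full-length power spectrum
    base_pow(k, :) = power(1:nBins).';    % store this window as one row
end
```

Indexing by the loop counter k rather than by `i == winrange` avoids a vector comparison on every iteration and makes preallocation straightforward.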

Normalize each sample by baseline data

for i = winrange
   % ...
   raw_pow = power(1:end/2);
   raw_pow = raw_pow(:).';   % row vector, to match the 1 x (# of freq bins) mean and std
   % collapse across 1st dimension of base_pow
   norm_pow = (raw_pow - mean(base_pow,1))./std(base_pow,[],1);
   % ...
end
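
To see what the z-score does to the downward slope, here is a small self-contained sketch on synthetic data. It assumes a made-up base_pow matrix whose power falls off with frequency, roughly like the plots in the question; the sizes nT and nF and the names slope, mu, and sigma are hypothetical.

```matlab
% Sketch: z-score each frequency bin against its own history.
% Assumption: base_pow is a synthetic [time bins x frequency bins] matrix
% whose power falls off with frequency, mimicking the plots in the question.
rng(0);                                      % reproducible random data
nT = 200; nF = 64;                           % hypothetical sizes
slope    = 1 ./ (1:nF);                      % power decays with frequency
base_pow = abs(randn(nT, nF)).^2 .* slope;   % implicit expansion (R2016b or later)

mu    = mean(base_pow, 1);                   % 1 x nF, mean per frequency bin
sigma = std(base_pow, [], 1);                % 1 x nF, std per frequency bin
sigma(sigma == 0) = 1;                       % avoid dividing by zero in silent bins

norm_pow = (base_pow - mu) ./ sigma;         % every column now has mean 0, std 1
```

A single new window, stored as a 1 x nF row raw_pow, would be normalized the same way with `(raw_pow - mu) ./ sigma`. Because each bin is scaled by its own statistics, low and high frequencies end up on the same scale.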

Signal streaming

The solution above needs the whole recording up front, because the mean and standard deviation are computed over every window. For streaming, a more computationally efficient method would be to fit some function to the spectra of several audio tracks beforehand and normalize each incoming window by that curve.
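
One possible way to do the curve fit, as a rough sketch only: average the power per bin over the training tracks, fit a low-order polynomial to it in the log-log domain, and subtract that curve (in dB) from each live spectrum. The 1/f-shaped synthetic training data, the 2nd-order polynomial, and the names train_pow, coeffs, ref_db, and flat_db are all assumptions made for illustration.

```matlab
% Sketch: fit a smooth reference curve offline, then flatten live spectra with it.
% Assumptions (not from the original answer): a 1/f-shaped synthetic training
% spectrum, a 2nd-order polynomial in log10(frequency), and the names
% train_pow, coeffs, ref_db, flat_db.
rng(0);                                          % reproducible random data
fs = 44100; nBins = 2048;
f  = (1:nBins) * (fs/2) / nBins;                 % bin frequencies, skipping 0 Hz

% Stand-in for the mean power per bin measured over several tracks.
train_pow = (1 ./ f) .* (1 + 0.1*rand(1, nBins));

% Fit in the log-log domain, where a spectral tilt is close to a straight line.
logf   = log10(f);
coeffs = polyfit(logf, 10*log10(train_pow), 2);

% At streaming time: subtract the fitted curve (in dB) from each new spectrum.
ref_db  = polyval(coeffs, logf);                 % reference level per bin
raw_pow = (1 ./ f) .* (1 + 0.1*rand(1, nBins));  % hypothetical live window
flat_db = 10*log10(raw_pow) - ref_db;            % roughly flat across frequency
```

This removes only the average tilt. Unlike the z-score, it does not equalize the spread of each bin, so some bins may still fluctuate more than others.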

Upvotes: 2
