I am writing a program that visually displays the frequencies that make up an audio signal. To do this, I perform an FFT on a rolling window (code at the bottom). What caught my attention is that all my plots show high power near the 0 Hz end and low power near the 10 kHz end.
I want to change the graph so that, instead of showing the actual power, it shows the power relative to the power at the same frequency over the rest of the song. For example, I want the maximum value at 0 Hz to be the same as the maximum value at 10 kHz. This would mean reducing the y value at low frequencies and raising it at high frequencies. How would I go about doing this so that the graph no longer looks like it is on a downward slope?
As a side note, once I get the algorithm working I am going to stream the audio rather than read it from an audio file, so that may rule out keeping an average of each frequency over the entire song.
% Prototype to graph moving window of FFT through an audio file in real time
clear all;
warning('off','MATLAB:colon:nonIntegerIndex'); % Suppress non-integer operand warnings because it's ok to round for the window size
% Read Audio
fs = 44100; % sample frequency (Hz)
full = audioread('O.mp3');
% Remove leading 0's and select range
for i = 1:fs
    if full(i) ~= 0
        crop = i;
        break
    end
end
full = full(crop:end);
startTime = 0;
endTime = length(full)/fs;
% Play song
tic
player = audioplayer(full(fs*startTime+1:fs*endTime), fs);
player.play();
initialTime = toc;
% Perform fft and get frequencies (hopefully in realish time with audio)
windowSize = fs/8;
% Stop half a window before the end so endChunk never runs past the data
for i = windowSize/2+1+fs*startTime : fs/16 : fs*endTime-windowSize/2
    beginningChunk = round(i-windowSize/2);
    endChunk = round(i+windowSize/2);
    x = full(beginningChunk:endChunk);
    y = fft(x);
    n = length(x); % number of samples in chunk
    power = abs(y).^2/n; % power of the DFT
    power = power(1:end/2); % keep the non-negative-frequency half
    f = (0:n-1)*(fs/n); % frequency range
    f = f(1:end/2);
    % Wait until playback has reached the centre of this window
    while initialTime+i/fs > toc
        pause(.0001);
    end
    figure(1);
    plot(f,power);
    axis([0 10000 0 5]);
    xlabel('Frequency');
    ylabel('Power');
end
Upvotes: 1
Views: 970
Reputation: 1025
Prepare [t,f] samples for baseline
winrange = windowSize/2+1+fs*startTime : fs/16 : fs*endTime;
for i = winrange
    ...
    % base_pow is [# of time bins,# freq bins]
    base_pow(i == winrange,:) = power(1:end/2);
    ...
end
Normalize each sample by baseline data
for i = winrange
    ...
    raw_pow = power(1:end/2);
    raw_pow = raw_pow(:).'; % force a row so it lines up with the per-frequency statistics
    % collapse across 1st dimension of base_pow
    norm_pow = (raw_pow - mean(base_pow,1))./std(base_pow,[],1);
    ...
end
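The two loops above can be tried end to end on synthetic data. The following is a minimal sketch, not the exact code from the question: the test signal, the sample rate, the window length and the names `starts`, `nbins`, `mu` and `sigma` are all assumptions made for illustration. It builds the baseline matrix and then z-scores each frequency bin against its own mean and standard deviation over time.

```matlab
% Sketch: z-score every frequency bin against a baseline built from all windows.
% The signal and all parameter values below are illustrative assumptions.
fs = 8000;                          % sample rate in Hz (assumed)
t  = (0:2*fs-1).' / fs;             % two seconds of sample times, column vector
% Loud 200 Hz tone, quiet 3000 Hz tone, and a little noise so no bin is constant
x  = sin(2*pi*200*t) + 0.01*sin(2*pi*3000*t) + 0.01*randn(size(t));

n      = 1000;                      % window length in samples
hop    = n/2;                       % step between window starts
starts = 1:hop:(numel(x) - n + 1);  % first sample of every window
nbins  = n/2;                       % keep the non-negative-frequency half

% Pass 1: collect the baseline, one row per window, one column per bin
base_pow = zeros(numel(starts), nbins);
for k = 1:numel(starts)
    seg = x(starts(k) : starts(k) + n - 1);
    p   = abs(fft(seg)).^2 / n;     % power of the DFT, column vector
    base_pow(k, :) = p(1:nbins).';  % store as a row
end

mu    = mean(base_pow, 1);          % per-bin mean over time
sigma = std(base_pow, 0, 1);        % per-bin standard deviation over time

% Pass 2: normalize each window. The stored rows are reused here; in the
% layout above, this loop would recompute the FFT for each window instead.
norm_pow = zeros(size(base_pow));
for k = 1:numel(starts)
    norm_pow(k, :) = (base_pow(k, :) - mu) ./ sigma;
end
```

With 8 Hz per bin, the 200 Hz tone lands in column 26 and the 3000 Hz tone in column 376. Their raw powers differ by roughly four orders of magnitude, but after normalization both columns have zero mean and unit standard deviation, so they occupy the same vertical range when plotted.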
Signal streaming
The solution above needs the full recording before it can normalize anything, so it does not work on a live stream. A more computationally efficient approach is to fit a function to the spectra of several audio tracks ahead of time, before streaming any sample, and use that fitted curve for normalization.
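One possible way to build such a curve is sketched below. It assumes the baseline spectrum falls off roughly as a power law, so a straight line fitted in log-log space with `polyfit` describes it. The baseline here is synthetic, and the window length, the 1/f^2 shape and the names `base_mean`, `curve` and `flat_pow` are hypothetical. A real baseline would come from averaging spectra of actual tracks, and a different model may fit better. Note that dividing by the curve gives a ratio to the expected power, not the z-score used earlier.

```matlab
% Sketch: fit a power-law curve to a baseline spectrum offline, then use it
% to flatten each streamed window. All data below are synthetic assumptions.
fs = 44100;                         % sample rate in Hz, as in the question
n  = 4096;                          % window length in samples (assumed)
f  = (0:n/2-1) * (fs/n);            % bin frequencies for the kept half

% Stand-in for a mean power spectrum measured on several tracks:
% a 1/f^2 roll-off, chosen purely for illustration
base_mean = zeros(size(f));
base_mean(2:end) = 100 ./ f(2:end).^2;
base_mean(1) = base_mean(2);        % placeholder for the DC bin

% Fit a straight line in log-log space, skipping DC because log10(0) = -Inf
fit_idx = 2:numel(f);
coef = polyfit(log10(f(fit_idx)), log10(base_mean(fit_idx)), 1);

% Evaluate the fitted curve once and reuse it for every streamed window
curve = ones(size(f));
curve(fit_idx) = 10 .^ polyval(coef, log10(f(fit_idx)));
curve(1) = curve(2);                % reuse the neighbouring value for DC

% While streaming: divide the current window's power by the curve
power_now = 2 * base_mean;          % pretend this window is twice as loud as baseline
flat_pow  = power_now ./ curve;     % about 2 in every bin, with no downward slope
```

Because the curve is computed once up front, the per-window cost while streaming is a single element-wise division.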
Upvotes: 2