user2482542
user2482542

Reputation: 361

Plotting Harmonic Product Spectrum using MATLAB

I used Harmonic Product Spectrum to find the fundamental note present when a number of harmonics are present. This is the code I've implemented;

[song,FS] = wavread('C major.wav');
%sound(song,FS);

P = 20000;
N=length(song);                     % length of song
t=0:1/FS:(N-1)/FS;                  % define time period

song = sum(song,2);
song=abs(song);

%----------------------Finding the envelope of the signal-----------------%
% Gaussian Filter
w = linspace( -1, 1, P);                      % create a vector of P values between -1 and 1 inclusive
sigma = 0.335;                                % standard deviation used in Gaussian formula
myFilter = -w .* exp( -(w.^2)/(2*sigma.^2));  % compute first derivative, but leave constants out
myFilter = myFilter / sum( abs( myFilter ) ); % normalize

% fft convolution
myFilter = myFilter(:);                         % create a column vector
song(length(song)+length(myFilter)-1) = 0;      %zero pad song
myFilter(length(song)) = 0;                     %zero pad myFilter

edges =ifft(fft(song).*fft(myFilter));
tedges=edges(P:N+P-1);                      % shift by P/2 so peaks line up w/ edges
tedges=tedges/max(abs(tedges));                 % normalize

%---------------------------Onset Detection-------------------------------%
% Finding peaks
maxtab = [];
mintab = [];
x = (1:length(tedges));
min1 = Inf;
max1 = -Inf;
min_pos = NaN; 
max_pos = NaN;

lookformax = 1;
for i=1:length(tedges)

    peak = tedges(i:i);
  if peak > max1, 
      max1 = peak;
      max_pos = x(i); 
  end
  if peak < min1, 
      min1 = peak;
      min_pos = x(i); 
  end

  if lookformax
    if peak < max1-0.07
      maxtab = [maxtab ; max_pos max1];
      min1 = peak; 
      min_pos = x(i);
      lookformax = 0;
    end  
  else
    if peak > min1+0.08
      mintab = [mintab ; min_pos min1];
      max1 = peak; 
      max_pos = x(i);
      lookformax = 1;
    end
  end
end


max_col = maxtab(:,1);
peaks_det = max_col/FS;
No_of_peaks = length(peaks_det);

[song,FS] = wavread('C major.wav');
song = sum(song,2);

%---------------------------Performing STFT--------------------------------%
h = 1;
%for i = 2:No_of_peaks

    song_seg = song(max_col(7-1):max_col(7)-1);
    L = length(song_seg); 
    NFFT = 2^nextpow2(L); % Next power of 2 from length of y
    seg_fft = fft(song_seg,NFFT);%/L;

    f = FS/2*linspace(0,1,NFFT/2+1);
    seg_fft_2 = 2*abs(seg_fft(1:NFFT/2+1));
    L5 = length(song_seg);

    figure(6)
    plot(f,seg_fft_2)
    %plot(1:L/2,seg_fft(1:L/2))
    title('Frequency spectrum of signal (seg_fft)')
    xlabel('Frequency (Hz)')
    xlim([0 2500])
    ylabel('|Y(f)|')
    ylim([0 500])

%----------------Performing Harmonic Product Spectrum---------------------%

   % In harmonic prodcut spectrum, you downsample the fft data several times and multiply all those with the original fft data to get the maximum peak. 
    %HPS
    seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 
    seg_fft = abs(seg_fft);
    a = length(seg_fft);


    seg_fft2 = ones(size(seg_fft));
    seg_fft3 = ones(size(seg_fft));
    seg_fft4 = ones(size(seg_fft));
    seg_fft5 = ones(size(seg_fft));



    for i = 1:((length(seg_fft)-1)/2)
        seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
    end

     %b= size(seg_fft2)

    L1 = length(seg_fft2); 
    NFFT1 = 2^nextpow2(L1); % Next power of 2 from length of y


    f1 = FS/2*linspace(0,1,NFFT1/2+1);
    seg_fft12 = 2*abs(seg_fft2(1:NFFT1/2+1));

    figure(7);
    plot(f1,seg_fft12)
    title('Frequency spectrum of signal (seg_fft2)')
    xlabel('Frequency (Hz)')
    xlim([0 2500])
    ylabel('|Y(f)|')
    ylim([0 500])

This is the plot for Figure 6enter image description here

So in the real scenario once I perform HPS (downsample by 2) the peak at 440.1 should shift down to 220 while the peak at 881 should shift down to around 440. But when I plot the graph that is not what I get. Insetad this is the graph I get, enter image description here

Why don't I get the correct graph???? I don't seem to understand what Im doing wrong here... Can someone please have a look and let me know.. Thank you.....

Upvotes: 0

Views: 2666

Answers (1)

Floris
Floris

Reputation: 46375

The problem with your downsampling is that you trim the vector by 2x before doing the downsampling, instead of after. You do

seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 

% [... other stuff ...]
for i = 1:((length(seg_fft)-1)/2)
    seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end

Instead you need to downsample first, then trim:

for i = 1:((length(seg_fft)-1)/2)
    seg_fft2(i,1) = seg_fft(2*i,1);%(seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end

seg_fft = seg_fft(1 : size(seg_fft,1)/2 ); 

UPDATE you asked why this does not preserve the peaks. The short answer is that you may not be "looking at" the peaks. If you want to keep (nearest) peaks during downsampling by n, you can do the following:

n = 3; % degree of decimation or downsampling we want to do
N = size(seg_fft, 1); % number of samples in original FFT
Nn = n * floor(N/n);  % number of samples that can be divided by n

fftBlock = reshape(seg_fft(1:Nn, 1), n, N);
fftResampled = max(fftBlock);

How does this work? Let's use a simple example of 10 x 1 points:

seg_fft = [0 1 10 5 4 3 6 12 4 3];

We want "every 3rd". Naive algorithm would give

fftResampled = [2 3 7];

But we would have liked the "peaks" [10 3 12] - unfortunately they are not in the right location.

After reshaping the array (and losing the last element; if it might be an interesting value we could append and pad with zeros) we get:

fftBlock = [0  5  6;
            1  4  3;
           10  3  4];

(Remember that Matlab matrices are rows-first)

Now taking the max (the function will operate along the first dimension unless we tell it otherwise) you get

fftResampled = [10 5 6];

i.e. always the peak. While this preserves the peaks, it does mean that your "valleys" are getting filled in a bit.

Bottom line: there is no way not to destroy "some" information in the process of downsampling - you are, after all, throwing half the samples away. What you keep, and how you account for the information content in the data you discard, is something only you can decide, as it will depend on your application, and what is important for you.

Upvotes: 1

Related Questions