Mughees Ismail

Reputation: 21

How to make sound signal length the same in MATLAB?

I downloaded this speech recognition code from a blog. It works fine: it asks you to record sounds to create a dataset, and then you call a function to train the system using neural networks.

I want to use this code to train using my dataset of 20 words that I want to recognise.

Problem: I have a dataset of 800 files for twenty words, i.e. 40 recordings from different people for each word. I used Windows Sound Recorder to collect the files. The problem is that the code assumes the input signal ALWAYS has 8000 samples, while the recordings in my dataset are not a constant length: some files are 2 seconds long and some are 3, which means the number of samples differs from file to file.

If the number of samples per input signal varies, it will probably generate errors. I want to use my files to train the system. How do I do that?

Code:

clc; clear all;
load('voicetrainfinal.mat');
Fs=8000;
for l=1:20
    clear y1 y2 y3;
    display('record voice');
    pause();
    x=wavrecord(Fs,Fs);     % wavrecord(n,Fs) records n samples at a sampling rate of Fs
    maxval = max(x);
    if maxval<0.04
        display('Threshold value is too large!');
    end
    t=0.04;
    j=1;
    for i=1:8000
        if(abs(x(i))>t)
            y1(j)=x(i);
            j=j+1;
        end
    end
    y2=y1/(max(abs(y1)));
    y3=[y2,zeros(1,3120-length(y2))];
    y=filter([1 -0.9],1,y3'); % high pass filter to boost the high frequency components
    % frame blocking
    blocklen=240; % 30ms block
    overlap=80;
    block(1,:)=y(1:240);
    for i=1:18
        block(i+1,:)=y(i*160:(i*160+blocklen-1));
    end
    w=hamming(blocklen);
    for i=1:19
        a=xcorr((block(i,:).*w'),12); % finding auto correlation from lag -12 to 12
        for j=1:12
            auto(j,:)=fliplr(a(j+1:j+12)); % forming autocorrelation matrix from lag 0 to 11
        end
        z=fliplr(a(1:12)); % forming a column matrix of autocorrelations for lags 1 to 12
        alpha=pinv(auto)*z';
        lpc(:,i)=alpha;
    end
    wavplay(x,Fs);
    X1=reshape(lpc,1,228);
    a1=sigmoid(Theta1*[1;X1']);
    h=sigmoid(Theta2*[1;a1]);
    m=max(h);
    p1=find(h==m);
    if(p1==10)
        P=0
    else
        P=p1
    end
end

Upvotes: 1

Views: 842

Answers (2)

Bilal Maqsood

Reputation: 11

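n = noOfFiles;   % number of recordings; filedata{k} holds the samples of recording k
for k = 1:n
    % Row k holds recording k. MATLAB fills the unassigned tail of shorter rows with zeros.
    M(k, 1:length(filedata{k})) = filedata{k};
end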

:P

Upvotes: 1

vrleboss

Reputation: 471

In your code you have:

Fs=8000;
wavrecord(n,Fs) % records n samples at a sampling rate Fs
for i=1:8000
  if(abs(x(i))>t)
      y1(j)=x(i);
      j=j+1;
  end
end

It seems that instead of recording you are going to import your sound file (here for a .wav file):

[x, Fs] = wavread(filename);

Instead of hardcoding the value 8000, you can read the length of your file:

n = length(x);

and then just use that n variable in the for loop:

for i=1:n
  if(abs(x(i))>t)
      y1(j)=x(i);
      j=j+1;
  end
end
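
As a sketch, the same thresholding can also be written with logical indexing, which never refers to the signal length at all. The short vector x below is only a stand-in for the samples read from a file, and t is the threshold from the question's code:

```matlab
% Stand-in signal (assumption: replace with the samples read from your file).
x = [0.01; -0.20; 0.03; 0.50; -0.02; 0.10];
t = 0.04;                                % same threshold as in the question

% Keep only the samples whose magnitude exceeds the threshold.
% reshape(..., 1, []) returns a row vector, matching what the loop builds.
y1 = reshape(x(abs(x) > t), 1, []);
```

This should give the same y1 as the loop for a signal of any length.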

The rest of the code seems to be independent of that 8000 value. If you are worried about the files having different lengths, compute n_max, the maximum length of all your audio recordings, and zero-pad every recording shorter than n_max samples so that they are all n_max samples long.

files = {'file1', 'file2'};   % placeholder names; list all of your files here
n_max = 0;
for k = 1:numel(files)
  [y, Fs] = wavread(files{k});
  n_max = max(n_max, length(y));
end

Then each time you process a sound vector you can pad it with zeros (harmless for you, because 0 means no sound) like so:

y = [y; zeros(n_max - length(y), 1)];   % y is a column vector (mono), as returned by wavread
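
Putting the padding idea together, here is a minimal sketch that forces every recording to one common length. It assumes the recordings are already loaded into a cell array; the names signals, n_fixed and M are hypothetical, and the short vectors stand in for real audio. Using a fixed target length that also truncates longer recordings is an assumption made here; set n_fixed to n_max if you only want padding. On newer MATLAB releases, audioread replaces wavread for loading the files.

```matlab
% Stand-in data (assumption): in practice each cell would hold the samples
% of one file, as returned by wavread or audioread.
signals = {[0.1; 0.2; 0.3], [0.4; 0.5], [0.6; 0.7; 0.8; 0.9; 1.0]};

n_fixed = 4;                           % hypothetical target length in samples
M = zeros(numel(signals), n_fixed);    % one row per recording, pre-filled with zeros

for k = 1:numel(signals)
    y = signals{k}(:);                 % force a column vector
    n = min(numel(y), n_fixed);        % number of samples that fit in a row
    M(k, 1:n) = y(1:n).';              % copy them; the rest of the row stays zero
end
```

Each row of M then has exactly n_fixed samples, so the rest of the processing sees a constant input size.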

Upvotes: 1
