Reputation: 2674
I'm trying to implement an MFCC algorithm based on this paper I found (http://arxiv.org/pdf/1003.4083.pdf). The steps I have so far are:
step 1) Pre–emphasis
step 2) Framing
step 3) Hamming windowing
step 4) Fast Fourier Transform
step 5) Mel Filter Bank Processing
step 6) Discrete Cosine Transform
Basically, I took the mel filter bank and multiplied the filters with the actual raw signal. I then performed the FFT on these results, which looks like this:
FFT on Frame 1:
I then computed the DCT of the FFT output, and the results look like this:
DCT on Frame 1:
Does this look correct so far? Is there even a way for me to check this, so that I know that I am going in the right direction?
Also, I need to get 13 coefficients, but I do not know how to determine which ones to take. I get 256 values, so do I take the first 13 of them? Or do I take the total energy?
I hope someone can help me.
Upvotes: 1
Views: 12126
Reputation: 6351
I'm confused about what you just wrote. The only thing I need to know is: I have split the signal into frames, n = 100, m = 256 (I believe), which produces around 390 blocks. So, are there 13 coefficients for each of the blocks, or just 13 for the entire sound file?
The answer is that there are 13 coefficients for each block, not for the entire sound file.
Also, your way of calculating the MFCC coefficients is wrong. You should follow steps 1-6 that you mentioned, in that order:
step 1) Pre–emphasis for the entire sound file.
step 2) Framing the entire sound file to get many blocks
step 3) Hamming windowing for each block
step 4) Fast Fourier Transform for each block
step 5) Mel Filter Bank Processing for each block
step 6) Discrete Cosine Transform for each block
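As a rough illustration of steps 1 to 4, here is a minimal NumPy sketch. The function name `frame_power_spectra`, the pre-emphasis coefficient 0.97, and the random test signal are assumptions made for this example; the frame length of 256 and hop of 100 are the m and n values from the question.

```python
import numpy as np

def frame_power_spectra(signal, frame_len=256, hop=100, preemph=0.97):
    """Steps 1-4: return one power spectrum per frame (hypothetical helper)."""
    # Step 1: pre-emphasis is applied once, to the entire signal.
    emphasized = np.append(signal[0], signal[1:] - preemph * signal[:-1])

    # Step 2: cut the signal into overlapping frames (blocks).
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]                      # shape (n_frames, frame_len)

    # Step 3: Hamming window, applied to every frame.
    frames = frames * np.hamming(frame_len)

    # Step 4: FFT of every frame, converted to a power spectrum.
    spectrum = np.fft.rfft(frames, n=frame_len, axis=1)
    return (np.abs(spectrum) ** 2) / frame_len    # shape (n_frames, frame_len // 2 + 1)

# Assumed test input: random noise whose length yields 390 frames.
rng = np.random.default_rng(0)
signal = rng.standard_normal(39156)
power = frame_power_spectra(signal)
print(power.shape)
```

Each row of `power` belongs to one block, so everything that follows (mel filter bank, DCT) is also computed per block.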
Upvotes: 4
Reputation: 1377
After days of searching for something similar, I stumbled upon a very useful tutorial on how to compute the MFC coefficients: Mel Frequency Cepstral Coefficient (MFCC) tutorial
(although the thread is old, I hope the answer might help future readers)
Upvotes: 9
Reputation: 25220
No, you are wrong.
You need to compute the logarithm of the mel filter bank energies after the FFT, and only then apply the DCT. The number of filter bank energies should be about 20 to 40, so after the DCT you get 20 to 40 numbers per frame, and you keep the first 13.
What you did with the FFT is all wrong.
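To make that order concrete, here is a minimal NumPy sketch that starts from per-frame power spectra, applies a mel filter bank, takes the logarithm, applies a DCT-II, and keeps the first 13 values. The helper names, the 16 kHz sample rate, the choice of 26 filters, the small floor used before the logarithm, and the random input are all assumptions for illustration, not taken from the paper or from sphinxbase.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):             # rising edge
            bank[i, k] = (k - left) / (center - left)
        for k in range(center, right):            # falling edge
            bank[i, k] = (right - k) / (right - center)
    return bank

def dct_ii(x):
    """Unnormalized DCT-II along the last axis."""
    n = x.shape[-1]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc_from_power(power, sample_rate=16000, n_filters=26, n_coeffs=13):
    n_fft = 2 * (power.shape[1] - 1)
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    energies = power @ bank.T                     # (n_frames, n_filters)
    log_energies = np.log(np.maximum(energies, 1e-10))  # floor avoids log(0)
    return dct_ii(log_energies)[:, :n_coeffs]     # keep the first 13 per frame

# Assumed input: 390 frames of a 256-point power spectrum (129 bins each).
rng = np.random.default_rng(0)
power = rng.random((390, 129))
mfcc = mfcc_from_power(power)
print(mfcc.shape)
```

Note that the DCT input has only `n_filters` values per frame, which is why the output has 26 numbers here rather than 256, and the first 13 of those are kept for each frame.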
You might want to read some existing MFCC code instead of writing everything from scratch. There are many implementations out there, for example in sphinxbase:
http://cmusphinx.sourceforge.net
Upvotes: 2