Reputation: 790
I have MFCC (Mel-frequency cepstral coefficient) files generated by HTK from .wav files. I need to extract a time span from an MFCC file. For example, if the MFCC file represents 90 minutes of audio, I want to get the MFCCs for just the third minute.
The HTK book says the MFCC file consists of a header and a contiguous sequence of samples. But determining the exact size of a sample in bytes doesn't seem trivial.
Is there perhaps a parser for the files? (Of course there is, in HTK, but I didn't manage to figure out how to use the binaries for this task.)
Or is there maybe an easy way to determine the size of a sample and of the header, so that I can simply cut the file apart?
Upvotes: 2
Views: 1865
Reputation: 790
Figured it out. HTK has a tool for that. HCopy can convert MFCC to MFCC and accepts parameters for start and end.
HCopy -C config0 -s 10e7 -e 11e7 source.mfcc target.mfcc
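cuts the span from 00:10 to 00:11 (seconds 10 to 11) out of source.mfcc. The -s and -e values are given in HTK's time unit of 100 ns, so 10e7 means 10 seconds.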
config0 should contain the same configuration that was used to create the original MFCC files from the .wav files, but without the setting that declares the source kind as wav.
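On the original question of header and sample size: the HTK book describes the header as 12 bytes (nSamples and sampPeriod as 4-byte integers, then sampSize and parmKind as 2-byte integers), with sampSize giving the bytes per sample. Below is a rough Python sketch of cutting the file by hand on that basis. It is not something HTK ships; the function name cut_htk is made up for illustration, and it assumes big-endian byte order (the HTK default) and a file without the compression (_C) or checksum (_K) qualifiers.

```python
import struct

# HTK header: nSamples (int32), sampPeriod in 100 ns units (int32),
# sampSize in bytes (int16), parmKind (int16). Big-endian is assumed here.
HEADER = struct.Struct(">iihh")
COMPRESSED = 0o2000   # _C qualifier bit in parmKind
HAS_CRC = 0o10000     # _K qualifier bit in parmKind


def cut_htk(data: bytes, start_s: float, end_s: float) -> bytes:
    """Return a new HTK parameter file holding the frames in [start_s, end_s)."""
    n_samples, samp_period, samp_size, parm_kind = HEADER.unpack_from(data, 0)
    if parm_kind & (COMPRESSED | HAS_CRC):
        # Assumption of this sketch: plain, uncompressed data without checksum.
        raise ValueError("compressed or checksummed files are not handled")

    # Convert seconds to frame indices (1 second = 1e7 units of 100 ns).
    last = min(int(round(end_s * 1e7 / samp_period)), n_samples)
    first = min(max(int(round(start_s * 1e7 / samp_period)), 0), last)

    begin = HEADER.size + first * samp_size
    end = HEADER.size + last * samp_size
    new_header = HEADER.pack(last - first, samp_period, samp_size, parm_kind)
    return new_header + data[begin:end]
```

With this, the third minute of the audio would be cut_htk(data, 120, 180), where data is the whole file read in binary mode. HCopy remains the safer route, since it also handles compressed files and checksums.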
Upvotes: 2