Reputation: 11
I have a speaker verification task.
My task is to calculate the similarity between two speech recordings and then compare it with a threshold. For example, if the similarity score between the two recordings is 70% and the threshold is 50%, the speaker is considered to be the same person.
The speech is text-independent; it can be any conversation.
I have experience using MFCC and GMM for speaker recognition, but this task is different: I just need to compare the features of two recordings to get a similarity score. I don't know which features are good for speaker verification, or which algorithm can calculate a similarity score between two patterns.
I hope to get your advice.
Many thanks.
Upvotes: 1
Views: 418
Reputation: 1
I am also working on the TIMIT dataset for speaker verification. I have extracted MFCC features, trained a UBM on them, and adapted it for each speaker. For the adaptation I used diagonal covariance matrices. How are you testing the wav files? As for features, you can also use pitch and energy.
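For the testing step, a common way to score a GMM-UBM system is the average per-frame log-likelihood ratio between the adapted speaker model and the UBM. Below is a minimal sketch of that idea, not my actual code: the function names `diag_gmm_loglik` and `llr_score` are made up for illustration, each model is assumed to be a `(weights, means, variances)` tuple with diagonal covariances, and the frames are assumed to be MFCC vectors you have already extracted.

```python
import math


def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        # With a diagonal covariance, the Gaussian factorizes per dimension.
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))


def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio: speaker model versus UBM."""
    total = 0.0
    for frame in frames:
        total += diag_gmm_loglik(frame, *speaker_gmm) - diag_gmm_loglik(frame, *ubm)
    return total / len(frames)


# Toy models (hypothetical values): one component each, 2-dimensional features.
ubm = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
speaker = ([1.0], [[2.0, 2.0]], [[1.0, 1.0]])

score = llr_score([[2.0, 2.0]], speaker, ubm)
accepted = score > 0.0  # the decision threshold is an assumption; tune it on held-out trials
```

A positive score means the frames fit the claimed speaker's model better than the UBM. In practice the threshold would be chosen on a development set rather than fixed at zero.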
Upvotes: 0
Reputation: 25220
The state of the art these days is x-vectors:
Deep Neural Network Embeddings for Text-Independent Speaker Verification
Implementation in Kaldi is here.
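Once each recording has been mapped to a fixed-length embedding, the "similarity score versus threshold" part of the question can be as simple as cosine similarity. Here is a minimal sketch under some assumptions: the embeddings are plain lists of floats that you have already extracted, the helper names `cosine_similarity` and `same_speaker` are made up for illustration, and the 0.5 default only mirrors the 50% threshold in the question. Cosine scoring is a simpler stand-in here; as far as I know the paper and the Kaldi recipes score with PLDA instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the trial if the similarity reaches the threshold.

    The default threshold is a placeholder, not a recommended value.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy 3-dimensional embeddings (hypothetical values; real ones are much longer).
enrolled = [1.0, 0.0, 1.0]
test_same = [2.0, 0.0, 2.0]
test_other = [0.0, 1.0, 0.0]

match = same_speaker(enrolled, test_same)
mismatch = same_speaker(enrolled, test_other)
```

The threshold should be tuned on labelled trials, for example by picking the operating point where false accepts and false rejects balance.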
Upvotes: 1