Reputation: 2112
I am developing an application to collect data in the field on Android devices using speech recognition. It recognizes five "target words", as well as several numbers (zero, one, ten, one hundred, etc.).
I have improved the accuracy of the target words by adding homonyms (homophones) as well as vernacular synonyms. The target words are Chinook, sockeye, coho, pink, and chum. This is the relevant code:
public void parseWords() {
List<String> szlNumbers = Arrays.asList("ONE", "TEN", "ONE HUNDRED", "ONE THOUSAND", "TEN THOUSAND");
//species with homophones and vernacular names
List<String> szlChinook = Arrays.asList("CHINOOK", "CHINOOK SALMON", "KING", "KINGS", "KING SALMON", "KING SALMAN");
List<String> szlSockeye = Arrays.asList("SOCKEYE", "SOCCER", "SOCKEYE SALMON", "SOCK ICE", "SOCCER ICE", "SOCK I SAID", "SOCCER IS", "OKAY SALMON", "RED SALMON", "READ SALMON", "RED", "REDS");
List<String> szlCoho = Arrays.asList("COHO", "COHO SALMON", "COVER SALMON", "SILVER SALMON", "SILVER", "SILVERS", "CO", "KOBO", "GO HOME", "COMO", "COVER", "GO");
List<String> szlPink = Arrays.asList("PINK", "A PINK", "PINKS", "PINK SALMON", "HANK SALMON", "EXAMINE", "HUMPY", "HOBBY", "HUMPIES", "HUM BE", "HUM P", "BE", "HUMPTY", "HOBBIES", "HUMVEE", "THE HUMVEES", "POMPEY");
List<String> szlChum = Arrays.asList("CHUM", "JOHN", "JUMP", "SHARMA", "CHARM", "COME", "CHARM SALMON", "COME SALMON", "CHUM SALMON", "JUMP SALMON", "TRUMP SALMON", "KETA SALMON", "KETA", "DOG", "DOGS", "DOG SALMON", "GATOR", "GATORS", "CALICO", "A CALICO");
//Collections.sort(szlChinook); //what is this?
//check for null before calling any method on the string, otherwise toUpperCase() throws a NullPointerException
if (szVoskOutput == null || szVoskOutput.isEmpty()) {
//do nothing, this is a null or blank string
return;
}
szVoskOutput = szVoskOutput.toUpperCase();
//pink
if (szlPink.contains(szVoskOutput)) {
szSpecies = "Pink";
populateSpecies();
return;
}
//chum
if (szlChum.contains(szVoskOutput)) {
szSpecies = "Chum";
populateSpecies();
return;
}
//sockeye
if (szlSockeye.contains(szVoskOutput)) {
szSpecies = "Sockeye";
populateSpecies();
return;
}
//coho
if (szlCoho.contains(szVoskOutput)) {
szSpecies = "Coho";
populateSpecies();
return;
}
//Chinook
if (szlChinook.contains(szVoskOutput)) {
szSpecies = "Chinook";
populateSpecies();
return;
}
if(szlNumbers.contains(szVoskOutput)) {//then this is a number, put in count txt box
tvCount.setText(szVoskOutput);
return;
}else{
Toast.makeText(this, "Please repeat clearly. Captured string is:" + szVoskOutput, Toast.LENGTH_SHORT).show();
}
}//end parseWords()
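The chain of contains() checks above can also be written as a single map lookup. This is only a minimal sketch, assuming each phrase belongs to exactly one species. The names SpeciesLookup, register, and lookup are hypothetical, and the phrase lists are abbreviated from the ones above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps every accepted phrase to its canonical species name.
class SpeciesLookup {
    private static final Map<String, String> PHRASE_TO_SPECIES = new HashMap<>();

    // Registers all phrases for one species; a phrase registered twice keeps the last species.
    private static void register(String species, List<String> phrases) {
        for (String phrase : phrases) {
            PHRASE_TO_SPECIES.put(phrase, species);
        }
    }

    static {
        // Abbreviated lists; the full lists from parseWords() would go here.
        register("Chinook", Arrays.asList("CHINOOK", "KING", "KING SALMON"));
        register("Sockeye", Arrays.asList("SOCKEYE", "SOCK ICE", "RED SALMON"));
        register("Coho", Arrays.asList("COHO", "SILVER", "SILVER SALMON"));
        register("Pink", Arrays.asList("PINK", "HUMPY", "HUMPIES"));
        register("Chum", Arrays.asList("CHUM", "KETA", "DOG SALMON"));
    }

    // Returns the species for a recognized phrase, or null when nothing matches.
    static String lookup(String recognized) {
        if (recognized == null) {
            return null;
        }
        String key = recognized.trim().toUpperCase(Locale.ROOT);
        if (key.isEmpty()) {
            return null;
        }
        return PHRASE_TO_SPECIES.get(key);
    }
}
```

With this in place, parseWords() would call SpeciesLookup.lookup(szVoskOutput) once and fall through to the number check when the result is null. This tidies the code but is unlikely to change recognition time, since the list lookups are not the slow part.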
I have a streamlined version of the application with source code on GitHub: https://github.com/portsample/salmonTalkerLite as well as the latest full version on Google Play: https://play.google.com/store/apps/details?id=net.blepsias.salmontalker
Using the target word and homonyms, I can get hits in four to five seconds. I would like to make this faster. What can I do to further tune for speed?
Upvotes: 0
Views: 3093
Reputation: 2112
Passing a restricted word list (a grammar) to the Recognizer helped significantly. Recognition time is now consistently about 1.5 seconds.
private void recognizeMicrophone() {
if (speechService != null) {
setUiState(iSTATE_DONE);
speechService.stop();
speechService = null;
} else {
setUiState(iSTATE_MIC);
try {
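//the grammar is a JSON array of phrases; the inner quotes must be escaped, and "[unk]" catches everything else
Recognizer rec = new Recognizer(model, 16000.f, "[\"sockeye pink coho chum chinook atlantic salmon\", \"[unk]\"]");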
speechService = new SpeechService(rec, 16000.0f);
speechService.startListening(this);
} catch (IOException e) {
setErrorState(e.getMessage());
}
}
}
This filters out extraneous Vosk output upstream, leaving only the specified target words, and it eliminates the need for the elaborate homonym-matching conditionals shown in the original post. Thanks to Nickolay Shmyrev for this. I am still looking for other ways to speed up recognition or otherwise improve this process.
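Since the grammar argument is a JSON array written as a Java string, it is easy to get the escaping wrong when typing it by hand. A minimal sketch of building it from a word list follows; GrammarBuilder and buildGrammar are hypothetical names, not part of Vosk, and the sketch assumes the phrases should be lower case to match the model vocabulary.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: turns a list of phrases into a JSON array string
// of the kind passed as the third Recognizer argument in the code above.
class GrammarBuilder {
    // Builds e.g. ["sockeye", "pink", "[unk]"]; "[unk]" is always appended last.
    static String buildGrammar(List<String> phrases) {
        StringBuilder sb = new StringBuilder("[");
        for (String phrase : phrases) {
            sb.append('"')
              .append(escape(phrase.toLowerCase(Locale.ROOT)))
              .append("\", ");
        }
        sb.append("\"[unk]\"]");
        return sb.toString();
    }

    // Escapes backslashes and double quotes so the phrase is a valid JSON string body.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

The result could then be passed in place of the literal, for example new Recognizer(model, 16000.f, GrammarBuilder.buildGrammar(words)), where words is whatever list of target words and numbers the app needs.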
Updates and improvements will be reflected in source code on GitHub: https://github.com/portsample/salmonTalkerLite
Upvotes: 0