Improving accuracy of speech recognition using Vosk (Kaldi) running on Android

Question

I am developing an Android application for collecting data in the field using speech recognition. It recognizes five "target words" as well as several numbers (zero, one, ten, one hundred, etc.).

I have improved the accuracy of the target words by adding homophones (phrases the recognizer returns when it mishears a word) as well as vernacular synonyms. The target words are Chinook, sockeye, coho, pink, and chum. This is the relevant code:

    public void parseWords() {
    List<String> szlNumbers = Arrays.asList("ONE", "TEN", "ONE HUNDRED", "ONE THOUSAND", "TEN THOUSAND");
    //species with homophones and vernacular names
    List<String> szlChinook = Arrays.asList("CHINOOK", "CHINOOK SALMON", "KING", "KINGS", "KING SALMON", "KING SALMAN");
    List<String> szlSockeye = Arrays.asList("SOCKEYE", "SOCCER", "SOCKEYE SALMON", "SOCK ICE", "SOCCER ICE", "SOCK I SAID", "SOCCER IS", "OKAY SALMON", "RED SALMON", "READ SALMON", "RED", "REDS");
    List<String> szlCoho = Arrays.asList("COHO", "COHO SALMON", "COVER SALMON", "SILVER SALMON", "SILVER", "SILVERS", "CO", "KOBO", "GO HOME", "COMO", "COVER", "GO");
    List<String> szlPink = Arrays.asList("PINK", "A PINK", "PINKS", "PINK SALMON", "HANK SALMON", "EXAMINE", "HUMPY", "HOBBY", "HUMPIES", "HUM BE", "HUM P", "BE", "HUMPTY", "HOBBIES", "HUMVEE", "THE HUMVEES", "POMPEY");
    List<String> szlChum = Arrays.asList("CHUM", "JOHN", "JUMP", "SHARMA", "CHARM", "COME", "CHARM SALMON", "COME SALMON", "CHUM SALMON", "JUMP SALMON", "TRUMP SALMON", "KETA SALMON", "KETA", "DOG", "DOGS", "DOG SALMON", "GATOR", "GATORS", "CALICO", "A CALICO");

    //Collections.sort(szlChinook); //what is this?
    //ignore null or blank results; the null check must come before toUpperCase()
    if (szVoskOutput == null || szVoskOutput.isEmpty()) {
        return;
    }
    szVoskOutput = szVoskOutput.toUpperCase();
    //pink
    if (szlPink.contains(szVoskOutput)) {
        szSpecies = "Pink";
        populateSpecies();
        return;
    }
    //chum
    if (szlChum.contains(szVoskOutput)) {
        szSpecies = "Chum";
        populateSpecies();
        return;
    }
    //sockeye
    if (szlSockeye.contains(szVoskOutput)) {
        szSpecies = "Sockeye";
        populateSpecies();
        return;
    }
    //coho
    if (szlCoho.contains(szVoskOutput)) {
        szSpecies = "Coho";
        populateSpecies();
        return;
    }
    //Chinook
    if (szlChinook.contains(szVoskOutput)) {
        szSpecies = "Chinook";
        populateSpecies();
        return;
    }
    if (szlNumbers.contains(szVoskOutput)) {//then this is a number, put in count txt box
        tvCount.setText(szVoskOutput);
        return;
    } else {
        Toast.makeText(this, "Please repeat clearly. Captured string is: " + szVoskOutput, Toast.LENGTH_SHORT).show();
    }
}//end parseWords()
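
For comparison, here is a minimal sketch of the same matching done with a single map lookup instead of one list per species. The class and method names (`SpeciesLookup`, `register`, `classify`) are hypothetical and not part of the app, and the phrase lists are shortened for readability. This only tidies the matching step; it is not expected to change how long the recognizer itself takes.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not taken from the app: maps every accepted phrase to its species.
final class SpeciesLookup {
    private static final Map<String, String> PHRASE_TO_SPECIES = new HashMap<>();

    // Registers each phrase (already upper case) under the given species name.
    private static void register(String species, String... phrases) {
        for (String phrase : phrases) {
            PHRASE_TO_SPECIES.put(phrase, species);
        }
    }

    static {
        // Shortened lists; the full lists from parseWords() would go here.
        register("Chinook", "CHINOOK", "CHINOOK SALMON", "KING", "KINGS", "KING SALMON");
        register("Sockeye", "SOCKEYE", "SOCKEYE SALMON", "SOCK ICE", "RED SALMON", "RED", "REDS");
        register("Coho", "COHO", "COHO SALMON", "SILVER SALMON", "SILVER", "SILVERS");
        register("Pink", "PINK", "PINKS", "PINK SALMON", "HUMPY", "HUMPIES");
        register("Chum", "CHUM", "CHUM SALMON", "KETA", "DOG", "DOGS", "DOG SALMON");
    }

    // Returns the species for a recognized phrase, or null if the input is
    // null, blank, or not one of the registered phrases.
    static String classify(String voskOutput) {
        if (voskOutput == null) {
            return null;
        }
        String key = voskOutput.trim().toUpperCase(Locale.ROOT);
        if (key.isEmpty()) {
            return null;
        }
        return PHRASE_TO_SPECIES.get(key);
    }
}
```

In `parseWords()` this would reduce the five species checks to one call such as `String species = SpeciesLookup.classify(szVoskOutput);` followed by a null test.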

I have a streamlined version of the application with source code on GitHub: https://github.com/portsample/salmonTalkerLite as well as the latest full version on Google Play: https://play.google.com/store/apps/details?id=net.blepsias.salmontalker

With the target words and their homophones, I get hits in four to five seconds. I would like to make this faster. What can I do to tune further for speed?
