Kunal

Reputation: 51

Pocketsphinx setKeywordThreshold() issue

I am considering Pocketsphinx for offline speech recognition in my app, but its documentation is not clear. Answers to the following questions would help me a lot.

  1. What does the setKeywordThreshold(1e-5f) method do? What are the minimum and maximum values it accepts?

  2. I want to support several languages, and I found built-in acoustic models for some of them at http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/, but because of the lack of documentation I can't tell which model is best for which language. Can anybody suggest the best built-in acoustic model for each of the following languages?

    (a). Australian English (b). American English (c). British English (d). Canadian English (e). European English (f). Indian English (g). Irish English (h). New Zealand English (i). South African English (j). Russian (k). Spanish (l). French (m). Dutch (n). German

  3. I only want to recognize the numbers from 1 to 200 in each language. What is the best way to do this?

  4. I created a digits.gram file to recognize the numbers from 1 to 99, but it also reacts to background noise. For example, when a drill is running in the background, the noise is recognized as "one". How can I make it report a number only when that number is actually spoken?

digits.gram file

#JSGF V1.0;

grammar digits;

<single> = one | two | three | four | five | six | seven | eight | nine ;
<digit> = <single> |
          zero  |
          ten   |
          eleven |
          twelve |
          thirteen |
          fourteen |
          fifteen |
          sixteen |
          seventeen |
          eighteen |
          nineteen |
          twenty |
          thirty |
          forty |
          fifty |
          sixty |
          seventy |
          eighty |
          ninety |
          twenty <single> |
          thirty <single> |
          forty <single> |
          fifty <single> |
          sixty <single> |
          seventy <single> |
          eighty <single> |
          ninety <single> ;

Upvotes: 4

Views: 843

Answers (1)

Ievgen

Reputation: 4443

The best way to solve problem 4 is to add a keyword that starts the recognition. With a keyword, you can assume that the user knows how to use your system and will say "hello, Pocketsphinx" before the real command.

So you can try the following:

  • Use a keyword.
  • Filter the output by the confidence that the decoder should return.
  • Add a few more common words to your dictionary as a fallback, so that Pocketsphinx matches them instead of an entry from your "correct" list. This may increase accuracy, but it can also make things worse, so you should experiment to find what works best for your scenario.
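
The three suggestions above can be combined into one small gate in application code. The following is only a rough sketch of that logic, not Pocketsphinx code: the class DigitGate, its method offer, the 0.0 to 1.0 confidence scale, and the 0.5 cutoff are all assumptions made for illustration. How the recognized text and a confidence value are obtained from the decoder is not shown here.

```java
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only. DigitGate and its parameters are hypothetical and
// are not part of the Pocketsphinx API. The text and confidence passed to
// offer() are assumed to come from whatever result the decoder reports.
final class DigitGate {
    private final String wakePhrase;     // phrase that arms the gate
    private final Set<String> accepted;  // the "correct" list, in lower case
    private final double minConfidence;  // assumed scale: 0.0 (worst) to 1.0 (best)
    private boolean armed = false;

    DigitGate(String wakePhrase, Set<String> accepted, double minConfidence) {
        this.wakePhrase = normalize(wakePhrase);
        this.accepted = accepted;
        this.minConfidence = minConfidence;
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the accepted text, or null when the result should be ignored.
    String offer(String text, double confidence) {
        String t = normalize(text);
        if (!armed) {
            // Before the wake phrase, every other result (noise included) is dropped.
            if (t.equals(wakePhrase)) {
                armed = true;
            }
            return null;
        }
        // Armed: drop low-confidence results and fallback words, but stay armed.
        if (confidence < minConfidence || !accepted.contains(t)) {
            return null;
        }
        armed = false; // one command per wake phrase
        return t;
    }

    public static void main(String[] args) {
        DigitGate gate = new DigitGate("hello pocketsphinx", Set.of("one", "two"), 0.5);
        System.out.println(gate.offer("one", 0.9));                // not armed yet
        System.out.println(gate.offer("hello pocketsphinx", 0.9)); // arms the gate
        System.out.println(gate.offer("um", 0.9));                 // fallback word
        System.out.println(gate.offer("one", 0.9));                // accepted command
    }
}
```

In this sketch a fallback word such as "um" is something the recognizer may return but the gate never accepts, which is one way to use the extra dictionary words. The gate stays armed after a rejected result so that the user can repeat the number without saying the keyword again; whether that is the right behavior depends on the app.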

Upvotes: 0
