portsample

Reputation: 2112

Speech-to-text number capture

Is there a method for capturing spoken numbers using CMUSphinx?

Poking around in the cmudict-en-us.dict file I find the following,

forty F AO R T IY
forty-five F AO R T IY F AY V
fifty F IH F T IY
eighty EY T IY

Rather than having Sphinx respond with "forty, forty-five, fifty, eighty", is it possible to create a dictionary like this,

40 F AO R T IY
45 F AO R T IY F AY V
50 F IH F T IY
80 EY T IY

so that Arabic numerals are returned, i.e. 40, 45, 50, 80? Is there such a dictionary already? Thanks.

Upvotes: 2

Views: 114

Answers (1)

Nikolay Shmyrev

Reputation: 25220

It is possible to create a dictionary like this, but it is not really recommended. It is better to recognize numbers as words and then write post-processing code to turn them into actual numbers. The reason is that a user can say the same number in various ways, such as:

  • eight seven
  • eighty seven
  • a hundred and thirty five
  • one three five
  • one thirty [big pause] five

There are too many variants to handle in the recognizer. Once you have recognized the string, you can use something like Duckling to convert it to an action. If Duckling is too complex for you, you can use simple regexes or Python code like that in "Is there a way to convert number words to Integers?"
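To illustrate the post-processing idea, here is a minimal sketch in plain Python. It is not Duckling and not the code from the linked question; the function name `words_to_int` and the lookup tables are hypothetical, and it assumes the recognizer returns lowercase English number words. It handles ordinary forms ("eighty seven", "a hundred and thirty five") and digit-by-digit readings ("one three five"), but not mixed readings such as "one thirty five", which would need extra rules.

```python
# Lookup tables for English number words (illustrative, not from any library).
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1000, "million": 1000000}


def words_to_int(text):
    """Convert recognized number words to an int (hypothetical helper)."""
    # Split hyphenated words ("forty-five") and drop the filler word "and".
    tokens = [t for t in text.lower().replace("-", " ").split() if t != "and"]
    if not tokens:
        raise ValueError("no number words found")

    # Digit-by-digit reading: every token is a single digit, e.g. "one three five".
    if len(tokens) > 1 and all(UNITS.get(t, 10) < 10 for t in tokens):
        return int("".join(str(UNITS[t]) for t in tokens))

    total, current = 0, 0
    for t in tokens:
        if t in ("a", "an"):          # "a hundred" means "one hundred"
            current = 1
        elif t in UNITS:
            current += UNITS[t]
        elif t in TENS:
            current += TENS[t]
        elif t == "hundred":
            current = (current or 1) * 100
        elif t in SCALES:             # close the current group, e.g. "two thousand"
            total += (current or 1) * SCALES[t]
            current = 0
        else:
            raise ValueError("unknown number word: " + t)
    return total + current


if __name__ == "__main__":
    for phrase in ["eight seven", "eighty seven",
                   "a hundred and thirty five", "one three five"]:
        print(phrase, "->", words_to_int(phrase))
```

With this sketch, "eight seven" and "eighty seven" both give 87, and "a hundred and thirty five" and "one three five" both give 135. Pauses and mixed readings are exactly where a dedicated tool such as Duckling becomes worthwhile.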

Upvotes: 1
