Reputation: 161

How to implement multi-language models in Vosk?

I am wondering how to implement multi-language speech recognition in an application with the Vosk library. I want to build an application that supports several languages, such as Persian, Kurdish, and English, using Java with the Spring framework. I know that a single language can be loaded with Model model = new Model("path to model"), but how can this be done for multiple models?

Upvotes: 3

Answers (1)

Erik Hermansen

Reputation: 2369

How about creating and running two or more recognizers? (One for each language you want to detect.)

Pass the same audio buffer to each recognizer via AcceptWaveform (acceptWaveForm in the Java binding). Your application logic can then receive results from all of the recognizers. I imagine you'll occasionally have to deal with cross-language homonyms (e.g. English "nine" and German "nein"), where you want to ignore one match and use the other. But maybe the heuristics needed to pick one won't be hard for your app.
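As a rough illustration of that fan-out, here is a minimal sketch. LanguageRecognizer and MultiLanguageDispatcher are hypothetical names I made up, not Vosk classes; I am assuming each implementation would wrap one org.vosk.Recognizer built from its own Model, delegating to acceptWaveForm(byte[], int) and getResult(). The interface keeps the sketch runnable without the Vosk dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical abstraction over one per-language recognizer. With Vosk's Java
 * binding, an implementation would presumably wrap an org.vosk.Recognizer and
 * delegate to acceptWaveForm(byte[], int) and getResult().
 */
interface LanguageRecognizer {
    /** Feeds audio; returns true when the recognizer has a finished utterance. */
    boolean acceptWaveform(byte[] buffer, int length);

    /** Returns the result for the most recent utterance (Vosk returns a JSON string). */
    String result();
}

/** Passes every audio buffer to all registered recognizers. */
class MultiLanguageDispatcher {
    // LinkedHashMap keeps registration order, so results come back in a stable order.
    private final Map<String, LanguageRecognizer> recognizers = new LinkedHashMap<>();

    void register(String language, LanguageRecognizer recognizer) {
        recognizers.put(language, recognizer);
    }

    /**
     * Feeds the same buffer to each recognizer and collects the results of
     * those that report a finished utterance, keyed by language.
     */
    Map<String, String> feed(byte[] buffer, int length) {
        Map<String, String> finished = new LinkedHashMap<>();
        for (Map.Entry<String, LanguageRecognizer> entry : recognizers.entrySet()) {
            LanguageRecognizer recognizer = entry.getValue();
            if (recognizer.acceptWaveform(buffer, length)) {
                finished.put(entry.getKey(), recognizer.result());
            }
        }
        return finished;
    }
}
```

The map returned by feed is where your homonym heuristics would go: if more than one language reports a result for the same stretch of audio, the application decides which one to keep.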

Clearly, running multiple recognizers costs extra CPU and memory for every additional language, but maybe that's acceptable for your purposes. A further improvement might be to turn off the recognizers that aren't needed once you have detected enough speech in one language to predict that the speaker will continue in it.
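The "turn off the others" idea could be sketched like this. LanguageLock is a hypothetical helper, and both the word-count heuristic and the threshold are assumptions of mine rather than anything Vosk provides; a real app would probably also want to weigh recognition confidence, since a recognizer for the wrong language can still emit words.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Tracks how many words each language's recognizer has produced and locks onto
 * one language once it reaches a threshold. The threshold and the word-count
 * heuristic are illustrative assumptions, not Vosk features.
 */
class LanguageLock {
    private final int wordsToLock;
    private final Map<String, Integer> wordCounts = new HashMap<>();
    private String locked; // null until a language has been chosen

    LanguageLock(int wordsToLock) {
        this.wordsToLock = wordsToLock;
    }

    /** Records recognized text for a language; ignored once a language is locked. */
    void record(String language, String text) {
        if (locked != null || text == null || text.trim().isEmpty()) {
            return;
        }
        int words = text.trim().split("\\s+").length;
        int total = wordCounts.merge(language, words, Integer::sum);
        if (total >= wordsToLock) {
            locked = language;
        }
    }

    /** The language to keep running, if one has been chosen. */
    Optional<String> lockedLanguage() {
        return Optional.ofNullable(locked);
    }

    /** Whether the recognizer for this language should still receive audio. */
    boolean isActive(String language) {
        return locked == null || locked.equals(language);
    }
}
```

Before feeding a buffer to a recognizer, the app would check isActive(language) and skip (and eventually close) the ones that are no longer needed.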

If Vosk/Kaldi isn't thread-safe for multiple recognizer instances in one process, you could run multiple processes to isolate the recognizers with some kind of inter-process communication to manage the recognizers.

Upvotes: 3
