Bily

Reputation: 801

Does CMU Sphinx4 support non-English speech recognition?

I know that Sphinx 3 (now called Pocketsphinx) supports speech recognition for non-English languages such as German, Spanish, and Chinese. Does Sphinx 4 support those languages too?

Speech recognition needs three files: an acoustic model file, a language model file, and a dictionary file. However, Sphinx 4 appears to read only ASCII-encoded files, while the dictionaries and language models for some non-English languages are encoded in UTF-8.

It seems CMU Sphinx 4 supports only ASCII-encoded language files by default. Is that true?

Any help would be appreciated!

Upvotes: 1

Views: 890

Answers (1)

Nikolay Shmyrev

Reputation: 25220

"It seems CMU sphinx 4 can only support ASCII encoded language by default. Is it true?"

No, sphinx4 supports UTF-8 encoded files. To make sure that Java uses UTF-8 for input and output, add this option to the java command line (or to the JVM options in your IDE):

   -Dfile.encoding=utf-8
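To check that the option took effect, you can print the JVM's default charset and try decoding the first line of your dictionary. The following is only a rough sketch, not part of sphinx4: the class name EncodingCheck and its helper methods are made up for illustration, and the dictionary path is whatever you pass on the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not a sphinx4 class.
public class EncodingCheck {

    // True when the JVM default charset is UTF-8,
    // e.g. after starting java with -Dfile.encoding=utf-8.
    public static boolean defaultIsUtf8() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    // Reads the first line of a file, decoding it explicitly as UTF-8
    // regardless of the JVM default charset.
    public static String firstLineUtf8(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("Default is UTF-8: " + defaultIsUtf8());
        if (args.length > 0) {
            // args[0] is assumed to be the path to your dictionary file.
            System.out.println("First entry: " + firstLineUtf8(Paths.get(args[0])));
        }
    }
}
```

If the first dictionary entry prints correctly and the default charset is reported as UTF-8, the files should be read the same way inside the recognizer. Note that the console itself must also be able to display UTF-8 for the printed entry to look right.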

Upvotes: 1
