How can I configure Spanish in pocketsphinx with python?

Question

I'm trying to use Pocketsphinx for speech recognition on 32-bit Ubuntu with Python 2.7.

I'm a native Spanish speaker and I want to use a Spanish model, but there is little information available and I have limited knowledge of this area. I have not been able to find a simple set of installation steps.

Nikolay Shmyrev · Accepted Answer

Record a sample file hola.wav in 16 kHz, 16-bit, mono format.
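If you are not sure the recording has the right format, a minimal sketch like the following can check it with Python's standard wave module. The function name check_wav_format is hypothetical and not part of pocketsphinx; it only compares the header fields against the 16 kHz, 16-bit, mono format mentioned above.

```python
import contextlib
import wave


def check_wav_format(path, rate=16000, sample_width=2, channels=1):
    """Return a list of mismatches between a WAV file and the expected format.

    sample_width is in bytes, so 2 means 16-bit samples.
    """
    problems = []
    # contextlib.closing keeps this working on Python 2.7 as well as 3.x.
    with contextlib.closing(wave.open(path, 'rb')) as wav:
        if wav.getframerate() != rate:
            problems.append('sample rate is %d Hz, expected %d Hz'
                            % (wav.getframerate(), rate))
        if wav.getsampwidth() != sample_width:
            problems.append('sample width is %d bytes, expected %d bytes'
                            % (wav.getsampwidth(), sample_width))
        if wav.getnchannels() != channels:
            problems.append('file has %d channels, expected %d'
                            % (wav.getnchannels(), channels))
    return problems
```

Calling check_wav_format('hola.wav') should return an empty list when the file matches; otherwise each entry names one field that differs.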

Then install pocketsphinx-python:

sudo apt-get install -y python python-dev python-pip build-essential swig git
git clone --recursive https://github.com/cmusphinx/pocketsphinx-python
cd pocketsphinx-python
sudo python setup.py install

Then download the Spanish models from the CMUSphinx website.
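The script in this answer refers to the model by three relative paths: the acoustic model directory cmusphinx-es-5.2/model_parameters/voxforge_es_sphinx.cd_ptm_4000, the language model es-20k.lm.gz, and the dictionary es.dict. A wrong working directory is an easy mistake here, so a small check before creating the decoder may save some confusion. This is only a sketch; missing_model_files is a hypothetical helper, not a pocketsphinx function.

```python
import os


def missing_model_files(hmm_dir, lm_file, dict_file):
    """Return the configured model paths that do not exist on disk."""
    missing = []
    # The acoustic model (-hmm) is a directory, the other two are files.
    if not os.path.isdir(hmm_dir):
        missing.append(hmm_dir)
    for model_file in (lm_file, dict_file):
        if not os.path.isfile(model_file):
            missing.append(model_file)
    return missing
```

An empty result means all three paths resolve from the current directory; anything else lists the paths to fix or to replace with absolute paths.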

Then write a script and run it. It should look like this:

#!/usr/bin/env python
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

# Here is the configuration for Spanish:
# -hmm is the acoustic model directory, -lm the language model,
# -dict the pronunciation dictionary.
config = Decoder.default_config()
config.set_string('-hmm', 'cmusphinx-es-5.2/model_parameters/voxforge_es_sphinx.cd_ptm_4000')
config.set_string('-lm', 'es-20k.lm.gz')
config.set_string('-dict', 'es.dict')
decoder = Decoder(config)

# Decode streaming data: feed the file to the decoder in 1024-byte chunks.
decoder.start_utt()
with open('hola.wav', 'rb') as stream:
    while True:
        buf = stream.read(1024)
        if not buf:
            break
        decoder.process_raw(buf, False, False)
decoder.end_utt()

# Formatting the list into one string prints the same on Python 2 and 3.
print('Best hypothesis segments: %s' % [seg.word for seg in decoder.seg()])
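Note that the loop above reads hola.wav from the first byte, so the WAV header is passed to the decoder as if it were audio. The header is short, so this may not matter in practice, but if you prefer to feed only the samples, a sketch using the standard wave module could look like this. The generator name read_pcm_chunks is hypothetical; 512 frames of 16-bit mono audio is the same 1024 bytes per chunk as in the script.

```python
import contextlib
import wave


def read_pcm_chunks(path, frames_per_chunk=512):
    """Yield raw PCM bytes from a WAV file, skipping the file header."""
    # contextlib.closing keeps this working on Python 2.7 as well as 3.x.
    with contextlib.closing(wave.open(path, 'rb')) as wav:
        while True:
            buf = wav.readframes(frames_per_chunk)
            if not buf:
                break
            yield buf

# Possible use with the decoder from the script (assumes it is already set up):
# decoder.start_utt()
# for buf in read_pcm_chunks('hola.wav'):
#     decoder.process_raw(buf, False, False)
# decoder.end_utt()
```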

To learn more about CMUSphinx, read the tutorial.
