Wso

Reputation: 302

How to change the dictionary used by Python pocketsphinx

I am trying to make pocketsphinx's live speech recognition more accurate, given that I will only say a few select words. I searched online and it seems that I should be able to create my own dictionary using the tool at this website: http://www.speech.cs.cmu.edu/tools/lmtool-new.html That seems to have worked, but I cannot find out what to do with the files once I have created them. From the Python pocketsphinx page: https://pypi.python.org/pypi/pocketsphinx it seems I should be able to set a new dictionary for the live speech recognizer like this:

import os
from pocketsphinx import LiveSpeech, get_model_path

model_path = get_model_path()

speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)

However, I am unclear about what exactly to change in this code to load my own dictionary data. I tried replacing the dic path with the path to the dictionary I downloaded from the website, but that gave an error:

RuntimeError: new_Decoder returned -1

What do I need to change in this code to get pocketsphinx to use my dictionary?

Upvotes: 0

Views: 3159

Answers (1)

Nikolay Shmyrev

Reputation: 25220

The tool should give you something like 8569.lm (the language model) and 8569.dic (the dictionary). Put them somewhere on your filesystem and use them like this:

import os
from pocketsphinx import LiveSpeech, get_model_path

model_path = get_model_path()

speech = LiveSpeech(
    sampling_rate=16000,
    hmm=os.path.join(model_path, 'en-us'),
    lm='/home/user/8569.lm',
    dic='/home/user/8569.dic'
)

for phrase in speech:
    print(phrase)

Note that both lm and dic point to the generated files, not just the dictionary. As long as you specify the correct filesystem paths to both files, it will work.
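If you want to rule out a bad path before the decoder starts, you can check the files yourself first. This is only a minimal sketch: checked_model_files is a hypothetical helper name, not part of pocketsphinx, and the /home/user/8569.* paths are the same placeholders as above.

```python
import os


def checked_model_files(lm_path, dic_path):
    """Return lm/dic keyword arguments, failing early if a file is missing.

    Hypothetical helper for illustration; it is not a pocketsphinx API.
    """
    checked = {}
    for name, path in (("lm", lm_path), ("dic", dic_path)):
        # Expand "~" and make the path absolute so the error message is unambiguous.
        full = os.path.abspath(os.path.expanduser(path))
        if not os.path.isfile(full):
            raise FileNotFoundError(f"{name} file not found: {full}")
        checked[name] = full
    return checked


# Intended use with the snippet above (requires pocketsphinx to be installed):
#
# from pocketsphinx import LiveSpeech, get_model_path
# speech = LiveSpeech(
#     sampling_rate=16000,
#     hmm=os.path.join(get_model_path(), 'en-us'),
#     **checked_model_files('/home/user/8569.lm', '/home/user/8569.dic')
# )
```

With this, a wrong path shows up as a FileNotFoundError that names the missing file, so you can fix the path before creating the recognizer.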

Upvotes: 2
