Adnan Ali

Reputation: 3055

Small Data training in CMU Sphinx

I have installed sphinxbase, sphinxtrain and pocketsphinx on Linux (Ubuntu). Now I am trying to train a model using a speech corpus, transcriptions, dictionary, etc. obtained from VoxForge. (The data in my etc and wav folders comes from VoxForge.)

As I am new to this, I just want to train on a small set and get some results from a few transcript lines and a few wav files, say 10 wav files and the 10 transcript lines corresponding to them, like the person is doing in this video. But when I run sphinxtrain, I get an error.

Estimated Total Hours Training: 0.07021431623931
    This is a small amount of data, no comment at this time

I could set CFG_CD_TRAIN = no, but I don't know what that means.

What changes do I need to make to get rid of this error?

PS: I cannot add more data because I want to see some results first, to better understand the whole process.

Upvotes: 1

Views: 634

Answers (1)

Dariusz

Reputation: 22271

Not enough data for the training, we can only train CI models

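That message means the data set is too small for context-dependent (CD) training: you need at least 30 minutes of audio to train CD models, and with less you can only train context-independent (CI) models. Either add more audio, or set CFG_CD_TRAIN to "no" in etc/sphinx_train.cfg so that only CI models are trained.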
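
To see how far a data set is from the 30-minute threshold before running the trainer, you can do a rough check yourself. The following is only a sketch: it assumes the audio is plain WAV and estimates duration from the file headers, whereas sphinxtrain computes its own estimate and may report a slightly different number. The names MIN_HOURS, total_hours and meets_threshold are made up for this example and are not part of sphinxtrain.

```python
import wave
from pathlib import Path

# Assumption: the 30-minute figure quoted in this answer, expressed in hours.
MIN_HOURS = 0.5


def wav_hours(path):
    """Return the duration of a single WAV file in hours."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate() / 3600.0


def total_hours(wav_dir):
    """Sum the durations of all .wav files under wav_dir (recursively)."""
    return sum(wav_hours(p) for p in Path(wav_dir).rglob("*.wav"))


def meets_threshold(hours, minimum=MIN_HOURS):
    """True if the amount of audio reaches the minimum number of hours."""
    return hours >= minimum


if __name__ == "__main__":
    # Value taken from the "Estimated Total Hours Training" line in the question.
    hours = 0.07021431623931
    print(f"{hours * 60:.1f} minutes of audio")
    print("meets 30-minute threshold:", meets_threshold(hours))
```

For the figure in the question this comes to roughly 4 minutes, well under the threshold. To check your own data, call total_hours on your wav folder instead of using the hard-coded value.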

Upvotes: 2
