user2674107

Reputation: 1

How to use another language with Tesseract in C# (Tesseract 3.02 + Emgu CV 2.4.9)

My 'tessdata' folder contains: eng.traineddata, eng.cube.bigrams, eng.cube.fold, eng.cube.lm, eng.cube.nn, eng.cube.params, eng.cube.size, eng.cube.word-freq, eng.tesseract_cube.nn

rus.traineddata, rus.cube.fold, rus.cube.lm, rus.cube.nn, rus.cube.params, rus.cube.size, rus.cube.word-freq

I don't have the 'rus.cube.bigrams' and 'rus.tesseract_cube.nn' files in the tessdata directory.

I get the error "Unable to create ocr model using Path 'tessdata' and language 'rus'" when I change 'eng' to 'rus' (or 'ita', for example) in this code:

private Tesseract _ocr;

public LicensePlateDetector(String dataPath)
{
    // create OCR engine
    _ocr = new Tesseract("tessdata", "rus", Tesseract.OcrEngineMode.OEM_CUBE_ONLY);
    _ocr.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ-1234567890");
}

What am I doing wrong?

Upvotes: 0

Views: 4143

Answers (1)

Oleksii Aza

Reputation: 5398

The error says that it can't find the rus language resources in the tessdata folder. Check that Copy to Output Directory is set to Copy always for the rus files. Also, I've just tried the Tesseract .NET wrapper, which has a more pleasant syntax:

using (var engine = new TesseractEngine(pathToLangFolder, "rus", EngineMode.Default))
{
    // have to load Pix via a bitmap since Pix doesn't support loading a stream.
    using (var image = new Bitmap(fileName))
    {
        using (var pix = PixConverter.ToPix(image))
        {
            using (var page = engine.Process(pix))
            {
                Console.WriteLine(page.GetMeanConfidence() + " : " + page.GetText());
            }
        }
    }
}
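
To rule out the first point (the rus files not reaching the output folder), you could check for the language file next to the executable before creating the engine. This is only a minimal sketch: the TessdataCheck class and its method names are made up for illustration and are not part of Emgu or the Tesseract wrapper, and it assumes the tessdata folder sits beside the built .exe.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Emgu CV or the Tesseract .NET wrapper).
static class TessdataCheck
{
    // Returns true if "<lang>.traineddata" exists in the given tessdata folder.
    public static bool HasLanguage(string tessdataDir, string lang)
    {
        return File.Exists(Path.Combine(tessdataDir, lang + ".traineddata"));
    }

    // Assumption: tessdata is copied next to the executable (e.g. bin\Debug\tessdata).
    public static string DefaultTessdataDir()
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "tessdata");
    }

    // Builds a short human-readable status line for one language.
    public static string Describe(string tessdataDir, string lang)
    {
        return HasLanguage(tessdataDir, lang)
            ? lang + ": found"
            : lang + ": MISSING in " + tessdataDir;
    }
}
```

For example, calling Console.WriteLine(TessdataCheck.Describe(TessdataCheck.DefaultTessdataDir(), "rus")) just before the new Tesseract(...) line shows whether rus.traineddata was actually copied to the build output. It only checks the .traineddata file, so it won't tell you anything about the cube files.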

Upvotes: 1
