Reputation: 45
My code:
private void button1_Click(object sender, EventArgs e)
{
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        textBox1.Clear();
        var img = new Bitmap(openFileDialog1.FileName);
        //var ocr = new TesseractEngine("./tessdata", "eng", EngineMode.TesseractAndCube);
        var ocr = new TesseractEngine("./rus", "rus", EngineMode.TesseractAndCube);
        var page = ocr.Process(img);
        textBox1.Text = page.GetText();
    }
}
The code works fine with the English trained data, but it throws an error when I switch to Russian.
Here is the error:
Tesseract.TesseractException: "Failed to initialise tesseract engine.. See https://github.com/charlesw/tesseract/wiki/Error-1 for details."
My Tesseract version is 3.0.2.
I downloaded the Russian tessdata files from https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302
Upvotes: 1
Views: 4182
Reputation: 1
I can confirm the problem. Tesseract runs with a single language (I've tried bul.traineddata), but "rus" always gives this result in logcat:
Could not initialize Tesseract API with language=rus!
Of course, I had the rus.traineddata file in assets :-)
Upvotes: 0
Reputation: 137
This works for me:
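// Imports assume the Tess4J wrapper
import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

Tesseract tesseract = new Tesseract();
tesseract.setLanguage("rus");
try {
    // Directory that holds rus.traineddata
    tesseract.setDatapath("/home/test/tessdata");
    String text = tesseract.doOCR(new File("/home/test/Pictures/photo.jpg"));
    System.out.print(text);
} catch (TesseractException e) {
    e.printStackTrace();
}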
Trained data (tessdata): https://github.com/tesseract-ocr/tessdata
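One possible cause of an initialisation failure is that the engine cannot find the language file at the data path it was given, so it may be worth checking that before creating the engine. Below is a rough sketch of such a check using only the standard library. The class and method names (TrainedDataCheck, missingLanguages) are made up for illustration, and it assumes the data path is the directory that directly contains rus.traineddata, which may not hold for every wrapper or Tesseract version.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Tesseract wrapper.
class TrainedDataCheck {

    // Returns the language codes whose <lang>.traineddata file
    // is not present as a regular file directly inside dataDir.
    static List<String> missingLanguages(Path dataDir, String... langs) {
        List<String> missing = new ArrayList<>();
        for (String lang : langs) {
            Path file = dataDir.resolve(lang + ".traineddata");
            if (!Files.isRegularFile(file)) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Path taken from the snippet above; adjust it to your own setup.
        Path dataDir = Paths.get("/home/test/tessdata");
        List<String> missing = missingLanguages(dataDir, "rus");
        if (missing.isEmpty()) {
            System.out.println("Found rus.traineddata in " + dataDir);
        } else {
            System.out.println("Missing trained data for " + missing + " in " + dataDir);
        }
    }
}
```

If the file is reported missing, compare the directory you pass as the data path with where the downloaded file actually ended up. If it is found and initialisation still fails, the cause is likely something else, for example a data file that does not match the installed Tesseract version.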
Upvotes: 2