user1910524

Reputation: 113

Tessnet2 using Tesseract Engine - Why does it give very bad output?

I am trying to use Tessnet2, the .NET wrapper around the Tesseract engine, in C#. For many of the test images I give it, the output is very bad and almost nothing is recognized correctly.

This is my code, from Program.cs in a C# console project:

static void Main(string[] args)
{
    try
    {
        Bitmap image = new Bitmap(@"C:\Users\hp\Desktop\eurotext.tif");
        var ocr = new Tesseract();

        // When I tried adding SetVariable(...), it didn't change the output much.

        ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", true);

        var result = ocr.DoOCR(image, Rectangle.Empty);
        foreach (Word word in result)
            Console.WriteLine("{0} : {1}", word.Confidence, word.Text);

        Console.ReadLine();
    }
    catch (Exception exception)
    {
        Console.WriteLine("Error");
    }
}

For example, this is a sample test image, "eurotext.tif" (large, binary, 300 dpi): [image]

And this is the Tessnet2 output for this image: [image]

I have been following this guide to learn how to use Tessnet2: https://code.msdn.microsoft.com/windowsdesktop/How-to-use-Tessnet2-library-716be12f

I also tried the SetVariable(...) settings described on this site, but they made little difference to the output: http://www.sk-spell.sk.cx/tesseract-ocr-en

I also found the Tesseract guidelines for improving recognition quality: http://code.google.com/p/tesseract-ocr/wiki/ImproveQuality

I have searched widely for a way to increase the accuracy and found many posts from people with similar problems, but none with a working solution.

What could be the reason for this problem? How can I solve it?

I am a beginner in this topic, so please bear with me if the solution is too trivial.

Thanks!

Upvotes: 5

Views: 6852

Answers (1)

SimpleCodeer

Reputation: 46

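To get the text to display, set the third argument of Init to false. That argument is Tessnet2's numericMode flag: when it is true, the engine tries to read everything as digits, which garbles ordinary text. Change: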

ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", true);

to:

ocr.Init(@"C:\Program Files (x86)\Tesseract-OCR", "eng", false);

Upvotes: 3
