ravula mounika

Reputation: 11

Unable to read handwritten text from form using node-tesseract

I'm unable to read a form accurately using node-tesseract. Only the printed text on the form is recognized and returned correctly, whereas the handwritten text comes back as a string of special characters.

My code is:

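var tesseract = require('node-tesseract');

// Recognition options: language, page segmentation mode, and exec settings.
var options = {
    l: 'deu',
    psm: 6,
    env: {
        maxBuffer: 4096 * 4096
    }
};

// Run OCR on the form image and print the recognized text.
tesseract.process('./server/images/form.jpg', options, function (err, text) {
    if (err) {
        return console.log("An error occurred: ", err);
    }
    console.log("Recognized text:");
    console.log(text);
});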

Input: OWNER Brian Dude

Output: OW_NER ägga ] )ggé;= ‘

Here, OWNER is a text field on the form.

Upvotes: 1

Views: 1289

Answers (2)

akozlu

Reputation: 111

  1. Take a look at the following papers. Both are examples of using the Tesseract training process for handwriting recognition:

Tesseract Training for Handwritten Digit Recognition

Training Tesseract for Roman Font Handwriting

  2. Check out the official Tesseract Training page.

  3. The following link walks you through the training process; it helped me a lot: https://web.archive.org/web/20170820212334/http://www.resolveradiologic.com:80/blog/2013/01/15/training-tesseract

  4. Use a third-party GUI for Tesseract training; it will make your life much easier. I recommend tesseract4java and jTessBoxEditor (both work on OS X).
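
Once training has produced a .traineddata file, the recognizer still has to be pointed at it. The following is only a rough sketch of one way to do that from Node by calling the Tesseract command line directly. The language name 'hand', the './tessdata' directory, and the helper names buildTesseractArgs and recognize are all hypothetical, and the --psm spelling assumes Tesseract 4 or later (older releases may expect -psm).

```javascript
// Sketch: run the Tesseract CLI with a custom-trained language from Node.
// 'hand' and './tessdata' used in the usage note are hypothetical names.
const { execFile } = require('child_process');

// Build the argument list: tesseract <image> stdout -l <lang> --psm <n>
// "stdout" as the output base makes Tesseract print the text instead of
// writing a .txt file.
function buildTesseractArgs(imagePath, opts) {
  const args = [imagePath, 'stdout', '-l', opts.lang, '--psm', String(opts.psm)];
  if (opts.tessdataDir) {
    // Directory that contains <lang>.traineddata (assumed location).
    args.push('--tessdata-dir', opts.tessdataDir);
  }
  return args;
}

// Run Tesseract and hand the recognized text to a Node-style callback.
function recognize(imagePath, opts, callback) {
  execFile(
    'tesseract',
    buildTesseractArgs(imagePath, opts),
    { maxBuffer: 4096 * 4096 },
    function (err, stdout) {
      if (err) {
        return callback(err);
      }
      callback(null, stdout);
    }
  );
}
```

A call might then look like recognize('./server/images/form.jpg', { lang: 'hand', psm: 6, tessdataDir: './tessdata' }, callback), assuming the tesseract binary is on the PATH and hand.traineddata exists in that directory.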

Upvotes: 3

yanana

Reputation: 2331

You can train Tesseract to recognize your handwritten text. See here.

Upvotes: 0
