balajichinna

Reputation: 413

How to set image size to improve OCR output?

I am working on reading information from an MRZ (Machine Readable Zone) image using the Tesseract library. I tried some images from Google and got good results. But with real images, that is, images captured with an iPhone camera, I did not get good results.

I got good results for the Google image below:

enter image description here

Details of the above image:

Image from Google. Size: 543x83.

OCR performs poorly on the image I took with an iPhone:

enter image description here

Details of the above image:

Image captured with an iPhone. Size: 2205x268.

1. How can I get good results for the real image above?

2. Is there a recommended image size for Tesseract OCR?

Upvotes: 2

Views: 1794

Answers (1)

Mark Setchell

Reputation: 207365

I have used ImageMagick for this kind of thing with some success. It is free and available for OSX, Windows and Linux from here. It is very hard to find general-purpose parameters, and this took a fair amount of fiddling around:

#!/bin/bash

# Enhance image as much as possible for Tesseract OCR
convert input.jpg -normalize \
  \( -clone 0 -colorspace gray -negate -lat 50x50+10% -contrast-stretch 0 -blur 1x65535 -level 50x100% \) \
  -compose copy_opacity -composite -opaque none -background white -adaptive-blur 3 out.jpg

# OCR the image and cat the results
tesseract out.jpg p && cat p.txt

OCR'ed Text Output:

IDFRADOUEL<<<<<<<<<<<<<<<<<<<<932013
U506932020438CHRISTIANE<<NI2906209F3

And this is the image, as prepared by the above command for OCR:

enter image description here
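On the size question, one thing that may be worth trying is scaling the iPhone capture down to roughly the height of the image that OCR'ed well (83 px, versus 268 px for the capture). This is only a rough sketch: it assumes size is part of the problem, which is not confirmed here, and `scale_percent` and the `out_small.jpg` / `p_small` names are hypothetical ones made up for this example.

```shell
#!/bin/bash

# Hypothetical helper: percentage that scales src_height to target_height,
# rounded to the nearest whole percent (integer arithmetic only).
scale_percent() {
  local src_height=$1 target_height=$2
  echo $(( (target_height * 100 + src_height / 2) / src_height ))
}

# Heights from the question: 268 px (iPhone capture), 83 px (image that worked).
pct=$(scale_percent 268 83)
echo "resize by ${pct}%"

# Only attempt the resize when ImageMagick and the prepared image are present.
if command -v convert >/dev/null 2>&1 && [ -f out.jpg ]; then
  convert out.jpg -resize "${pct}%" out_small.jpg
  tesseract out_small.jpg p_small && cat p_small.txt
fi
```

For the sizes in the question this works out to 31%. Whether the smaller image actually OCRs better would need to be checked against the real capture, and the target height is a guess to tune rather than a recommended value.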

Upvotes: 2
