Caleb A.

Reputation: 23

ImageMagick to preprocess image for tesseract-ocr

Is there any way to process an image like this with ImageMagick so that I can use tesseract-ocr to convert it to text?

Because of the lines in the background, I get nonsense from conventional methods. Does anyone know how to deal with an image such as this?

Running 'convert -density 300 -units PixelsPerInch -type Grayscale +compress input.png input.tif' followed by 'tesseract input.tif output -l eng' gives me utter garbage.

Or are there any alternatives to ImageMagick that I can use to preprocess such an image, whether from the command line or in Python?

Upvotes: 1

Views: 4393

Answers (1)

Aleksander Grzyb

Reputation: 789

Have you tried morphology operations (see "Morphology of Shapes" in the ImageMagick usage documentation) after converting the image to grayscale?
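To illustrate the idea, here is a minimal pure-Python sketch of a morphological opening (erosion followed by dilation) on a tiny binary image. It is not taken from the question's image, which is not available here, and it assumes the background lines are thinner than the text strokes. The names erode, dilate, opening and the sample page are made up for the illustration. On a real scan you would more likely reach for ImageMagick's -morphology operator or OpenCV's cv2.morphologyEx instead of hand-written loops.

```python
# Sketch only: binary morphological opening on a list-of-rows image.
# Convention assumed here: 1 = ink (dark pixel), 0 = background.

def erode(img, k):
    """Keep a pixel only if every pixel in the k x k window around it is ink."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ))
    return out


def dilate(img, k):
    """Set a pixel if any pixel in the k x k window around it is ink."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ))
    return out


def opening(img, k=3):
    """Erosion then dilation: removes ink structures thinner than k pixels."""
    return dilate(erode(img, k), k)


# Hypothetical sample: a 1-pixel-thick background line (row 1)
# and a 3 x 3 "text stroke" block (rows 3-5, columns 2-4).
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

cleaned = opening(page, 3)
for row in cleaned:
    print("".join("#" if v else "." for v in row))
```

With a 3 x 3 window the thin line disappears while the thicker block survives. If the lines in your image are as thick as the text strokes, this particular operation will not separate them, and the window size or shape would need to be tuned.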

Upvotes: 1
