Gery

Reputation: 9036

Further improvements with ImageMagick for text recognition with Tesseract-OCR

Original PNG:

[image: original]

With this:

convert original.png -channel RGB -negate -white-threshold 70% -fuzz 10% -transparent white improved.png

Improved PNG:

improved

The problem is that Tesseract-OCR cannot extract the text correctly. It outputs only part of the last two rows, without the lat-long labels:

tesseract improved.png improved

cat improved.txt

14° 29.9808' S
76° 15.7617' W

How could the convert call be further improved to correctly extract the text? Any hints are appreciated.

Upvotes: 0

Views: 667

Answers (1)

fmw42

Reputation: 53109

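In ImageMagick, you can try converting to grayscale, thresholding, and then applying a small morphological open to clean up the result: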

convert original.png -colorspace gray -threshold 25% -morphology open diamond:1 result.png

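As a rough sketch, the preprocessing step could be chained with the Tesseract call from the question so both run in one go. The function name `ocr_preprocessed`, the default file names, and the `--psm 6` page segmentation mode are assumptions for illustration and do not come from the thread. `--psm` is the Tesseract 4+ spelling (older 3.x builds use `-psm`), and on ImageMagick 7 the `convert` command may need to be replaced with `magick`.

```shell
#!/bin/sh
# Sketch only: preprocess an image with the convert call above, then OCR it.
# Assumed (not from the thread): function name, default file names, --psm 6.

ocr_preprocessed() {
    input=${1:-original.png}
    cleaned=${2:-result.png}

    # Fail early if the input file or either tool is missing.
    [ -f "$input" ] || { echo "missing input: $input" >&2; return 1; }
    command -v convert >/dev/null 2>&1 || { echo "convert not found" >&2; return 1; }
    command -v tesseract >/dev/null 2>&1 || { echo "tesseract not found" >&2; return 1; }

    # Grayscale, binarize at 25%, then a morphological open with a
    # small diamond kernel, exactly as in the command above.
    convert "$input" -colorspace gray -threshold 25% \
        -morphology open diamond:1 "$cleaned" || return 1

    # Using "stdout" as the output base prints the recognized text directly.
    # --psm 6 (an assumption) treats the image as one uniform block of text.
    tesseract "$cleaned" stdout --psm 6
}
```

Calling `ocr_preprocessed original.png result.png` writes the cleaned image and prints the recognized text. If the labels are still missing, the threshold percentage and the page segmentation mode are the two values most worth experimenting with.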

Upvotes: 1
