Sandro Antidze

Reputation: 13

How to separate (space between) numbers in tesseract OCR

I am trying to extract numbers from an image,

but the result I get is 2 332223355 1 23. I don't understand how Tesseract decides where to split. All I need is for the one-, two-, and three-digit numbers to be separated by spaces. Can anybody help?

Upvotes: 1

Views: 1082

Answers (1)

user7711283

Use:

tesseract -psm 7 NXect.png stdout

which gives for the image you provided:

2 3 32 22 33 55 123‘
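Page segmentation mode 7 tells Tesseract to treat the image as a single line of text, which suits a strip of numbers like this one. The spelling of the flag depends on the version: 3.x releases such as the one below take `-psm`, while newer releases (4.x and later, as far as I know) expect `--psm`. Here is a minimal sketch that builds the matching command line. The helper names `psm_flag` and `ocr_line_cmd` are made up for illustration, and the options are placed after the output base, as in the usage text.

```shell
# Hypothetical helper: choose the flag spelling from a Tesseract version string.
# Assumption: 3.x releases take -psm, anything newer takes --psm.
psm_flag() {
  case "$1" in
    3.*) printf '%s\n' "-psm" ;;
    *)   printf '%s\n' "--psm" ;;
  esac
}

# Hypothetical helper: print the command for single-line OCR (mode 7).
# $1 = image file, $2 = Tesseract version string
ocr_line_cmd() {
  printf '%s\n' "tesseract $1 stdout $(psm_flag "$2") 7"
}

ocr_line_cmd NXect.png 3.04.01   # prints: tesseract NXect.png stdout -psm 7
```

Run the printed command yourself, or take the version string from the first line of `tesseract --version`.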

The tesseract version I am using:

$ tesseract --version
tesseract 3.04.01
 leptonica-1.73
  libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0

With this version and no options, the original image gives:

Error in pixGenHalftoneMask: pix too small: w = 250, h = 58
23 32 22 33 55 123

and for the resized image (2x):

$ tesseract NXect_x2.png stdout
23 32 22 33 55 123

So I can't reproduce the OCR result you are getting from the image.
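If you want to try the 2x version yourself, one way to produce it is with ImageMagick. This is only a sketch: it assumes ImageMagick's `convert` is installed (ImageMagick 7 calls the command `magick`), and the function name `upscale_2x` is made up for illustration. Any other tool that doubles the width and height should work as well.

```shell
# Hypothetical helper: write a copy of an image scaled to 200% of its size.
# Assumption: ImageMagick's `convert` is on the PATH.
# $1 = input image, $2 = output image
upscale_2x() {
  convert "$1" -resize 200% "$2"
}

# Example (not executed here):
#   upscale_2x NXect.png NXect_x2.png
#   tesseract NXect_x2.png stdout
```

Doubling the 250x58 image also lifts it above the size that triggers the "pix too small" message shown above.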

Upvotes: 3
