Does Tesseract do image resizing internally?

Question

OpenCV doesn't read an image's metadata, so we can't get the DPI of an image from it. When someone asks a DPI-related OCR question on Stack Overflow, most of the answers say that we don't need the DPI; we only need the pixel size.

Changing image DPI for usage with tesseract

Change dpi of an image in OpenCV

Elsewhere, in questions that don't mention DPI at all but ask how to improve OCR accuracy, someone suggests that setting the DPI to 300 will improve the accuracy.

Tesseract OCR How do I improve result?

Best way to recognize characters in screenshot?

Also, the official Tesseract documentation says this about it:

Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images.

After some searching on Google, I found the following:

We can't tell an image's resolution from its height and width alone.
We want the image resolution to be high enough to support accurate OCR.
Font size is typically given as a physical length, not in pixels: 72 points make one inch, so a 12pt font is 1/6 inch tall.
In a 300 ppi image, 12pt text is 300 × 1/6 = 50 pixels tall. At 60 ppi, the same text is 60 × 1/6 = 10 pixels tall.
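The arithmetic in the last two points can be checked with a minimal sketch. The function name text_height_px is made up for illustration; the only fact it relies on is that 72 points make one inch.

```python
POINTS_PER_INCH = 72  # typographic convention: 72 pt = 1 inch


def text_height_px(font_size_pt: float, ppi: float) -> float:
    """Nominal text height in pixels for a font size (pt) at a resolution (ppi)."""
    # Multiply first, then divide, so the common cases stay exact in floating point.
    return font_size_pt * ppi / POINTS_PER_INCH


if __name__ == "__main__":
    for ppi in (60, 120, 300):
        print(f"12pt at {ppi} ppi: {text_height_px(12, ppi)} px")
```

This is the nominal font height, not the x-height, so it will be larger than the height of a lower case x in the same font.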

The following quote is from the official Tesseract page, under "Is there a Minimum / Maximum Text Size? (It won’t read screen text!)":

There is a minimum text size for reasonable accuracy. You have to consider resolution as well as point size. Accuracy drops off below 10pt x 300dpi, rapidly below 8pt x 300dpi. A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.) At 10pt x 300dpi x-heights are typically about 20 pixels, although this can vary dramatically from font to font. Below an x-height of 10 pixels, you have very little chance of accurate results, and below about 8 pixels, most of the text will be “noise removed”.

Using LSTM there seems also to be a maximum x-height somewhere around 30 px. Above that, Tesseract doesn’t produce accurate results. The legacy engine seems to be less prone to this (see https://groups.google.com/forum/#!msg/tesseract-ocr/Wdh_JJwnw94/24JHDYQbBQAJ).
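The x-height check described in the quote could be turned into a resize factor along these lines. This is only a sketch: the helper names are hypothetical, the 10 px and 30 px bounds come from the quoted text, and the 20 px target is an assumption based on "typically about 20 pixels". The x-height itself has to be measured separately, for example by hand in an image viewer.

```python
def x_height_in_range(x_height_px: float, low: float = 10.0, high: float = 30.0) -> bool:
    """True if the x-height lies inside the range the quoted text suggests."""
    return low <= x_height_px <= high


def scale_for_x_height(measured_px: float, target_px: float = 20.0) -> float:
    """Factor by which to resize the image so the x-height lands on target_px."""
    if measured_px <= 0:
        raise ValueError("measured x-height must be positive")
    return target_px / measured_px


def resize_for_ocr(image, measured_px: float, target_px: float = 20.0):
    """Resize an OpenCV image (NumPy array) so its x-height is about target_px."""
    import cv2  # local import: the helpers above work without OpenCV installed

    factor = scale_for_x_height(measured_px, target_px)
    # Cubic interpolation for enlarging, area averaging for shrinking.
    interpolation = cv2.INTER_CUBIC if factor > 1 else cv2.INTER_AREA
    return cv2.resize(image, None, fx=factor, fy=factor, interpolation=interpolation)
```

For example, a screenshot whose lower case x is 8 pixels tall would be enlarged by a factor of 2.5 under these assumptions.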

From all this, I came to one conclusion: we need text at roughly a 10 to 12pt font size for OCR. That means that at 120 ppi (pixels per inch) 12pt text needs to be about 20 pixels tall, and at 300 ppi it needs to be about 50 pixels tall.

If OpenCV doesn't read the DPI information, what default DPI value does Tesseract assume for an image loaded with OpenCV's imread method?
Does Tesseract resize the image internally based on its DPI?
If I resize the image with OpenCV, and Tesseract does resize internally based on DPI, then I need to set the DPI to 300. What is the easiest way to set the DPI with OpenCV + pytesseract? I know we can do this with PIL.
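For the last question, here is a sketch of the two approaches I am aware of, assuming opencv-python, pytesseract and Pillow are installed. The first passes Tesseract's --dpi command-line option (available in Tesseract 4 and later, as far as I know) through pytesseract's config string. The second is the PIL route: writing the DPI into the file's metadata when saving. The function names are made up for illustration, and neither approach changes the pixel data.

```python
def dpi_config(dpi: int = 300) -> str:
    """Build a config string that tells Tesseract which resolution to assume."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return f"--dpi {dpi}"


def ocr_with_declared_dpi(path: str, dpi: int = 300) -> str:
    """Load an image with OpenCV and run OCR with an explicitly declared DPI."""
    import cv2
    import pytesseract

    image = cv2.imread(path)  # returns pixel data only, no DPI metadata
    if image is None:
        raise FileNotFoundError(path)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    return pytesseract.image_to_string(rgb, config=dpi_config(dpi))


def save_with_dpi(src: str, dst: str, dpi: int = 300) -> None:
    """Re-save an image with DPI metadata using Pillow; pixels are not resampled."""
    from PIL import Image

    with Image.open(src) as img:
        img.save(dst, dpi=(dpi, dpi))
```

Whether declaring the DPI alone improves accuracy is a separate matter; if the text is too small in pixels, the image still has to be resized.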

Answers (1)

Manual resizing

Automatic resizing
