I'm working on extracting text from images similar to the one shown below: warehouse boxes with all kinds of different labels. The images often have poor angles.
My code:
import cv2
import pytesseract

im = cv2.imread('1.jpg')
config = '-l eng --oem 1 --psm 3'
text = pytesseract.image_to_string(im, config=config)
text_list = text.split('\n')
# strip whitespace and drop blank entries so that only words remain
space_to_empty = [x.strip() for x in text_list]
space_clean_list = [x for x in space_to_empty if x]
print(space_clean_list)
For example, that image returns an output of
['L2 Sy', "////’7/'7///////////////"]
on all variations of --oem and --psm values.
Perspective correction for the image gives a slightly better (though still poor) output of
['R19 159 942 sEMY', 'V/ ////////////////////I////I/////////////']
again on all variations of --oem and --psm values.
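The sweep over modes looked roughly like this (a minimal sketch; the exact ranges are assumptions, and some OEM/PSM combinations can fail depending on the installed Tesseract build):
import cv2
import pytesseract

im = cv2.imread('1.jpg')
for oem in range(4):           # OEM modes 0-3
    for psm in range(3, 14):   # PSM modes 3-13 (PSM 0 is OSD-only)
        try:
            config = f'-l eng --oem {oem} --psm {psm}'
            print(oem, psm, repr(pytesseract.image_to_string(im, config=config)))
        except pytesseract.TesseractError:
            pass               # skip combinations unsupported by this build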
My question is: for all values of --oem and --psm, as shown here, the output stays the same. Is this expected?
Your perspective correction is insufficient. Unfortunately, you haven't provided your code for that step, so I will present my full solution:
1. Mask the label in the image using thresholding, some morphological operations, contour finding, and extraction of the central contour, assuming the label is (always) located in the center of the image.
2. Properly perform the perspective transform of the label to some upright rectangle.
3. Run pytesseract with the --psm 6 option.
That'd be the full code:
import cv2
import numpy as np
import pytesseract
# Read image
img = cv2.imread('input.jpg')
h, w = img.shape[:2]
# Mask label
# Threshold nearly white pixels to isolate the bright label
mask = np.all(img > 240, axis=2).astype(np.uint8) * 255
# Remove small speckles, then close gaps within the label region
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21, 21)))
# Find contours; handle differing return signatures across OpenCV versions
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Keep only contours containing the image center, i.e. the label
cnts = [cnt for cnt in cnts if cv2.pointPolygonTest(cnt, (w // 2, h // 2), False) > 0]
mask = cv2.drawContours(np.zeros_like(mask), cnts, -1, 255, cv2.FILLED)
# Find extreme outer points of label
# https://stackoverflow.com/a/56801276/11089932
x, y, w, h = cv2.boundingRect(mask)
l = (x, np.argmax(mask[:, x]))                   # leftmost: first white pixel in left column
r = (x + w - 1, np.argmax(mask[:, x + w - 1]))   # rightmost: first white pixel in right column
t = (np.argmax(mask[y, :]), y)                   # topmost: first white pixel in top row
b = (np.argmax(mask[y + h - 1, :]), y + h - 1)   # bottommost: first white pixel in bottom row
# Perspective transform of label
# https://stackoverflow.com/a/65990763/11089932
bw, bh = 400, 200
# Map the extreme points to the corners of an upright bw x bh rectangle
# (destination order: top-left, bottom-left, bottom-right, top-right)
pts1 = np.float32([t, l, b, r])
pts2 = np.float32([[0, 0], [0, bh - 1], [bw - 1, bh - 1], [bw - 1, 0]])
M = cv2.getPerspectiveTransform(pts1, pts2)
warped = cv2.warpPerspective(img, M, (bw, bh))
# Raw OCR on transformed label
text = pytesseract.image_to_string(warped, config='--psm 6')
print(text.replace('\f', ''))
# POS | Registered
# R RR19 159 942 5MY
# WU UAV UMBRUE OE RT
As you can see, the raw OCR is already quite good. You're free to further pre-process the warped image to cut out the header, the barcode, and so on; see the sketch below.
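A minimal sketch of such a crop (the fractions for the header strip and the barcode region are guesses to be tuned per label layout, not values from the pipeline above):
# Hypothetical crop: drop the top header strip and the barcode area on the right
label = warped[int(0.25 * bh):, :int(0.75 * bw)]
text = pytesseract.image_to_string(label, config='--psm 6')
print(text.replace('\f', ''))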
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.2
NumPy: 1.20.3
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127
----------------------------------------