Kristián Stroka
Kristián Stroka

Reputation: 740

Optimize various brightness image for OCR using OpenCV

I have following types of images:

Example1 Example2 Example3 Example4 Example5 Example6 Example7 Example8 Example9

I would like to preprocess them to do best OCR result, but as you can see they are in different brightness and different sharpness ... is possible to do some "generic" adjustments to extract text for OCR with the best result?

Upvotes: 0

Views: 973

Answers (1)

Berlin Benilo
Berlin Benilo

Reputation: 502

You can use easy ocr is giving proper result for these cases. This will work for blurred and unblurred cases.

import easyocr
import cv2
import numpy as np
from PIL import Image, ImageEnhance


def unsharp_mask(image, kernel_size=(5, 5), sigma=1.0, amount=1.0, threshold=0):
    """Return a sharpened version of the image, using an unsharp mask."""
    blurred = cv2.GaussianBlur(image, kernel_size, sigma)
    sharpened = float(amount + 1) * image - float(amount) * blurred
    sharpened = np.maximum(sharpened, np.zeros(sharpened.shape))
    sharpened = np.minimum(sharpened, 255 * np.ones(sharpened.shape))
    sharpened = sharpened.round().astype(np.uint8)
    if threshold > 0:
        low_contrast_mask = np.absolute(image - blurred) < threshold
        np.copyto(sharpened, image, where=low_contrast_mask)
    return sharpened

def increase_brightness(img, value):
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)

    lim = 255 - value
    v[v > lim] = 255
    v[v <= lim] += value

    final_hsv = cv2.merge((h, s, v))
    img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2BGR)
    return img

image = cv2.imread('if8nC.png')
sharpened = unsharp_mask(image)
imag = increase_brightness(sharpened, value=10) # 60 ->5qoOk.png #10 -> if8nC.png
cv2.imwrite('resize.png',imag)

reader = easyocr.Reader(['en'],gpu=False)
result = reader.readtext('resize.png')
for detection in result:
        print(detection)

The only adjustment u have to make is change in brightness value from 0 to 100. It worked for all cases. The output is

([[1, 0], [282, 0], [282, 68], [1, 68]], 'Tvrdosin', 0.4517089309490733)

Upvotes: 1

Related Questions