Reputation: 622
I am making an app to recognize digits (OCR), so I need to prepare the image for it. There is no problem when I photograph blue, green, yellow, or other colored digits, but red digits become a very dark gray after grayscale conversion in OpenCV, and they can no longer be recognized.
Image after grayscale (yellow and red digits):
Image after threshold:
As you can see, the red digits are gone after thresholding.
Here is the fragment of code that I use:
mat.ConvertTo(mat, CvType.Cv8uc1);
Imgproc.CvtColor(mat, mat, Imgproc.ColorBgr2gray);
Imgproc.Threshold(mat, mat, 127, 255, Imgproc.ThreshBinary);
Any solutions?
Upvotes: 2
Views: 3517
Reputation: 5805
@Jeru Luke's solution should be fairly robust for a wide range of input images. But if you need raw speed, you might think about a simple brightness/contrast operation followed by global thresholding.
A brightness/contrast adjustment is computationally cheap and can push the background to pure black; a global threshold then gives a clean binarized image.
Photo editors (Photoshop, Gimp, etc.) often use a brightness/contrast scale of ±127. The mathematical formula for simultaneously adding brightness (b) and contrast (c) is
img = (1 + c/127)*img + (b-c)
If you have access to Mat from C#, then you can use the cv.Mat.convertTo function:
cv.Mat.convertTo( OutputArray, cv.CV_8U, 1+c/127, b-c)
For your image I used b = -45 and c = +45
Then convert to grayscale and binarize (I used a threshold of 50 on your image).
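The three steps above (brightness/contrast, grayscale, global threshold) can be sketched in plain NumPy. This is only a rough stand-in for the OpenCV calls, not the code I ran: the helper names and the 1x3 test image are made up for illustration, and the grayscale step assumes the usual 0.114/0.587/0.299 BGR luma weights.

```python
import numpy as np

def brightness_contrast(img, b, c):
    """Apply img = (1 + c/127)*img + (b - c), saturating to 0..255."""
    scaled = (1.0 + c / 127.0) * img.astype(np.float32) + (b - c)
    # Clip before casting back to uint8, otherwise out-of-range values wrap around.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

def to_gray(bgr):
    """Weighted BGR -> gray (assumed luma weights: B 0.114, G 0.587, R 0.299)."""
    weights = np.array([0.114, 0.587, 0.299], dtype=np.float32)
    return np.clip(np.rint(bgr.astype(np.float32) @ weights), 0, 255).astype(np.uint8)

def binarize(gray, thresh):
    """Global threshold: pixels above thresh become 255, the rest 0."""
    return np.where(gray > thresh, 255, 0).astype(np.uint8)

# Hypothetical 1x3 BGR image: dark background, red digit pixel, yellow digit pixel.
img = np.array([[[60, 60, 60], [40, 40, 230], [40, 230, 230]]], dtype=np.uint8)

adjusted = brightness_contrast(img, b=-45, c=45)  # values used in this answer
binary = binarize(to_gray(adjusted), 50)          # threshold used in this answer
print(binary)
```

On this toy input the background is pushed to black, and both the red and the yellow pixel end up above the threshold of 50.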
Update
The question was tagged C#, but many of us use Python. In Python, we have no access to Mat. However, we can use the cv2.addWeighted function, which computes:
dst = src1*alpha + src2*beta + gamma
If we set beta = 0, then this becomes equivalent to cv.Mat.convertTo scaling. This seems to be faster than doing matrix operations in NumPy. NumPy is a little slower because we have to do some extra work to handle overflow.
Upvotes: 2
Reputation: 21203
As I mentioned in the comments, you can apply Otsu thresholding to each of the color channels (R, G, B) separately.
Otsu threshold of Blue channel:
Otsu threshold of Green channel:
Otsu threshold of Red channel:
Finally I added all the above to get the following result:
I only used the following functions:
cv2.threshold()
cv2.add()
Code
import os
import cv2
import numpy as np
#--- performs Otsu threshold ---
def threshold(img, st):
    ret, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    cv2.imwrite(os.path.join(path, 'res_' + str(st) + '.jpg'), thresh)
    return thresh
path = r'C:\Users\Desktop'
filename = 'digits.jpg'
img = cv2.imread(os.path.join(path, filename))
img = cv2.resize(img, (0, 0), fx = 0.5, fy = 0.5) #--- resized the image because it was too big
cv2.imshow('Original', img)
#--- see each of the channels individually ---
cv2.imshow('b', img[:,:,0])
cv2.imshow('g', img[:,:,1])
cv2.imshow('r', img[:,:,2])
m1 = threshold(img[:,:,0], 1) #--- threshold on blue channel
m2 = threshold(img[:,:,1], 2) #--- threshold on green channel
m3 = threshold(img[:,:,2], 3) #--- threshold on red channel
#--- adding up all the results above ---
res = cv2.add(m1, cv2.add(m2, m3))
cv2.imshow('res', res)
cv2.imwrite(os.path.join(path, 'res.jpg'), res)
cv2.waitKey()
cv2.destroyAllWindows()
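To see why the per-channel route keeps the red digits, here is a small NumPy-only sketch on a made-up 1x3 image. Note that it swaps Otsu for a fixed threshold of 127 (the value from the question) to keep things short, and it assumes the usual 0.114/0.587/0.299 BGR luma weights for the grayscale conversion. Pure red lands at a gray level of about 76, below the threshold, while its red channel alone is 255.

```python
import numpy as np

# Hypothetical 1x3 BGR image: black background, pure red, pure yellow.
img = np.array([[[0, 0, 0], [0, 0, 255], [0, 255, 255]]], dtype=np.uint8)

def fixed_threshold(channel, thresh=127):
    """Fixed global threshold (a stand-in for Otsu in this sketch)."""
    return np.where(channel > thresh, 255, 0).astype(np.uint8)

# Grayscale path: weighted sum of B, G, R, then a single threshold.
weights = np.array([0.114, 0.587, 0.299])
gray = np.rint(img.astype(np.float64) @ weights).astype(np.uint8)
gray_mask = fixed_threshold(gray)

# Per-channel path: threshold B, G and R separately, then combine.
# For 0/255 masks, an element-wise maximum gives the same result as a saturating add.
masks = [fixed_threshold(img[:, :, i]) for i in range(3)]
combined = np.maximum.reduce(masks)

print(gray)       # gray levels of background, red, yellow
print(gray_mask)  # red pixel is lost
print(combined)   # red pixel is kept
```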
Upvotes: 4