Reputation: 178
I have a set of images that I want to denoise so that I can run OCR on them.
I am trying to read the number 7810 from the image.
I have tried
cv2.threshold(img, 128, 255,cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.fastNlMeansDenoising(img,None,60,10,20)
and some morphological operations, but none of them clean up the image sufficiently.
Any recommendations on how to filter this image well enough that I could run OCR (for example pytesseract) or some ML detection scripts on it?
Upvotes: 3
Views: 2690
Reputation: 46600
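You can try cv2.adaptiveThreshold, since the lighting in your image varies from one area to another. Unlike a single global threshold, it computes a separate threshold for each pixel from that pixel's neighbourhood.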
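import cv2
# Load as grayscale; adaptiveThreshold needs a single-channel 8-bit image
image = cv2.imread("1.jpg", cv2.IMREAD_GRAYSCALE)
# blockSize=21 (must be odd) is the neighbourhood size; C=2 is subtracted from the weighted mean
thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 2)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()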
Upvotes: 1
Reputation: 163
You could begin by using a Median filter to remove the salt & pepper noise:
cv2.medianBlur(source, 3)
Then apply Otsu thresholding as you have already done. This might not solve the problem completely, but it should make the image easier for the text detection algorithm to work on.
Upvotes: 2