Reputation: 550
I am using opencv to take images using my webcam.
cam = cv2.VideoCapture(0)
cv2.namedWindow("Handwritten Number Recognition")
img_counter = 0
while True:
ret, frame = cam.read()
if not ret:
print("failed to grab frame")
break
cv2.imshow("Handwritten Number Recognition", frame)
k = cv2.waitKey(1)
if k%256 == 27:
# ESC pressed
print("Prediction is underway...")
break
elif k%256 == 32:
# SPACE pressed
img_name = "opencv_frame_{}.png".format(img_counter)
cv2.imwrite(img_name, frame)
print("Image taken!")
img_counter += 1
cam.release()
cv2.destroyAllWindows()
I then convert the image into grayscale and downsize it:
user_test = img_name
col = Image.open(user_test)
gray = col.convert('L')
bw = gray.point(lambda x: 0 if x<100 else 255, '1')
bw.save("bw_image.jpg")
bw
img_array = cv2.imread("bw_image.jpg", cv2.IMREAD_GRAYSCALE)
img_array = cv2.bitwise_not(img_array)
plt.imshow(img_array, cmap = plt.cm.binary)
plt.show()
img_size = 28
new_array = cv2.resize(img_array, (img_size,img_size))
final_array = new_array.reshape(1,-1)
plt.imshow(new_array, cmap = plt.cm.binary)
plt.show()
But the images have a very dark patch in the bottom which hampers the predictions I want to make with my data:
What can I do to get past this problem? Interestingly this only happens if I click the image using opencv. If I use an image clicked through the same webcam but through the camera appliucation the error is not visible (Adding path of the image for preprocessing).
Upvotes: 0
Views: 249
Reputation: 53081
Here is one way to do that in Python/OpenCV using division normalization, thresholding and some morphology.
Input:
import cv2
import numpy as np
# read the image
img = cv2.imread('five.jpg')
# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# blur
smooth = cv2.GaussianBlur(gray, (555,555), 0)
# divide smooth by gray image
division = cv2.divide(smooth, gray, scale=255)
# invert
division = 255 - division
# threshold
thresh = cv2.threshold(division, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# add white border to help morphology close
border = cv2.copyMakeBorder(thresh, 60,60,60,60, cv2.BORDER_CONSTANT, value=(255,255,255))
hh, ww = border.shape
# morphology close
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (29,29))
result = cv2.morphologyEx(border, cv2.MORPH_CLOSE, kernel)
# remove border
result = result[60:hh-60, 60:ww-60]
# save results
cv2.imwrite('five_division_threshold.jpg',result)
# show results
cv2.imshow('smooth', smooth)
cv2.imshow('division', division)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Upvotes: 2
Reputation: 1526
You have two options:
Choosing a better global threshold for the gray values. This is the easier less generic solution. Normally, people would choose the Otsu method to automatically select the optimal threshold. Have a look at: Opencv Thresholding Tutorial
threshold, dst_img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
Using an adaptive threshold. Adaptive simply means using a calculated threshold for each sliding window location based on some criteria. Have a look at: Niblack's Binarization methods
Using option one:
img = cv2.imread("thresh.jpg", cv2.IMREAD_GRAYSCALE)
threshold, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
cv2.imwrite("thresh_bin.jpg", img)
Upvotes: 3
Reputation: 94
this problem because of used thresholding method
bw = gray.point(lambda x: 0 if x<100 else 255, '1')
to solve this you can change the low limit value (100) to 75 or using opencv auto threshold
the, bw = cv2.threshold(gray_img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Upvotes: 2