Reputation: 35
I'm trying to implement a digit recognition program for video capture in OpenCV. It works with still pictures as input, but when I add the video capture functionality it gets stuck while recording if I move the camera around. My code for the program is here:
import numpy as np
import cv2
from sklearn.externals import joblib
from skimage.feature import hog
# Load the classifier
clf = joblib.load("digits_cls.pkl")
# Default camera has index 0 and externally(USB) connected cameras have
# indexes ranging from 1 to 3
cap = cv2.VideoCapture(0)
while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    # Convert to grayscale and apply Gaussian filtering
    im_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    im_gray = cv2.GaussianBlur(im_gray, (5, 5), 0)
    # Threshold the image
    ret, im_th = cv2.threshold(im_gray.copy(), 120, 255, cv2.THRESH_BINARY_INV)
    # Find contours in the binary image 'im_th'
    _, contours0, hierarchy = cv2.findContours(im_th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Draw contours in the original image 'im' with contours0 as input
    # cv2.drawContours(frame, contours0, -1, (0,0,255), 2, cv2.LINE_AA, hierarchy, abs(-1))
    # Rectangular bounding box around each number/contour
    rects = [cv2.boundingRect(ctr) for ctr in contours0]
    # Draw the bounding box around the numbers
    for rect in rects:
        cv2.rectangle(frame, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (0, 255, 0), 3)
        # Make the rectangular region around the digit
        leng = int(rect[3] * 1.6)
        pt1 = int(rect[1] + rect[3] // 2 - leng // 2)
        pt2 = int(rect[0] + rect[2] // 2 - leng // 2)
        roi = im_th[pt1:pt1+leng, pt2:pt2+leng]
        # Resize the image
        roi = cv2.resize(roi, (28, 28), im_th, interpolation=cv2.INTER_AREA)
        roi = cv2.dilate(roi, (3, 3))
        # Calculate the HOG features
        roi_hog_fd = hog(roi, orientations=9, pixels_per_cell=(14, 14), cells_per_block=(1, 1), visualise=False)
        nbr = clf.predict(np.array([roi_hog_fd], 'float64'))
        cv2.putText(frame, str(int(nbr[0])), (rect[0], rect[1]),cv2.FONT_HERSHEY_DUPLEX, 2, (0, 255, 255), 3)
    # Display the resulting frame
    cv2.imshow('frame', frame)
    cv2.imshow('Threshold', im_th)
    # Press 'q' to exit the video stream
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
The error I get is that there is no input to the resize of the ROI (region of interest). I find it weird because it works as long as I don't move things around too much in the picture. I'm sure that the camera isn't at fault, since I've tried a lot of different cameras. Here is the specific error message:
Traceback (most recent call last):
  File "C:\Users\marti\Desktop\Code\Python\digitRecognition\Video_cap.py", line 55, in <module>
    roi = cv2.resize(roi, (28, 28), im_th, interpolation=cv2.INTER_AREA)
cv2.error: D:\Build\OpenCV\opencv-3.2.0\modules\imgproc\src\imgwarp.cpp:3492: error: (-215) ssize.width > 0 && ssize.height > 0 in function cv::resize
Picture of the program in action; if I move the numbers around, the program freezes.
Upvotes: 0
Views: 2188
Reputation: 1166
What about trying this:
if roi.any():
    roi = cv2.resize(roi, (28, 28), frame, interpolation=cv2.INTER_AREA)
    roi = cv2.dilate(roi, (3, 3))
I think this does what you want (I simplified yours for the example):
cap = cv2.VideoCapture(0)
while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    frame2=frame.copy()
    # Convert to grayscale and apply Gaussian filtering
    im_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    im_gray = cv2.GaussianBlur(im_gray, (5, 5), 0)
    ret, im_th = cv2.threshold(im_gray.copy(), 120, 255, cv2.THRESH_BINARY_INV)
    # Find contours in the binary image 'im_th'
    _, contours0, hierarchy = cv2.findContours(im_th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Rectangular bounding box around each number/contour
    rects = [cv2.boundingRect(ctr) for ctr in contours0]
    # Draw the bounding box around the numbers
    for rect in rects:
        cv2.rectangle(frame, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (255, 0, 255), 3)
        # Make the rectangular region around the digit
        leng = int(rect[3] * 1.6)
        pt1 = int(rect[1] + rect[3] // 2 - leng // 2)
        pt2 = int(rect[0] + rect[2] // 2 - leng // 2)
        roi = im_th[pt1:pt1+leng, pt2:pt2+leng]
        # Resize the image
        if roi.any():
            roi = cv2.resize(roi, (28, 28), frame, interpolation=cv2.INTER_AREA)
            roi = cv2.dilate(roi, (3, 3))
    # Display the resulting frame
    cv2.imshow('frame', frame)
    #cv2.imshow('Threshold', im_th)
    # Press 'q' to exit the video stream
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
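For what it's worth, here is one way the ROI can end up empty, as a rough sketch rather than a confirmed diagnosis of your setup. When a bounding box sits close to the top or left edge of the frame, pt1 or pt2 can become negative, and NumPy reads a negative slice start as an offset from the end of the array, so the slice usually has zero rows or columns. The helper name square_roi and its scale argument are made up for this example. Note also that roi.any() skips a ROI that is non-empty but completely black, whereas roi.size only checks for emptiness.

```python
import numpy as np


def square_roi(binary, rect, scale=1.6):
    """Return a square crop centred on rect, or None if nothing is left.

    binary is a 2-D image array, rect is (x, y, w, h) as returned by
    cv2.boundingRect. Both the name and the signature are hypothetical.
    """
    x, y, w, h = rect
    side = int(h * scale)
    top = y + h // 2 - side // 2
    left = x + w // 2 - side // 2

    height, width = binary.shape[:2]
    # Clamp the window to the image so a negative start is never used
    # as a slice index.
    top_c, left_c = max(top, 0), max(left, 0)
    bottom_c, right_c = min(top + side, height), min(left + side, width)

    roi = binary[top_c:bottom_c, left_c:right_c]
    # roi.size is 0 exactly when the crop has no pixels.
    return roi if roi.size > 0 else None


# A blank 100x100 "thresholded" frame and a box touching the top-left corner.
binary = np.zeros((100, 100), dtype=np.uint8)
rect = (0, 0, 10, 20)

# Same arithmetic as in the question: the starts come out as -6 and -11.
unclamped = binary[-6:26, -11:21]
clamped = square_roi(binary, rect)
```

In the loop you would then write something like `roi = square_roi(im_th, rect)` and `continue` when it returns None, before calling cv2.resize.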
Upvotes: 1
Reputation: 3073
You're using a fixed threshold for the preprocessing before trying to find contours. Since cv2.resize() has to resize something, it expects the roi matrix to have non-zero width and height. I'm guessing that at some point when you're moving the camera, you don't detect any digits, because of your non-adaptive preprocessing algorithm.
I suggest that you display the thresholded image and an image with contours superimposed on the frame while moving the camera. This way you'll be able to debug the algorithm. Also, make sure to print(len(rects)) to see whether any rectangles have been detected.
Another trick would be to save the frames and run the algorithm on the last frame saved before crashing, to find out why that frame is causing the error.
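A minimal sketch of that frame-saving trick, under a few assumptions: run_with_frame_dump, process_frame, keep, out_dir and prefix are all names invented for this example, and the frames are written with np.save so the snippet depends only on NumPy (cv2.imwrite would do the same job with image files).

```python
import collections
import os

import numpy as np


def run_with_frame_dump(frames, process_frame, keep=3, out_dir=".", prefix="crash_frame"):
    """Process frames in order; if one fails, save the most recent ones and re-raise."""
    recent = collections.deque(maxlen=keep)  # ring buffer of (index, frame copy)
    for index, frame in enumerate(frames):
        recent.append((index, frame.copy()))
        try:
            process_frame(frame)
        except Exception:
            # Write out the last few frames, including the one that failed,
            # so the algorithm can be re-run on them offline.
            for i, saved in recent:
                path = os.path.join(out_dir, "{}_{:04d}.npy".format(prefix, i))
                np.save(path, saved)
            raise
```

With a camera, frames could be a generator that yields cap.read() results, and process_frame would hold the thresholding, contour and ROI steps. A saved frame can be loaded again with np.load and fed to the same function in isolation.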
To summarize, you really need to take control of your code if you expect it to produce meaningful results. The solution, depending on your data, might be to apply some kind of contrast enhancement before the thresholding operation and/or to use Otsu's method or adaptive thresholding with some additional filtering.
Upvotes: 2