Reputation: 23
I am trying to detect black-and-white soccer balls almost purely with image pre-processing techniques in OpenCV (Python). My idea is as follows:
I'm stuck on finding the right candidates. Currently, this is my approach:
Step 2: The blurred image (median blur, kernel size 7)
Step 3: Generated binary image A, generated binary image B
Then I use findContours to find contours on the binary images, keeping as candidates the contours whose bounding box lies between a minimum and a maximum size threshold. If no candidates are found on binary image B, findContours is run on binary image A and its candidates are returned. If one or more candidates are found on binary image B, the original image is re-blurred (with kernel 15) and binary image C is used for finding the contours and returning the candidates. See: Generated binary image C
This is the code for generating those binary images:
import cv2 as cv
import numpy as np


# builds a mask of the pixels whose channel values all lie between rgb and rgb + margin
def generateMask(imgOriginal, rgb, margin):
    lowerLimit = np.asarray(rgb)
    upperLimit = lowerLimit + margin
    # switch limits if margin is negative
    if margin < 0:
        lowerLimit, upperLimit = upperLimit, lowerLimit
    mask = cv.inRange(imgOriginal, lowerLimit, upperLimit)
    return mask


# generates a set of six images with (combinations of) mask(s) applied
def applyMasks(imgOriginal, mask1, mask2):
    # applying both masks to original image
    singleAppliedMask1 = cv.bitwise_and(imgOriginal, imgOriginal, mask=mask1)  # res3
    singleAppliedMask2 = cv.bitwise_and(imgOriginal, imgOriginal, mask=mask2)  # res1
    # applying masks to overlap areas in single masked and original image
    doubleAppliedMaskOv1 = cv.bitwise_and(
        imgOriginal,
        singleAppliedMask1,
        mask=mask2
    )  # res4
    doubleAppliedMaskOv2 = cv.bitwise_and(
        imgOriginal,
        singleAppliedMask2,
        mask=mask1
    )  # res2
    # applying masks to joint areas in single masked and original image
    doubleAppliedMaskJoin1 = cv.bitwise_or(
        imgOriginal,
        singleAppliedMask1,
        mask=mask2
    )  # res7
    doubleAppliedMaskJoin2 = cv.bitwise_or(
        imgOriginal,
        singleAppliedMask2,
        mask=mask1
    )  # res6
    return (
        singleAppliedMask1, singleAppliedMask2,
        doubleAppliedMaskOv1, doubleAppliedMaskOv2,
        doubleAppliedMaskJoin1, doubleAppliedMaskJoin2
    )


def generateBinaries(appliedMasks):
    # variable names correspond to output variables in applyMasks()
    (sam1, sam2, damov1, damov2, damjo1, damjo2) = appliedMasks
    # generate thresholded images
    (_, sam1t) = cv.threshold(sam1, 0, 255, cv.THRESH_BINARY_INV)
    (_, sam1ti) = cv.threshold(sam1, 0, 255, cv.THRESH_BINARY_INV)
    (_, sam2t) = cv.threshold(sam2, 0, 255, cv.THRESH_BINARY)
    (_, sam2ti) = cv.threshold(sam2, 0, 255, cv.THRESH_BINARY_INV)
    (_, damov1t) = cv.threshold(damov1, 0, 255, cv.THRESH_BINARY)
    (_, damov2t) = cv.threshold(damov2, 0, 255, cv.THRESH_BINARY_INV)
    (_, damjo1t) = cv.threshold(damjo1, 0, 255, cv.THRESH_BINARY_INV)
    (_, damjo2t) = cv.threshold(damjo2, 0, 255, cv.THRESH_BINARY)
    # return differences in binary images
    return ((damov2t - sam2t), (sam1t - damov1t), (sam2ti - damjo2t))
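One detail of the final subtraction may be worth checking, assuming the thresholded images are uint8 arrays (which is what cv.threshold returns for uint8 input): NumPy's `-` on uint8 wraps around modulo 256, so 0 - 255 gives 1 rather than 0. OpenCV's cv.subtract saturates at 0 instead. The tiny arrays below are made up purely to show the difference, and the saturating variant is written in plain NumPy so the arithmetic is visible.

```python
import numpy as np

a = np.array([[0, 255, 255, 0]], dtype=np.uint8)
b = np.array([[255, 0, 255, 0]], dtype=np.uint8)

# NumPy uint8 arithmetic wraps modulo 256: 0 - 255 -> 1
wrapped = a - b

# Saturating difference: widen, subtract, clip back into the uint8 range
saturated = np.clip(a.astype(np.int16) - b.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped)    # [[  1 255   0   0]]
print(saturated)  # [[  0 255   0   0]]
```

Since findContours treats every non-zero pixel as foreground, the wrapped value 1 counts as foreground while the saturated value 0 does not, so the two variants can yield different contours.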
The result in this example image is good and very useful, even though it looks pretty wrong: see result.
It is very easy to get a much better result for this example image (for example, having only one or two candidates returned, including a perfect bounding box for the soccer ball). However, after extensive parameter tweaking, the parameters I used in this example seem to produce the best overall recall.
However, I'm very stuck on certain photos. For these I will show the original images, the binary A and B images (generated from the original image, median-blurred with kernel 7) and the binary C image (kernel 15). Currently my approach returns an average of about 15 candidates per photo. For 25% of the photos, the candidates include at least one perfect bounding box of the ball; for about 75% of the photos, they include at least one partially correct bounding box (e.g. a box containing a piece of the ball, or a box that is itself just a piece of the ball).
Original images + binary images A
Binary images B + binary images C
(I could only post up to 8 links)
I hope you guys can give me some suggestions on how to proceed.
Upvotes: 1
Views: 659
Reputation: 561
There are a lot of possibilities for how to do this. A neural network is probably a good choice, but you would still need to understand one and train it for your task.
You can use thresholding and Gaussian blurring, and as a suggestion I can add normalized cross-correlation (NCC) for template matching. Basically, you take a template: an image of the ball in your case, or even better, a set of images at different sizes, since the ball may vary in size depending on its position.
Then you slide the template over the image and check where it matches. Of course this won't work on images with occlusion, but it may help to get some candidates.
More details about the mentioned process in the paper here (https://ieeexplore.ieee.org/document/5375779) or slides here (http://www.cse.psu.edu/~rtc12/CSE486/lecture07.pdf).
I wrote a small snippet of code to show you the idea. I just cropped the ball from the image (so I cheated, but it is only meant to illustrate the idea). It also uses only the absolute difference between the ball and the image; a more sophisticated measure such as NCC would be better, but as I said, this is just an example.
import matplotlib.pyplot as plt
import numpy as np


def rgb2gray(rgb):
    # luminance-weighted grayscale conversion
    r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


def candidate_mask(matching_img, threshold_val, height, width):
    # mark a template-sized box at every position scoring above the threshold;
    # (row, col) is the top-left corner of the window that was compared
    show_match = np.zeros_like(matching_img)
    rows, cols = np.where(matching_img > threshold_val)
    for row, col in zip(rows, cols):
        show_match[row:row + height, col:col + width] = 1
    return show_match


if __name__ == "__main__":
    ball = rgb2gray(plt.imread('ball.jpg'))
    findtheballcol = plt.imread('findtheball.jpg')
    findtheball = rgb2gray(findtheballcol)
    matching_img = np.zeros((findtheball.shape[0], findtheball.shape[1]))

    # METHOD 1
    height, width = ball.shape
    for i in range(height, findtheball.shape[0] - height):
        for j in range(width, findtheball.shape[1] - width):
            # here use NCC or something better
            matching_score = np.abs(ball - findtheball[i:i + height, j:j + width])
            # inverting so that max is what we are looking for
            # (the small constant avoids dividing by zero on an exact match)
            matching_img[i, j] = 1 / (np.sum(matching_score) + 1e-9)

    plt.subplot(221)
    plt.imshow(findtheball)
    plt.title('Image')

    plt.subplot(222)
    plt.imshow(matching_img, cmap='jet')
    plt.title('Matching Score')

    plt.subplot(223)
    # pick a threshold
    threshold_val = np.mean(matching_img) * 2
    plt.imshow(candidate_mask(matching_img, threshold_val, height, width))
    plt.title('Candidates')

    plt.subplot(224)
    # higher threshold
    threshold_val = np.mean(matching_img) * 3
    plt.imshow(candidate_mask(matching_img, threshold_val, height, width))
    plt.title('Best Candidate')
    plt.show()
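The "here use NCC" placeholder in the loop could be filled in along these lines. This is a minimal sketch of zero-mean normalized cross-correlation for a single window; the function name `ncc` and the choice to return 0.0 for flat inputs are my assumptions.

```python
import numpy as np


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized arrays, in [-1, 1]."""
    p = patch.astype(np.float64) - patch.mean()
    t = template.astype(np.float64) - template.mean()
    denom = np.sqrt(np.sum(p * p) * np.sum(t * t))
    if denom == 0:
        return 0.0  # a flat patch or template carries no pattern to correlate
    return float(np.sum(p * t) / denom)
```

With this measure a higher score already means a better match, so the `1 / sum` inversion is no longer needed, and the score does not change when the window is uniformly brighter or has higher contrast than the template.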
Have fun!
Upvotes: 0
Reputation: 1927
You can also use the black-hat and top-hat morphology operations to find the black parts of the ball nested inside the white parts. It will be more robust than thresholds.
Upvotes: 0