Reputation: 166
I have calculated the Sobel gradient magnitude and direction, but I'm stuck on how to use them for shape detection.
Image > grayscale > Sobel filter > gradient magnitude and direction > next?
The Sobel kernels used are:
Kx = ([[1, 0, -1],[2, 0, -2],[1, 0, -1]])
Ky = ([[1, 2, 1],[0, 0, 0],[-1, -2, -1]])
(I am restricted to Python with NumPy only; no other libraries are allowed.)
import numpy as np
def classify(im):
    # Convert to grayscale
    gray = convert_to_grayscale(im/255.)
    # Sobel kernels as numpy arrays
    Kx = np.array([[1, 0, -1],[2, 0, -2],[1, 0, -1]])
    Ky = np.array([[1, 2, 1],[0, 0, 0],[-1, -2, -1]])
    Gx = filter_2d(gray, Kx)
    Gy = filter_2d(gray, Ky)
    # Gradient magnitude and direction (radians, in [-pi, pi])
    G = np.sqrt(Gx**2+Gy**2)
    G_direction = np.arctan2(Gy, Gx)
    labels = ['brick', 'ball', 'cylinder']
    # Let's guess randomly! Maybe we'll get lucky.
    random_integer = np.random.randint(low = 0, high = 3)
    return labels[random_integer]
def filter_2d(im, kernel):
    '''
    Filter an image by taking the dot product of each
    image neighborhood with the kernel matrix.
    '''
    M = kernel.shape[0]
    N = kernel.shape[1]
    H = im.shape[0]
    W = im.shape[1]
    filtered_image = np.zeros((H-M+1, W-N+1), dtype = 'float64')
    for i in range(filtered_image.shape[0]):
        for j in range(filtered_image.shape[1]):
            image_patch = im[i:i+M, j:j+N]
            filtered_image[i, j] = np.sum(np.multiply(image_patch, kernel))
    return filtered_image

def convert_to_grayscale(im):
    '''
    Convert color image to grayscale.
    '''
    return np.mean(im, axis = 2)
Upvotes: 3
Views: 1329
Reputation:
You can use the following distinctive characteristics of your shapes:
a brick has several straight edges (from four to six, depending on the point of view);
a sphere has a single curved edge;
a cylinder has two curved edges and two straight edges (though they can be completely hidden).
Use binarization (based on luminance and/or saturation) and extract the outlines. Then find the straight sections, possibly using the Douglas-Peucker simplification algorithm. Finally, analyze the sequences of straight and curved edges.
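As a rough illustration of the simplification step, here is a minimal NumPy-only sketch of Douglas-Peucker for an open polyline. The function name `douglas_peucker` and the tolerance `epsilon` are my own choices, and it assumes you already have the outline as an ordered array of (x, y) points. A closed outline would first need to be split at two far-apart points, which is not shown.

```python
import numpy as np

def douglas_peucker(points, epsilon):
    """Simplify an open polyline (K x 2 array-like) with tolerance epsilon."""
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    chord = end - start
    chord_len = np.hypot(chord[0], chord[1])
    rel = points[1:-1] - start
    if chord_len == 0.0:
        # Degenerate chord: fall back to the distance from the start point.
        dists = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance to the chord via the 2-D cross product.
        dists = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / chord_len
    idx = int(np.argmax(dists)) + 1          # index into `points`
    if dists[idx - 1] <= epsilon:
        # Every interior point is close enough: keep only the endpoints.
        return np.array([start, end])
    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return np.vstack([left[:-1], right])

# Hypothetical L-shaped outline fragment with a slightly noisy first edge.
pts = [(0, 0), (1, 0.01), (2, 0), (2, 1), (2, 2)]
simplified = douglas_peucker(pts, epsilon=0.1)
```

Long segments in the simplified result are candidates for straight edges, while runs of many short segments suggest a curved section.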
A possible way to address the final classification task is to represent each outline as a string of chunks, either straight or curved, each with a rough indication of length (short/medium/long). With imperfect segmentation, every shape will correspond to a set of patterns.
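One way such an encoding might look, as a sketch only: it assumes the outline is already a closed, simplified polygon, and it treats any segment shorter than `straight_min` as part of a curve. The function name, the thresholds, and the two-character token format (`S`/`C` for straight/curved, then `s`/`m`/`l` for size) are all assumptions to be tuned on your images.

```python
import numpy as np

def encode_outline(vertices, straight_min=10.0, bins=(20.0, 60.0)):
    """Encode a closed polygon (K x 2) as tokens such as 'Sm' or 'Cl'."""
    v = np.asarray(vertices, dtype=float)
    seg = np.roll(v, -1, axis=0) - v          # includes the closing segment
    lengths = np.hypot(seg[:, 0], seg[:, 1])
    chunks = []
    for length in lengths:
        kind = 'S' if length >= straight_min else 'C'
        if kind == 'C' and chunks and chunks[-1][0] == 'C':
            chunks[-1][1] += length           # merge consecutive short segments
        else:
            chunks.append([kind, length])
    # The outline is a loop, so a curve may wrap around the starting vertex.
    if len(chunks) > 1 and chunks[0][0] == 'C' and chunks[-1][0] == 'C':
        chunks[0][1] += chunks.pop()[1]
    sizes = 'sml'                             # short / medium / long
    return [kind + sizes[int(np.digitize(length, bins))]
            for kind, length in chunks]

# Hypothetical inputs: a 30-pixel square and a 36-sided "circle" of radius 20.
square = [(0, 0), (30, 0), (30, 30), (0, 30)]
angles = np.linspace(0.0, 2.0 * np.pi, 36, endpoint=False)
circle = np.column_stack([20 * np.cos(angles), 20 * np.sin(angles)])
square_tokens = encode_outline(square)
circle_tokens = encode_outline(circle)
```

With these thresholds the square becomes four medium straight chunks and the circle collapses into a single long curved chunk.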
You can use a training phase to learn as many patterns as possible, then use string matching (where the strings are treated as loops, i.e. compared up to rotation). Ties will probably need to be arbitrated. Another option is approximate string matching.
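To make the matching idea concrete, here is a small pure-Python sketch. It combines the two options: it computes an edit (Levenshtein) distance and takes the minimum over all rotations, so an exact loop match is simply distance 0. The token sequences and the `patterns` dictionary are hypothetical stand-ins for whatever your training phase collects, and ties are broken arbitrarily (alphabetically by label), which you may want to replace with something smarter.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def cyclic_distance(a, b):
    """Smallest edit distance between b and any rotation of a."""
    if not a:
        return len(b)
    return min(edit_distance(a[k:] + a[:k], b) for k in range(len(a)))

def classify_outline(tokens, patterns):
    """Return the label whose learned pattern is closest to `tokens`."""
    candidates = ((cyclic_distance(tokens, p), label)
                  for label, seqs in patterns.items() for p in seqs)
    return min(candidates)[1]

# Hypothetical patterns; a real set would be learned from training images.
patterns = {
    'brick': [['Sm', 'Sm', 'Sm', 'Sm'], ['Sm', 'Sl', 'Sm', 'Sl']],
    'ball': [['Cl']],
    'cylinder': [['Sl', 'Cm', 'Sl', 'Cm']],
}
label = classify_outline(['Cm', 'Sl', 'Cm', 'Sl'], patterns)
```

Trying every rotation costs one edit-distance computation per starting token, which is fine for outlines with a handful of chunks.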
Upvotes: 1