Reputation: 1
I'm stuck with a problem where I want to identify different coffee beans in a mix. I created a neural network which is able to identify different beans individually. But in practice, I want to develop a algorithm where I can detect these beans in a bigger batch. It is not necessary to identify all the beans in the picture, but when i'm able to identify 10-15 beans in a bigger batch, this would be enough.
The problem is now that I'm able to segment the beans when there is just one layer of beans with a contrasting background, but when there are multiple layers of beans below this first layer, it gets really hard.
I tried to use distance transform and the watershed algorithm from openCV and as mentioned, this worked for just single beans and for some small overlap between beans (just as in this example). The picture below shows the results: results of single layer segmentation
My code used was based on the example mention before:
import cv2
import numpy as np
from matplotlib import pyplot as plt
from scipy.ndimage import label
from scipy.ndimage import morphology
# load the image as normal and grayscale
img_path = "FINAL/segmentation/IMG_6699.JPG"
img= cv.imread(img_path,0)
img0 = cv.imread(img_path)
#preprocess the image
img= cv.medianBlur(img,5)
ret,th1 = cv.threshold(img,80,255,cv.THRESH_BINARY_INV)
kernel = np.ones((5,5),np.uint8)
opening = cv2.morphologyEx(th1, cv2.MORPH_OPEN, kernel)
dilation = cv2.dilate(opening, None, iterations=2)
erosion = cv2.erode(dilation,kernel,iterations = 50)
border_nonseg = dilation - cv2.erode(dilation, None, iterations = 1)
#distance transform
#dt = morphology.distance_transform_bf(dilation, metric='chessboard')
dt = cv2.distanceTransform(dilation, 2, 5)
dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(np.uint8)
hier, dt1 = cv2.threshold(dt, 170, 255, cv2.THRESH_BINARY)
# label the centers found by the distance transform
lbl, ncc = label(dt1)
lbl = lbl * (255/ncc)
# Completing the markers now.
lbl[border_nonseg == 255] = 255
lbl = lbl.astype(np.int32)
# Watershed algorithm
cv2.watershed(img0, lbl)
lbl[lbl == -1] = 0
lbl = lbl.astype(np.uint8)
result = 255 - lbl
lbl_cont = result
# Draw the borders
result[result != 255] = 0
result = cv2.dilate(result, None, iterations=1)
img0[result == 255] = (255, 0, 0)
cv2.imwrite("output.png", img0)
contours, _ = cv.findContours(result, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
titles = ['Original Image', 'dilation',
'gradient morph', 'erode']
images = [ border_nonseg, dt, lbl_cont, img0]
plt.figure(figsize=(20,20))
for i in range(4):
plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
plt.title(titles[i])
plt.xticks([]),plt.yticks([])
plt.show()
But the problem starts when there is a picture like this (which is a real situation): multi layer segmentation and harder multi layer segmentation
I don't think that I'm able to reuse the code mentioned before and that I need a different approach. Because the contrast between the first and second layer is just to small, because of the small size of the beans there is not a big shade created by them, which would give a nice contrast and the the color of the beans is also quite dark which doesn't make it easier.
So do you have any suggestion for different approaches to tackle this problem or maybe adjustment the the current code to solve for my problem?
I'm very curious to hear different opinions on this!
Upvotes: 0
Views: 182
Reputation: 850
If I've understood correcly, you just need some image segmentation methods to separate the beans from a big picture into many small pictures with just one bean in it so you can feed them to your NN to train/test it.
With the kind of pictures that you showed, the segmentation is almost an identification by itself. I mean, you would almost need a trained NN to identify the beans in the picture to then separate them and feed them into your non-trained NN.
For these kinds of problems, I believe that there are some NN architectures (non-supervised) that are trained to extract the relevant features for you. I think autoencoders was one of the options, but I'm not sure right now.
The other approach is to use some kind of more general pattern recognition:
a) Shape-Based: They try to match a contour model over the gradient image
b) Correlation-Based: They try to match a sample image over your original image using grayscale correlations
These methods use pyramidal search to increase speed but you might want as well try different pyramid levels of the model for each pyramid level of the image to analyze to cope with different zoomings of the grains, which is equivalent to a matching method with scaling. You will need also several models of your beans (different perspectives of a single bean) to increase the number of results per image.
You could try as well c) a region expansion methodthat expands a seed region to the neighboring pixels under some conditions based on color smoothness, or d) a border finder combined with some contour closing algorithms; but I fear they could cause you many problems based on your images variability.
Upvotes: 0