Reputation: 335
I have to count eggs on a conveyor belt. The eggs can appear in many forms:
small, large, odd-shaped, dirty, cracked, broken, broken and empty inside, broken
with liquid inside, next to a chicken feather, many eggs touching each other,
and even some eggs sitting on top of a group of other eggs.
My challenge is to count the eggs with the utmost accuracy and, if possible, to classify and count the abnormal eggs mentioned above.
I already have a solution running on a Jetson Nano. It counts the eggs by finding contours against a relatively dark background (background subtraction). It does a reasonably good job, although it is slow.
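For readers unfamiliar with this kind of baseline, here is a minimal sketch of the idea, not the actual code running on the Jetson Nano. It swaps contour finding for plain connected-component labeling on a thresholded grayscale grid, and the function name `count_blobs`, the `threshold` value and the `min_area` filter are all assumptions made for illustration.

```python
from collections import deque


def count_blobs(gray, threshold=128, min_area=3):
    """Count bright connected regions in a 2D grid of grayscale values.

    Hypothetical helper: pixels >= threshold are treated as foreground
    (egg) against a dark background; regions smaller than min_area
    pixels are ignored as noise.
    """
    rows, cols = len(gray), len(gray[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or gray[r][c] < threshold:
                continue
            # Breadth-first flood fill over 4-connected bright pixels.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and gray[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                count += 1
    return count
```

Note that two touching eggs form a single connected region here and are counted once, which is exactly the failure mode that makes touching or stacked eggs hard for this kind of approach.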
My question(s):
Now I want to do this with deep-learning models, using object detection and object tracking together in a single algorithm. This effort can still be considered an experiment for me.
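To make "detection plus tracking in a single algorithm" concrete, here is a minimal sketch of one possible arrangement: a detector (not shown) yields bounding boxes per frame, a greedy nearest-centroid tracker links them across frames, and an egg is counted when its track crosses a virtual line. The class name `CentroidCounter`, the `line_x` and `max_dist` parameters, and the assumption that the belt moves left to right are all hypothetical.

```python
import math


class CentroidCounter:
    """Count objects whose tracked centroid crosses a vertical line.

    Assumes boxes are (x1, y1, x2, y2) in pixels and that objects move
    in the +x direction. Tracks that are not matched in a frame are
    dropped immediately, so missed detections are not tolerated.
    """

    def __init__(self, line_x, max_dist=50.0):
        self.line_x = line_x      # x position of the counting line
        self.max_dist = max_dist  # max centroid jump between frames
        self.tracks = {}          # track id -> last centroid (x, y)
        self.next_id = 0
        self.count = 0

    def update(self, boxes):
        """Feed one frame of detections; return the running count."""
        centroids = [((x1 + x2) / 2, (y1 + y2) / 2)
                     for x1, y1, x2, y2 in boxes]
        unmatched = dict(self.tracks)
        new_tracks = {}
        for cx, cy in centroids:
            # Greedy match to the nearest unmatched track within max_dist.
            best_id, best_d = None, self.max_dist
            for tid, (px, py) in unmatched.items():
                d = math.hypot(cx - px, cy - py)
                if d < best_d:
                    best_id, best_d = tid, d
            if best_id is None:
                # No track close enough: start a new one.
                best_id = self.next_id
                self.next_id += 1
            else:
                prev_x = unmatched.pop(best_id)[0]
                # Count once, on the frame where the line is crossed.
                if prev_x < self.line_x <= cx:
                    self.count += 1
            new_tracks[best_id] = (cx, cy)
        self.tracks = new_tracks
        return self.count
```

Counting on line crossings rather than per frame is what keeps the same egg from being counted repeatedly while it stays in view. A production tracker would typically also keep tracks alive through a few missed detections.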
First things first: I need some image sets, so I need some advice on that.
The eggs will always arrive on a conveyor belt together with very similar eggs (in terms of color, shape and size).
What I am not sure about is where and how to take the shots. Do I have to take them in the objects' natural environment/background? And how? Should I take a shot of each possible egg appearance listed above by putting the eggs on the conveyor belt one by one and changing their orientation a bit each time?
Or should I take each shot one by one against a white background, again changing the orientation a bit each time?
A sample appearance from the conveyor belt:
Upvotes: 2
Views: 479
Reputation: 11218
https://github.com/developer0hye/Yolo_Label (works great, but only for Windows)
https://github.com/AlexeyAB/Yolo_mark
https://github.com/heartexlabs/label-studio (this is a more complex annotation tool for many other tasks)
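For orientation, darknet-style training expects one text line per object in the form `class x_center y_center width height`, with all four box values normalized to the image size. The sketch below shows that conversion from a pixel-space box; the helper name `to_yolo_label` is made up for illustration, and the annotation tools above normally write these files for you.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a darknet/YOLO label line.

    Hypothetical helper: returns 'class xc yc w h' with the box center
    and size expressed as fractions of the image width and height.
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: one egg (class 0) in the middle of a 416x416 image.
print(to_yolo_label(0, 156, 156, 260, 260, 416, 416))
```

If abnormal eggs are to be classified as well, each category (for example cracked or dirty) would get its own class id in the first column.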
I would suggest going with darknet YOLO, which is written in C and CUDA. You wouldn't need to write any major code, and it will be fast and accurate.
https://pjreddie.com/darknet/yolo/
Use this repo if you're on Linux https://github.com/pjreddie/darknet
Use this one if you're on Windows https://github.com/AlexeyAB/darknet
https://github.com/zabir-nabil/yolov3-anchor-clustering
https://github.com/zabir-nabil/tf-model-server4-yolov3
About image resolution: the default input dimension for YOLOv3 is (416, 416), which should be enough for your case. You should take images with the same or a similar camera to the one you'll use in the actual deployment environment. A Pi camera should be enough; you can use better cameras too, but in the end all images have to be resized to (416, 416).
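To see what resizing to (416, 416) does to a camera frame, here is a small sketch of letterbox geometry, one common way to fit a non-square image into a square network input without distorting it. Whether a given darknet build letterboxes or simply stretches the image is not stated here, so treat this as an illustration only; `letterbox_params` is a hypothetical helper.

```python
def letterbox_params(img_w, img_h, target=416):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving fit.

    The image is scaled so its longer side equals `target`, then
    centered with padding on the shorter side.
    """
    scale = min(target / img_w, target / img_h)
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# A 1280x720 camera frame shrinks to roughly a third of its size.
print(letterbox_params(1280, 720))
```

The practical point is that an egg occupying 100 pixels in a 1280-pixel-wide frame covers only about 32 pixels at network input, so camera distance matters more than sensor resolution.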
This is a two-class problem, so you need slightly more images for the positive class. Here's a rough estimate of how you can generate the samples. Let's say the range of your egg-counting model will be 0 to 25.
20% of the images should have 0 eggs. The remaining 80% should form a roughly uniform (or flat Gaussian) distribution over the counts 1 to 25. That is, if the 80% amounts to 1000 images, the number of images with 1 egg will be 1000/(range) = 1000/(25-1+1) = 1000/25 = 40, and the same for every other count (2-25).
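The split described above can be turned into a small planning helper. This is only a sketch of the arithmetic; the function name `images_per_count` and its parameters are assumptions, and the uniform split is the simple case rather than the flat Gaussian variant.

```python
def images_per_count(total_images, max_eggs=25, empty_fraction=0.2):
    """Plan how many images to collect for each egg count.

    Hypothetical helper: reserves `empty_fraction` of the images for
    empty frames and spreads the rest evenly over counts 1..max_eggs.
    """
    n_empty = round(total_images * empty_fraction)
    n_positive = total_images - n_empty
    per_count, remainder = divmod(n_positive, max_eggs)
    plan = {0: n_empty}
    for k in range(1, max_eggs + 1):
        # Hand out any leftover images to the lowest counts first.
        plan[k] = per_count + (1 if k <= remainder else 0)
    return plan


# 1250 images in total: 250 empty, 1000 positive, 40 per egg count.
plan = images_per_count(1250)
print(plan[0], plan[1], plan[25])
```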
For brightness, contrast and lighting, go with settings very close to the actual deployment scenario; augmentation will take care of the rest. YOLOv3 is very robust, so you don't need to worry too much about background noise.
The image format makes no major difference. Usually .jpg gives a small file size, which makes storage easier.
Upvotes: 3