Reputation: 6562
I have a dataset of around 20K human-labelled images. The labels are: Label = 1 if the image is sharp and well lit, and Label = 0 if it is blurry, out of focus, or grainy.
The images are of documents such as identity cards.
I want to build a Computer Vision model that can do the classification task.
I tried transfer learning with VGG-16 for this task, but the results were poor (precision = 0.65, recall = 0.73). My sense is that VGG-16 is not suitable here: it was trained on ImageNet, so its low-level features are very different. Interestingly, the model is under-fitting.
We also tried EfficientNet 7. Although the model performed decently on the training and validation sets, its test performance remains poor.
Can someone suggest a more suitable model to try for this task?
Upvotes: -1
Views: 593
Reputation: 163
I would look at the change in color between adjacent pixels, then rank the photos by the median delta between pixels. A jump from RGB (0,0,0) to (255,255,255) between every pair of adjoining pixels would give the maximum possible score; the more blur you have, the lower the score.
I have used this successfully in the past when estimating the areas of fields.
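A minimal sketch of that score, assuming NumPy and an image already loaded as an array (grayscale H x W, or color H x W x 3). The function name `median_adjacent_delta` and the synthetic checkerboard are made up for illustration; they are not from any library.

```python
import numpy as np

def median_adjacent_delta(image: np.ndarray) -> float:
    """Median absolute difference between horizontally and vertically adjacent pixels."""
    img = image.astype(np.float64)       # avoid uint8 wrap-around when subtracting
    dx = np.abs(np.diff(img, axis=1))    # left/right neighbours
    dy = np.abs(np.diff(img, axis=0))    # up/down neighbours
    return float(np.median(np.concatenate([dx.ravel(), dy.ravel()])))

# Synthetic stand-in for a real photo: a 1-pixel checkerboard, where every
# neighbouring pair jumps between 0 and 255.
sharp = ((np.indices((64, 64)).sum(axis=0) % 2) * 255).astype(np.uint8)
# Crude blur: average each pixel with its right-hand neighbour.
blurred = (sharp[:, :-1].astype(np.float64) + sharp[:, 1:]) / 2.0

sharp_score = median_adjacent_delta(sharp)
blurred_score = median_adjacent_delta(blurred)
print(sharp_score, blurred_score)
```

One thing to check on real data: a document photo with large flat areas may have a median delta near zero even when it is sharp, so a higher percentile might separate the classes better. That is a guess to verify, not something tested here.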
Upvotes: 0
Reputation: 114926
I think your problem with VGG and other neural networks is the resizing of the images:
VGG expects a 224x224 image as input. I assume your dataset has a much higher resolution, so you significantly downscale the input images before feeding them to your network.
What happens to blur/noise when you downscale an image?
Blurry and noisy images become sharper and cleaner as you decrease the resolution. Therefore, in many of your training examples, the net sees a perfectly good image while you label it as "corrupt". This is not good for training.
An interesting experiment would be to see which types of degradation your net classifies correctly and which it fails on. You report 65% precision @ 73% recall. Can you look at the classified images at that point and group them by degradation type?
That is, what is the precision/recall for only blurry images? What is it for noisy images? What about grainy images?
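To make the downscaling effect concrete, here is a small NumPy-only sketch on a synthetic stripe pattern (not real document images). The helpers `box_blur`, `downscale` and `edge_strength` are hypothetical names written for this illustration, and block averaging stands in for whatever resize your pipeline actually uses.

```python
import numpy as np

def box_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """k x k box blur built from circularly shifted copies (edges wrap around)."""
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += np.roll(img, (dy - k // 2, dx - k // 2), axis=(0, 1))
    return out / (k * k)

def downscale(img: np.ndarray, f: int) -> np.ndarray:
    """Downscale by an integer factor f using block averaging."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def edge_strength(img: np.ndarray) -> float:
    """Largest jump between horizontally adjacent pixels."""
    return float(np.abs(np.diff(img, axis=1)).max())

# Vertical stripes, 32 pixels wide, alternating black and white.
sharp = np.tile(np.repeat(np.array([0.0, 255.0] * 4), 32), (64, 1))
blurred = box_blur(sharp, k=5)

# How strong are the blurred edges relative to the sharp ones?
ratio_full = edge_strength(blurred) / edge_strength(sharp)
ratio_small = edge_strength(downscale(blurred, 8)) / edge_strength(downscale(sharp, 8))
print(ratio_full, ratio_small)
```

At full resolution the blurred edges are much weaker than the sharp ones; after downscaling by 8 the two images look far more alike, which is the information the network loses.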
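A sketch of that breakdown in plain Python. It assumes you can tag each degraded image with a degradation type, which the original labels do not include, and it treats "degraded" (Label = 0) as the positive class. Both are assumptions for the sake of the example, and the records below are made-up toy data, not results.

```python
def precision_recall(pairs, positive=0):
    """pairs: iterable of (true_label, predicted_label)."""
    pairs = list(pairs)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall

def per_degradation(records):
    """records: (true_label, predicted_label, degradation_type or None for good images).

    For each degradation type, score the model on the good images plus
    only the images with that degradation.
    """
    kinds = sorted({k for _, _, k in records if k is not None})
    results = {}
    for kind in kinds:
        subset = [(t, p) for t, p, k in records if k is None or k == kind]
        results[kind] = precision_recall(subset, positive=0)
    return results

# Toy records: 1 = sharp, 0 = degraded.
records = [
    (1, 1, None), (1, 1, None), (1, 0, None), (1, 1, None),
    (0, 0, "blur"), (0, 0, "blur"), (0, 1, "blur"),
    (0, 1, "noise"), (0, 1, "noise"), (0, 0, "noise"),
]
breakdown = per_degradation(records)
for kind, (prec, rec) in breakdown.items():
    print(f"{kind}: precision={prec:.2f} recall={rec:.2f}")
```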
What can you do?
Upvotes: 1
Reputation: 132
For this task, I think OpenCV is sufficient to solve the issue. In fact, comparing the variance of the Laplacian of the image (cv2.Laplacian(image, cv2.CV_64F).var()) with a threshold will give a decision on whether an image is blurred or not.
You can find an explanation of the method and the code in the following tutorial: detection with opencv
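For reference, a minimal NumPy-only sketch of the same idea, so it runs without OpenCV installed. It applies the 4-neighbour Laplacian kernel to interior pixels only, so the numbers will not match cv2.Laplacian(image, cv2.CV_64F).var() exactly, because the borders are handled differently. The names `laplacian_variance` and `is_blurry` are hypothetical, and the threshold is an assumption that has to be tuned on your own labelled images.

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian over the interior of a grayscale image."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_blurry(gray: np.ndarray, threshold: float) -> bool:
    """Flag an image as blurry when its Laplacian variance falls below the threshold."""
    return laplacian_variance(gray) < threshold

# Synthetic demo: random texture versus a 3x3 box-blurred copy of it.
rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(128, 128)).astype(np.float64)
blurred = sum(
    np.roll(sharp, (dy, dx), axis=(0, 1)) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
) / 9.0

sharp_var = laplacian_variance(sharp)
blurred_var = laplacian_variance(blurred)
# Midpoint threshold, chosen only so this demo separates its two images.
demo_threshold = (sharp_var + blurred_var) / 2.0
print(is_blurry(sharp, demo_threshold), is_blurry(blurred, demo_threshold))
```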
I think that training a classifier that takes the output of one of your neural network models and the variance of the Laplacian as features will improve the classification results.
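A rough sketch of how such a combined classifier could be wired up, using a tiny logistic regression written in NumPy. The two feature columns are synthetic stand-ins for "network output score" and "Laplacian variance"; the data, the function names and the hyperparameters are all assumptions for illustration and say nothing about what you would get on real images.

```python
import numpy as np

def fit_logistic(X: np.ndarray, y: np.ndarray, lr: float = 0.5, steps: int = 2000) -> np.ndarray:
    """Plain gradient-descent logistic regression; returns weights with the bias last."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w

def predict(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Predict 1 where the linear score is positive, else 0."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

# Synthetic features: each one is only weakly informative on its own.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)                    # 1 = sharp, 0 = degraded
cnn_score = 0.3 * y + rng.normal(0.35, 0.2, n)    # stand-in for a network output
lap_feature = 1.0 * y + rng.normal(0.0, 0.5, n)   # stand-in for Laplacian variance

X = np.column_stack([cnn_score, lap_feature])
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise both features

w = fit_logistic(X, y)
accuracy = float((predict(X, w) == y).mean())
print(accuracy)
```

In practice you would fit this on a training split and report the metric on a held-out split; the sketch scores on its own training data only to stay short.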
I also recommend experimenting with ResNet and DenseNet.
Upvotes: 0