Reputation: 159
I've been trying to train my own Haar cascade classifier using OpenCV, but 15 or 20 attempts have all produced too many false positives and false negatives. I've tried several variations (described below), and none of them have worked. What am I doing wrong?
Here is a typical processed image after following the steps below. As you can see, there are only one or two true positives, along with many false positives and many false negatives.
For the attempt I'll describe here, I got the face images from a data set on Kaggle.
For the negative images in this attempt, I used this data set on GitHub.
I put all positive images in one folder and all negative images in another. Since every positive image has the face in exactly the same spot, my text file (positives.txt) describing the positive images looks like this (but with a few thousand lines):
famous_people/Aaron_Eckhart_0001.jpg 1 50 32 141 184
famous_people/Aaron_Guiel_0001.jpg 1 50 32 141 184
famous_people/Aaron_Patterson_0001.jpg 1 50 32 141 184
...
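Because every line uses the same box, a file like this can be generated rather than typed. The following is a minimal sketch, assuming the images sit directly in one folder; the helper names `annotation_line` and `write_annotations` are hypothetical, and the line layout (path, object count, then x y w h) is simply the one shown above:

```python
import os

# Fixed face box (x, y, width, height), taken from the lines above.
FACE_BOX = (50, 32, 141, 184)

def annotation_line(image_path, box=FACE_BOX):
    """Return one info-file line: path, object count, then x y w h."""
    x, y, w, h = box
    # Forward slashes match the paths shown above, even on Windows.
    return f"{image_path.replace(os.sep, '/')} 1 {x} {y} {w} {h}"

def write_annotations(image_dir, out_file, box=FACE_BOX):
    """Write one line per .jpg in image_dir; return the number written."""
    names = sorted(n for n in os.listdir(image_dir) if n.lower().endswith(".jpg"))
    with open(out_file, "w", newline="\n") as f:
        # Joining with "\n" avoids leaving a trailing blank line in the file.
        f.write("\n".join(annotation_line(f"{image_dir}/{n}", box) for n in names))
    return len(names)
```

For example, `write_annotations("famous_people", "positives.txt")` would produce lines in the same form as the ones above.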
Here is what one of the images looks like with a box around his face:
I looked through many images with the above box drawn, and they all have the box drawn like this. (In another attempt, I made the bounding box smaller, but that didn't help.)
I ran this command (in Windows 10, but I'd guess my OS doesn't matter here):
"path_to\opencv_createsamples.exe" -info positives.txt -w 20 -h 20 -num 5000 -vec pos.vec
I am unsure if the "parse errorDone" below is a problem, but the following is the final line of output printed to my terminal by the above command:
positives.txt(4002) : parse errorDone. Created 4001 samples
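One way to find what triggers a parse error like this is to check each line of positives.txt against the expected layout. The sketch below assumes the cause is a malformed line (for example a blank or truncated final line), which is a guess rather than a confirmed diagnosis; `find_bad_lines` is a hypothetical helper, and it assumes image paths contain no spaces:

```python
def find_bad_lines(lines):
    """Return (line_number, reason) for each line that is not shaped like
    'path count x y w h [x y w h ...]'."""
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            problems.append((number, "blank line"))
            continue
        if len(parts) < 2 or not parts[1].isdigit():
            problems.append((number, "missing object count"))
            continue
        count = int(parts[1])
        coords = parts[2:]
        if len(coords) != 4 * count:
            # Each object needs exactly four numbers: x, y, width, height.
            problems.append((number, "expected %d numbers, found %d" % (4 * count, len(coords))))
        elif not all(c.isdigit() for c in coords):
            problems.append((number, "non-integer coordinate"))
    return problems
```

It could be run as `find_bad_lines(open("positives.txt").read().splitlines())`; an empty list means every line has the expected shape.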
Also, my text file (github_negatives.txt) describing the negative images looks like this:
haartraining-master/data/negatives/neg-0002.jpg
haartraining-master/data/negatives/neg-0003.jpg
...
I then ran this command (where I save the trained cascade in the "cascade" folder):
"path_to\opencv_traincascade.exe" -data cascade -vec pos.vec -bg github_negatives.txt -w 20 -h 20 -numPos 500 -numNeg 2000 -numStages 8
Here is the program I ran to test what I did, after training the classifier:
import os

import cv2 as cv

cascade_faces = cv.CascadeClassifier('cascade/cascade.xml')

def process_image(file_name):
    img = cv.imread(file_name)
    if img is None:  # skip files that OpenCV cannot read as images
        return
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    rectangles = cascade_faces.detectMultiScale(gray, scaleFactor=1.05, minNeighbors=6)
    green = (0, 255, 0)
    # Draw on the colour image: on the single-channel gray image only the
    # first component of the colour (0) is used, so the boxes come out black.
    for x, y, w, h in rectangles:
        cv.rectangle(img, (x, y), (x + w, y + h), green, 2)
    # Show the image even when nothing was detected.
    cv.imshow("processed image", img)
    cv.waitKey(0)

root_path = "path_to/group_images"
for path, directories, files in os.walk(root_path):
    for file in files:
        process_image(os.path.join(path, file))
I found the group images I'm testing with by searching online for pictures of groups of people.
I've read or watched multiple tutorials and resources online, including Stack Overflow (such as here and here), and I've spent time in the official OpenCV documentation.
I have tried numerous variations of the attempt described above. These include passing -h 24 -w 24 on the command line, changing the number of training stages, and changing the scaleFactor and minNeighbors parameters of detectMultiScale. I've even used the vec file trainingfaces_24-24.vec that came with my installation of OpenCV. I've tried training with 2000 positive images and 2000 negative images. In a completely separate set of attempts, I tried to train a cascade classifier to detect the side view of a car, but those attempts also failed. (For the car problem I used a clean data set found online that included both positive and negative images.)
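Varying the two detectMultiScale parameters by hand gets tedious, so a small sweep can help compare settings side by side. This is only a sketch: `sweep` is a hypothetical helper that takes any callable accepting `scaleFactor` and `minNeighbors` keyword arguments, and the raw detection count it reports says nothing about which detections are correct.

```python
from itertools import product

def sweep(detect, scale_factors, min_neighbors_values):
    """Call detect(scaleFactor=..., minNeighbors=...) for every combination
    and return a dict mapping (scaleFactor, minNeighbors) to detection count."""
    counts = {}
    for scale_factor, min_neighbors in product(scale_factors, min_neighbors_values):
        found = detect(scaleFactor=scale_factor, minNeighbors=min_neighbors)
        counts[(scale_factor, min_neighbors)] = len(found)
    return counts
```

With the test program above, it could be called as `sweep(lambda **kw: cascade_faces.detectMultiScale(gray, **kw), [1.05, 1.1, 1.2], [3, 6, 10])` on one grayscale test image.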
Also, it seems weird to me that training has usually taken less than 1 minute, since I've read in multiple places that training can, depending on the problem, sometimes take hours or days.
As requested in a comment below, here is the terminal output for each training stage. Training did not complete all 8 stages:
PARAMETERS:
cascadeDirName: cascade
vecFileName: pos.vec
bgFileName: github_negatives.txt
numPos: 500
numNeg: 2000
numStages: 8
precalcValBufSize[Mb] : 1024
precalcIdxBufSize[Mb] : 1024
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 20
sampleHeight: 20
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: BASIC
Number of unique features given windowSize [20,20] : 78460
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 500 : 500
NEG count : acceptanceRatio 2000 : 1
Precalculation time: 2.334
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 0.996| 0.24|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 6 seconds.
===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 500 : 502
NEG count : acceptanceRatio 2000 : 0.262743
Precalculation time: 2.307
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 0.996| 0.4545|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 13 seconds.
===== TRAINING 2-stage =====
<BEGIN
POS count : consumed 500 : 504
NEG count : acceptanceRatio 2000 : 0.176663
Precalculation time: 2.287
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 0.996| 0.3065|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 19 seconds.
===== TRAINING 3-stage =====
<BEGIN
POS count : consumed 500 : 506
NEG count : acceptanceRatio 2000 : 0.0594283
Precalculation time: 2.263
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 0.998| 0.451|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 27 seconds.
===== TRAINING 4-stage =====
<BEGIN
POS count : consumed 500 : 507
NEG count : acceptanceRatio 2000 : 0.0728518
Precalculation time: 2.412
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 0.998| 0.707|
+----+---------+---------+
| 5| 0.998| 0.4|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 37 seconds.
===== TRAINING 5-stage =====
<BEGIN
POS count : consumed 500 : 508
NEG count : acceptanceRatio 2000 : 0.0236496
Precalculation time: 2.344
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 0.998| 0.699|
+----+---------+---------+
| 5| 0.996| 0.576|
+----+---------+---------+
| 6| 0.996| 0.468|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 48 seconds.
===== TRAINING 6-stage =====
<BEGIN
POS count : consumed 500 : 510
NEG count : acceptanceRatio 2000 : 0.00649212
Precalculation time: 1.877
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 1| 0.6505|
+----+---------+---------+
| 5| 1| 0.458|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 57 seconds.
===== TRAINING 7-stage =====
<BEGIN
POS count : consumed 500 : 510
NEG count : acceptanceRatio 53 : 0.00383391
Required leaf false alarm rate achieved. Branch training terminated.
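The final message appears to be consistent with how opencv_traincascade decides it is finished: as far as I understand it, training stops once the acceptance ratio falls to maxFalseAlarmRate raised to the power numStages (divided by maxDepth) or below. The arithmetic for the parameters printed above can be sketched as follows; the function name is hypothetical:

```python
def required_leaf_false_alarm(max_false_alarm_rate, num_stages, max_depth=1):
    """Overall false-alarm target: the per-stage rate compounded over all stages."""
    return max_false_alarm_rate ** num_stages / max_depth

target = required_leaf_false_alarm(0.5, 8)
print(target)                # 0.00390625
# The stage 7 acceptance ratio from the log is already below the target.
print(0.00383391 <= target)  # True
```

In other words, with maxFalseAlarmRate 0.5 and numStages 8 the trainer only aims for roughly 1 false alarm in 256 windows, which it reached after seven stages.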
Upvotes: 3
Views: 968
Reputation: 159
I was able to improve the face detector somewhat so that it doesn't fail as miserably.
First, I used more positive images and more negative images. I then used the following command to train the cascade:
"path_to\opencv_traincascade.exe" -data cascade -vec pos.vec -bg github_negatives.txt -w 20 -h 20 -numPos 6000 -numNeg 3000 -numStages 12 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.5 -mode ALL -mem 2000
For this attempt, I also used smaller bounding boxes on the positive images. (As stated in my question, this change by itself wasn't enough to help.)
In the following image, you can see there are far fewer false positives:
And here is another test image where it detected over half of the faces in the image:
I'm guessing that even more positive images would help. I would also guess that adding negative images of people, cropped to show only the area below the neck, would help. Beyond these two changes, I'm not sure what else to try.
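The below-the-neck idea could be sketched as computing, for each annotated image, the strip that lies under the face box and saving that strip as a negative. This is untested as a training strategy; `below_face_region` is a hypothetical helper, and the 250x250 image size in the usage line is an assumption, not something I measured.

```python
def below_face_region(image_size, face_box):
    """Return (x, y, w, h) of the full-width strip below the face box,
    or None if the box already reaches the bottom of the image."""
    img_w, img_h = image_size
    _x, y, _w, h = face_box
    top = y + h  # first row below the face box
    if top >= img_h:
        return None
    return (0, top, img_w, img_h - top)

# Assumed 250x250 image with the box used in positives.txt.
print(below_face_region((250, 250), (50, 32, 141, 184)))  # (0, 216, 250, 34)
```

With OpenCV, the returned rectangle could then be cut out with array slicing, `img[y:y + h, x:x + w]`, and written to the negatives folder.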
Upvotes: 0