Reputation: 41
I am trying to write a face-recognition program in Python (I am going to apply the k-NN algorithm for classification).
First, I converted the images to greyscale, and then I created a long column vector from each image's pixels (128x128 = 16384 features in total) using OpenCV's imagedata function.
So I got a dataset like the following (the last column is the class label, and I show only the first 7 of the 16384 features):
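In case it helps, the conversion step looks roughly like this (a sketch using the modern cv2 API rather than the old imagedata interface I actually used; the filename is made up):

    import cv2
    import numpy as np

    img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # load as greyscale
    img = cv2.resize(img, (128, 128))                   # enforce a fixed 128x128 size
    features = img.flatten().astype(np.float32)         # 16384-element feature vector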
176, 176, 175, 175, 177, 173, 178, 1
162, 161, 167, 162, 167, 166, 166, 2
But when I apply k-NN to this dataset, I get poor results. Do I need to apply additional processing to this dataset, rather than just converting the images to a raw pixel representation?
Thanks.
Upvotes: 4
Views: 2683
Reputation: 21
Usually, a face recognition pipeline needs several stages in order to be effective.

Some degree of geometric normalization is critical to accuracy. You either need to manually label fiducial points and compute a transform for each image, or detect the fiducial points automatically, for which there are open-source detectors. Try OpenCV's getAffineTransform function.

Lighting discrepancies can also cause huge problems. You might try lighting normalization techniques (e.g., the self-quotient image), as they work well for diffuse reflection and shadows (not so well for specular reflection).

For dimensionality reduction, principal component analysis (PCA) or linear discriminant analysis (LDA) are good places to start. Rather than raw pixel features, though, you might consider more meaningful features such as LBP, HOG, or SIFT.

Finally, you will be able to attain higher accuracy than k-NN with more sophisticated (although more complicated) classifiers such as SVMs.
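A minimal sketch of the geometric normalization step, assuming you already have three fiducial points per face (the coordinates below are illustrative, not real detections):

    import cv2
    import numpy as np

    img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input face

    # Three detected fiducial points (left eye, right eye, mouth centre) and
    # the fixed template positions they should map to; values are made up.
    src = np.float32([[38, 45], [90, 45], [64, 100]])
    dst = np.float32([[40, 48], [88, 48], [64, 96]])

    M = cv2.getAffineTransform(src, dst)          # 2x3 affine matrix
    aligned = cv2.warpAffine(img, M, (128, 128))  # geometrically normalized face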
Upvotes: 2
Reputation: 12142
You will probably need the eyes, the tip of the nose, and the mouth aligned.
You will probably also need a more sophisticated image representation. For example, gradient direction and the self-quotient image would be good starting points.
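For instance, a gradient-direction representation might be computed like this (a rough sketch; the Sobel-based approach and the filename are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
    img = np.float32(img)

    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)  # horizontal derivative
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)  # vertical derivative
    direction = np.arctan2(gy, gx)         # gradient direction at each pixel
    features = direction.flatten()         # use in place of raw intensities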
Upvotes: 1
Reputation: 61044
If you want it to work well, yes, you need to do a feature transformation.
PCA and LDA both work well. PCA takes a collection of input vectors (in this case, your vectorized images) and finds the Eigenfaces that span the set of inputs. Then, during testing, you project your input vector (i.e., your image) onto this set of Eigenfaces and use the resulting coordinate vector as your feature vector. For more information, see [Turk and Pentland, 1991].
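A minimal sketch of that projection step, here using scikit-learn's PCA as an assumed implementation (random arrays stand in for real face data):

    import numpy as np
    from sklearn.decomposition import PCA

    # X holds one vectorized 128x128 training face per row.
    X = np.random.rand(100, 16384)  # placeholder for a real training set
    pca = PCA(n_components=50)      # keep the 50 leading Eigenfaces
    pca.fit(X)

    x = np.random.rand(1, 16384)    # a vectorized test image
    coords = pca.transform(x)       # its 50-d coordinates; feed these to k-NN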
My personal experiments using this basic PCA method on the PIE database were successful.
Upvotes: 1
Reputation: 3305
How do you produce this representation? Did you try using the reshape function? It converts a 2D image into a 1D vector, with or without multiple channels.
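For example, with NumPy (a minimal sketch):

    import numpy as np

    img = np.zeros((128, 128), dtype=np.uint8)  # a 2D greyscale image
    vec = img.reshape(-1)                       # flatten to a 16384-long vector
    back = vec.reshape(128, 128)                # recover the 2D layout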
Also, an image's raw pixels aren't good features. There can be many different objects behind the face: curtains, books, other faces, and so on. Features like the boundary of the face or the distance between the eyes are more invariant to such clutter.
Upvotes: 0