Reputation: 1587
The FaceNet algorithm (described in this article) uses a convolutional neural network to represent an image in a 128-dimensional Euclidean space.
While reading the article I didn't understand:
How are the triplets chosen?
2.1. How do I know a negative image is hard?
2.2. Why am I using the loss function to determine the negative image?
2.3. When do I check my images for hardness with respect to the anchor? I believe that is before I send a triplet to be processed by the network, right?
Upvotes: 4
Views: 728
Reputation: 17191
Here are some answers that may clarify your doubts:
Here too the weights are adjusted to minimise the loss; it is just that the loss term is a little more complicated. The loss has two parts: the first is the distance between an image of a person and a different image of the same person, and the second is the distance between the image of that person and an image of a different person. (In the FaceNet paper the second distance is subtracted from the first and a margin is added.) We want the first part to be smaller than the second, and the loss equation in essence captures that. So you basically want to adjust the weights such that the same-person error is small and the different-person error is large.
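To make the two parts concrete, here is a minimal NumPy sketch of the triplet loss for a single triplet of embeddings. The function name `triplet_loss` and the default `margin=0.2` are illustrative assumptions, not taken from any particular implementation.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, margin=0.2):
    """Triplet loss for one (anchor, positive, negative) embedding triple.

    margin is an assumed illustrative value, not a tuned setting.
    """
    f_a, f_p, f_n = (np.asarray(v, dtype=float) for v in (f_a, f_p, f_n))
    d_pos = np.sum((f_a - f_p) ** 2)  # squared distance to the same person
    d_neg = np.sum((f_a - f_n) ** 2)  # squared distance to a different person
    # The loss is zero once the negative is further away than the
    # positive by at least the margin.
    return max(d_pos - d_neg + margin, 0.0)
```

A triplet whose negative is already far away contributes no loss, so only triplets that violate the margin move the weights.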
The loss term involves three images: the image in question (the anchor) x_a, its positive pair x_p, and its negative pair x_n. The hardest positive of x_a is the image of the same person that is furthest from x_a among all the positive images. The hardest negative of x_a is the closest image of a different person. So you want to pull the furthest positives closer to the anchor and push the closest negatives further away. This is captured in the loss equation.
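As a rough illustration of what "hardest" means, the sketch below picks the hardest positive and hardest negative for one anchor from a set of already-computed embeddings. The helper name `hardest_pair` and its arguments are hypothetical, and it assumes the anchor has at least one other image of the same person and at least one image of a different person.

```python
import numpy as np

def hardest_pair(embeddings, labels, anchor_idx):
    """Return (hardest_positive_idx, hardest_negative_idx) for one anchor."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Squared Euclidean distance from the anchor to every embedding.
    d = np.sum((embeddings - embeddings[anchor_idx]) ** 2, axis=1)
    same = labels == labels[anchor_idx]
    same[anchor_idx] = False  # the anchor is not its own positive
    diff = labels != labels[anchor_idx]
    hardest_pos = np.where(same, d, -np.inf).argmax()  # furthest same-person image
    hardest_neg = np.where(diff, d, np.inf).argmin()   # closest other-person image
    return int(hardest_pos), int(hardest_neg)
```

Note that hardness is measured on the embeddings the network currently produces, so it changes as the weights change.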
FaceNet selects its triplets during training (online). Within each minibatch (sampled so that around 40 faces per identity are present), they select the hardest negative for each anchor and, instead of choosing only the hardest positive image, they use all anchor-positive pairs within the batch.
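The online selection described above can be sketched as follows: embed the minibatch with the current network, then build triplets from every anchor-positive pair plus the closest negative for that anchor. This is a simplified illustration under stated assumptions, not the paper's code; `mine_triplets` is a hypothetical name, and `embeddings` is assumed to be the network's output for the batch.

```python
import numpy as np

def mine_triplets(embeddings, labels):
    """Build (anchor, positive, negative) index triplets within one batch."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances, shape (batch, batch).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sum(diffs ** 2, axis=2)
    triplets = []
    for a in range(len(labels)):
        negatives = np.flatnonzero(labels != labels[a])
        if negatives.size == 0:
            continue  # no other identity in the batch for this anchor
        # Hardest negative: closest image of a different person.
        n = int(negatives[np.argmin(dist[a, negatives])])
        # Use every anchor-positive pair, not only the hardest positive.
        for p in np.flatnonzero(labels == labels[a]):
            if p != a:
                triplets.append((a, int(p), n))
    return triplets
```

This also answers when hardness is checked: the batch is first passed through the network to get embeddings, and the triplets are chosen from those embeddings before the loss is computed.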
If you are looking to implement face recognition, you should consider this paper, which implements centre loss; it is much easier to train and has been shown to perform better.
Upvotes: 3