mido

Reputation: 69

Predicting unknown faces in face recognition

I am trying to implement a convolutional neural network to recognize faces. The issue is that I want to train on 10 classes, and be able to predict more than 10 classes at test time (e.g. 20 classes).

How can I do that without hurting the test accuracy on the classes the network was originally trained on? Right now I get a very low test accuracy, and sometimes 0.


Here is my code.

import numpy as np
import tensorflow as tf

batch_size = 16
patch_size = 5
depth = 16
num_hidden = 128
num_labels = 12
num_channels = 1 

def reformat(dataset, labels):
  dataset = dataset.reshape(
    (-1, IMAGE_SIZE_H, IMAGE_SIZE_W, num_channels)).astype(np.float32)
  labels = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
  return dataset, labels

def accuracy(predictions, labels):

  return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
          / predictions.shape[0])   


graph = tf.Graph()

with graph.as_default():

  # Input data.

  tf_train_dataset = tf.placeholder(
    tf.float32, shape=(batch_size, IMAGE_SIZE_H, IMAGE_SIZE_W, num_channels))
  print("tf_train_dataset",tf_train_dataset)
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  layer1_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, num_channels, depth], stddev=0.1))
  layer1_biases = tf.Variable(tf.zeros([depth]))

  layer2_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, depth, depth], stddev=0.1))
  layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))

  layer3_weights = tf.Variable(tf.truncated_normal(
      [IMAGE_SIZE_H // 16 * IMAGE_SIZE_W // 16 * depth, num_hidden], stddev=0.1))
  layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))

  layer4_weights = tf.Variable(tf.truncated_normal(
      [num_hidden, num_labels], stddev=0.1))

  layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))

  # Model.
  def model(data):
    conv_1 = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
    hidden_1 = tf.nn.relu(conv_1 + layer1_biases)
    pool_1 = tf.nn.max_pool(hidden_1,ksize = [1,2,2,1], strides= [1,2,2,1],padding ='SAME' )
    conv_2 = tf.nn.conv2d(pool_1, layer2_weights, [1, 2, 2, 1], padding='SAME')
    hidden_2 = tf.nn.relu(conv_2 + layer2_biases)
    pool_2 = tf.nn.max_pool(hidden_2,ksize = [1,2,2,1], strides= [1,2,2,1],padding ='SAME' )

    shape = pool_2.get_shape().as_list()
    reshape = tf.reshape(pool_2, [shape[0], shape[1] * shape[2] * shape[3]])
    hidden_3 = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    return tf.matmul(hidden_3, layer4_weights) + layer4_biases

  # Training computation.

  logits = model(tf_train_dataset)
  loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))


  # Optimizer.

  optimizer = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

  # Predictions for the training, validation, and test data.

  train_prediction = tf.nn.softmax(logits)
  valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
  test_prediction = tf.nn.softmax(model(tf_test_dataset))

num_steps = 201

with tf.Session(graph=graph) as session:

    tf.initialize_all_variables().run()  
    print('Initialized')
    for step in range(num_steps):
      offset = (step * batch_size) % (train_labels.shape[0] - batch_size) 
      batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
      batch_labels = train_labels[offset:(offset + batch_size), :]
      feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
      _, l, predictions = session.run(
        [optimizer, loss, train_prediction ], feed_dict=feed_dict)
      if (step % 50 == 0):
        print('Minibatch loss at step %d: %f' % (step, l))
        print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
        print('Validation accuracy: %.1f%%' % accuracy(
          valid_prediction.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels[:,0:9]))

Upvotes: 0

Views: 1169

Answers (1)

Olivier Moindrot

Reputation: 28198

It is not possible to train a model with cross entropy on a fixed number of classes (10 identities in your case) and then test it on a different set of classes (e.g. the 10 training identities plus 10 new ones, for a total of 20).

In any case, you would need to get rid of the last softmax layer (of shape [num_hidden, 10]) and replace it with a new, untrained softmax layer of shape [num_hidden, 20].

The issue is that this new layer will be randomly initialized and will yield very bad results.
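For concreteness, here is a hypothetical sketch of what that replacement would look like in the question's code (num_test_labels and the new_* names are made up for illustration); the point is that these new variables start from random values:

num_test_labels = 20  # 10 training identities + 10 new ones

# Hypothetical replacement of layer4: a fresh, randomly initialized softmax
# layer of shape [num_hidden, 20]. Without labeled data for the new
# identities to retrain it, its predictions are essentially random.
new_layer4_weights = tf.Variable(tf.truncated_normal(
    [num_hidden, num_test_labels], stddev=0.1))
new_layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_test_labels]))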


The general way to deal with unknown classes in deep learning is to build a very good representation of each input (the face) in a feature space (of size num_hidden). This technique is called representation learning.

Imagine you have an excellent feature space to which your model maps faces. In theory, all the faces of one identity will be mapped to the same location, forming a clean cluster. You can then run k-means (with k equal to the number of test identities, here k=20), or any other clustering algorithm, on top of that embedding to recover the identity clusters.
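Here is a minimal sketch of that pipeline in the style of your code. embedding_model() is a hypothetical variant of your model() that stops at hidden_3, so it outputs a (batch, num_hidden) feature vector, and scikit-learn's KMeans is assumed for the clustering step:

from sklearn.cluster import KMeans

# Hypothetical variant of model() that returns the last hidden layer
# instead of the logits, i.e. the num_hidden-dimensional embedding.
def embedding_model(data):
  conv_1 = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
  hidden_1 = tf.nn.relu(conv_1 + layer1_biases)
  pool_1 = tf.nn.max_pool(hidden_1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
  conv_2 = tf.nn.conv2d(pool_1, layer2_weights, [1, 2, 2, 1], padding='SAME')
  hidden_2 = tf.nn.relu(conv_2 + layer2_biases)
  pool_2 = tf.nn.max_pool(hidden_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
  shape = pool_2.get_shape().as_list()
  reshape = tf.reshape(pool_2, [shape[0], shape[1] * shape[2] * shape[3]])
  return tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)

# Built inside `with graph.as_default():` so it reuses the trained weights.
test_embeddings_op = embedding_model(tf_test_dataset)

# After training, inside the session:
test_embeddings = test_embeddings_op.eval()           # shape (num_test, num_hidden)
kmeans = KMeans(n_clusters=20).fit(test_embeddings)   # k = number of test identities
cluster_ids = kmeans.labels_                          # one cluster id per test face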


There are multiple ways to get a good embedding. You can take the last hidden layer before your softmax, or even the last hidden layer of an already trained model (VGGFace gives very good results and its weights are freely available).

Another interesting idea, also developed in the VGGFace paper, is to use a triplet loss to fine-tune the embedding.
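A minimal sketch of such a triplet loss in the same old-style TensorFlow, assuming the embedding_model() sketched above and a made-up margin hyperparameter; real recipes typically also L2-normalize the embeddings and mine informative triplets, which is omitted here:

def triplet_loss(anchor, positive, negative, margin=0.2):
  # Squared Euclidean distance between the anchor and the positive /
  # negative embedding, one value per triplet in the batch.
  pos_dist = tf.reduce_sum(tf.square(anchor - positive), 1)
  neg_dist = tf.reduce_sum(tf.square(anchor - negative), 1)
  # Hinge loss: the positive should be closer than the negative by `margin`.
  return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))

# Usage sketch: embed three image batches (anchor / same identity / different
# identity) with the same network and minimize the triplet loss.
anchor_images = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_SIZE_H, IMAGE_SIZE_W, num_channels))
positive_images = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_SIZE_H, IMAGE_SIZE_W, num_channels))
negative_images = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_SIZE_H, IMAGE_SIZE_W, num_channels))

triplet_objective = triplet_loss(embedding_model(anchor_images),
                                 embedding_model(positive_images),
                                 embedding_model(negative_images))
triplet_optimizer = tf.train.GradientDescentOptimizer(0.001).minimize(triplet_objective)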

Upvotes: 1
