Reputation: 1078
So, say I have a data set with photos of myself, and what I want to do is train a neural network to recognize whether or not it's me in a given image.
But to train a neural network I need at least two classes, so I must have photos of 'myself' (which I already have) and also photos of 'not myself', and I don't know what those should be.
So what I want to know is: what are the photos of 'not me'? Are they just random photos that don't contain me? I've tried that, and it didn't work.
Now, I know there are similar questions to mine on Stack Overflow, but none of them have answers that would help me solve my problem.
Here is some code.
I use a pretrained model for better image recognition:
from tensorflow.keras.applications.inception_v3 import InceptionV3

pre_trained_model = InceptionV3(input_shape = (150, 150, 3),
                                include_top = False,
                                weights = None)
pre_trained_model.load_weights('img_model.h5')

for layer in pre_trained_model.layers:
    layer.trainable = False

last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output
and here is my model declaration:
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import RMSprop

# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
and here I train my model:
history = model.fit(
    train_generator,
    validation_data = validation_generator,
    epochs = 2,
    verbose = 2)
and finally, I test the network myself:
import numpy as np
from tensorflow.keras.preprocessing import image

img = image.load_img('imgs/some_img_of_me.jpg', target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
classes = model.predict(x)
print(classes)
and this is the result of the above code:
[[0.]]
The problem is that the model always returns [[0.]], regardless of whether or not it is me in the given image.
So I would like to know: what should the photos of 'not me' be, and why does my model always predict 0?
Upvotes: 1
Views: 665
Reputation: 438
It is extremely difficult to do any image recognition without neural networks, so in that respect, you're doing it correctly.
However, for most image recognition problems, convolutional layers are a good idea, as they were originally designed to model the neural pathways connected to the optic nerve. It is also worth checking how many 'you' versus 'not you' images you are feeding the network: with a heavily imbalanced data set, the network can end up predicting only one class.
A good option for the 'not you' images would be photos of people who aren't you, plus a few photos of other random things.
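A rough sanity check for that balance (assuming train_generator was built with ImageDataGenerator().flow_from_directory, so it exposes classes and class_indices) could look like this:

import numpy as np

# One integer label per training sample, in generator order.
counts = np.bincount(train_generator.classes)
# class_indices maps class names to those integer labels.
print(dict(zip(train_generator.class_indices, counts)))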
Upvotes: 1
Reputation: 1630
First of all, face recognition is not handled as a regular classification problem.
Initially, you might have tens of thousands of photos of thousands of identities. You first train a regular classifier on these, which means the network has one output node per identity. Suppose you feed in a photo of Matt Damon: the output for that instance should be 1 at the is_matt_damon node and 0 at every other node. In this way you train the network on tens of thousands of instances.
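As a minimal sketch of that idea, reusing pre_trained_model and last_output from the question's code (the 1000-identity count and layer sizes here are made up for illustration), the network ends in a softmax over all identities instead of a single sigmoid:

from tensorflow.keras import layers, Model

num_identities = 1000  # hypothetical number of people in the training set

# One output node per identity, trained as an ordinary classifier.
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
outputs = layers.Dense(num_identities, activation='softmax')(x)

identity_model = Model(pre_trained_model.input, outputs)
identity_model.compile(optimizer = 'rmsprop',
                       loss = 'categorical_crossentropy',
                       metrics = ['accuracy'])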
When training is over, the final layer of the network is dropped, and the earlier layers are used to represent images: even if you never fed a picture of Leonardo DiCaprio during training, the network still produces a vector of outputs at those earlier layers. This is called a representation (or embedding).
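A sketch of that step, building on the hypothetical identity_model above: cut the network just before the softmax and use the output of the previous layer as the representation (the 1/255 rescaling is an assumption about how the training images were preprocessed):

import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image

# Drop the softmax: the penultimate Dense layer's output is the embedding.
embedding_model = Model(identity_model.input, identity_model.layers[-2].output)

def get_embedding(path):
    img = image.load_img(path, target_size=(150, 150))
    x = image.img_to_array(img) / 255.0  # assuming training used 1/255 rescaling
    x = np.expand_dims(x, axis=0)
    return embedding_model.predict(x)[0]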
When you feed two different photos of DiCaprio, you get two different representations, and these two representations are expected to be a small distance apart. Similarly, if you feed a Matt Damon photo and a DiCaprio photo as a pair, their representations should be far apart.
Euclidean distance or cosine similarity can be used to measure the distance between representations (vectors).
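For example, with two embeddings from the hypothetical get_embedding helper above ('imgs/another_photo.jpg' is a made-up path, and a real decision threshold would have to be tuned on a validation set):

import numpy as np

emb_a = get_embedding('imgs/some_img_of_me.jpg')
emb_b = get_embedding('imgs/another_photo.jpg')

# Small Euclidean distance / high cosine similarity suggests the same identity.
euclidean = np.linalg.norm(emb_a - emb_b)
cosine = np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))

print('euclidean distance:', euclidean)
print('cosine similarity:', cosine)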
Upvotes: 2