Reputation: 2122
I was trying to solve the Dogs vs. Cats Redux: Kernels Edition problem on Kaggle. It is a simple image classification problem. However, I am doing worse than a random predictor, with a log loss score of 17+. Does anyone know why this might be?
Neural Network Model
def convolutional_neural_network():
    weights = {
        # 3x3 conv kernel, 3 input channels => 8 feature maps
        'conv1': tf.Variable(tf.random_normal([3, 3, 3, 8])),
        # 5x5 conv kernel, 8 input channels => 16 feature maps
        'conv2': tf.Variable(tf.random_normal([5, 5, 8, 16])),
        # 3x3 conv kernel, 16 input channels => 32 feature maps
        'conv3': tf.Variable(tf.random_normal([3, 3, 16, 32])),
        # fully connected layer: flattened pool3 => output_features
        'out': tf.Variable(tf.random_normal([(SIZE//16)*(SIZE//16)*32, output_features]))
    }
    biases = {
        'conv1': tf.Variable(tf.random_normal([8])),
        'conv2': tf.Variable(tf.random_normal([16])),
        'conv3': tf.Variable(tf.random_normal([32])),
        'out': tf.Variable(tf.random_normal([output_features]))
    }
    # Three conv/relu/pool stages; the pools downsample by 4*2*2 = 16 overall,
    # which is why the flattened size below is (SIZE//16)*(SIZE//16)*32.
    conv1 = tf.add(conv2d(input_placeholder, weights['conv1'], 1), biases['conv1'])
    relu1 = relu(conv1)
    pool1 = maxpool2d(relu1, 4)
    conv2 = tf.add(conv2d(pool1, weights['conv2'], 1), biases['conv2'])
    relu2 = relu(conv2)
    pool2 = maxpool2d(relu2, 2)
    conv3 = tf.add(conv2d(pool2, weights['conv3'], 1), biases['conv3'])
    relu3 = relu(conv3)
    pool3 = maxpool2d(relu3, 2)
    # Flatten for the fully connected output layer.
    pool3 = tf.reshape(pool3, shape=[-1, (SIZE//16)*(SIZE//16)*32])
    output = tf.add(tf.matmul(pool3, weights['out']), biases['out'])
    return output
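The conv2d, relu, and maxpool2d wrappers and the two placeholders are not shown above; a minimal sketch of what they presumably look like in TF 1.x, assuming 'SAME' padding:

import tensorflow as tf

SIZE = 128               # input images are 128x128x3 (from the question)
output_features = 2      # cat vs. dog

# Placeholders assumed by the model (names taken from the question's code).
input_placeholder = tf.placeholder(tf.float32, [None, SIZE, SIZE, 3])
output_placeholder = tf.placeholder(tf.float32, [None, output_features])

def conv2d(x, W, stride):
    # 'SAME' padding keeps spatial dimensions, so only the pools shrink the maps.
    return tf.nn.conv2d(x, W, strides=[1, stride, stride, 1], padding='SAME')

def relu(x):
    return tf.nn.relu(x)

def maxpool2d(x, k):
    # k x k pooling with stride k: the three pools give 4 * 2 * 2 = 16,
    # hence SIZE // 16 in the fully connected layer.
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')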
The output layer has no activation function, since tf.nn.softmax_cross_entropy_with_logits applies the softmax internally.
Prediction, Optimizer and Loss Function
output_prediction = convolutional_neural_network()
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=output_prediction, labels=output_placeholder))
trainer = tf.train.AdamOptimizer()
optimizer = trainer.minimize(loss)
test_prediction = tf.nn.softmax(output_prediction)
The images are converted into a NumPy array of size 128x128x3 and fed into the neural network with a batch size of 64.
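For context, the feed loop looks roughly like this (a minimal sketch; next_batch is a hypothetical helper, and epochs and train_data are assumed to be defined elsewhere):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(epochs):
        # next_batch is hypothetical; it yields (64, 128, 128, 3) image arrays
        # and matching one-hot label arrays of shape (64, 2).
        for batch_images, batch_labels in next_batch(train_data, batch_size=64):
            sess.run(optimizer, feed_dict={input_placeholder: batch_images,
                                           output_placeholder: batch_labels})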
Edit: I ran the same code for 200 epochs. No improvement; it did slightly worse.
Upvotes: 1
Views: 1527
Reputation: 424
This is more of a comment, but I don't have enough reputation points for that:
Did you normalize your data (i.e. divide the pixel values by 255)? I can't see you doing that in the script.
When you get terrible results like a log loss of 17, it means your model is always predicting one class with 100% confidence (with the usual clipping of predictions at about 1e-15, the wrong half of the predictions each contribute -log(1e-15) ≈ 34.5, which averages out to roughly 17). Usually in this case it's not the architecture, the learning rate, or the number of epochs, but rather some silly mistake like forgetting to normalize or mixing up your labels. For this particular problem and given your architecture, you should see an accuracy of about 80% and a log loss of about 0.4 within 40 epochs. No need for thousands of epochs :)
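For example, normalization is a one-line fix (a sketch, assuming the images are loaded as a uint8 NumPy array):

import numpy as np

# Assuming `images` is a uint8 array of shape (N, 128, 128, 3):
images = images.astype(np.float32) / 255.0  # scale pixel values to [0, 1]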
Upvotes: 3
Reputation: 21
Improving accuracy is more of an art than a one-step task; you can try some of these methods:
Upvotes: 1