Reputation: 41
I'm trying to build my first TensorFlow model, but I'm running into a problem. Training appears to run correctly, but at prediction time the model returns (almost) always the same value. Here's the code:
import numpy as np
import tensorflow as tf

# Assumed from the data file name (muchdata-50-50-20.npy); not shown in the original snippet.
IMG_SIZE_PX = 50
SLICE_COUNT = 20

n_classes = 2
tf.reset_default_graph()

x = tf.placeholder('float')
y = tf.placeholder('float')
keep_rate = tf.placeholder(tf.float32)

# 3x3x3 kernels: 1 -> 32 -> 64 feature maps, then a 1024-unit dense layer.
weights = {'W_conv1': tf.Variable(tf.random_normal([3,3,3,1,32])),
           'W_conv2': tf.Variable(tf.random_normal([3,3,3,32,64])),
           'W_fc': tf.Variable(tf.random_normal([54080,1024])),
           'out': tf.Variable(tf.random_normal([1024, n_classes]))}
biases = {'b_conv1': tf.Variable(tf.random_normal([32])),
          'b_conv2': tf.Variable(tf.random_normal([64])),
          'b_fc': tf.Variable(tf.random_normal([1024])),
          'out': tf.Variable(tf.random_normal([n_classes]))}
def conv3d(x, W):
    return tf.nn.conv3d(x, W, strides=[1,1,1,1,1], padding='SAME')

def maxpool3d(x):
    return tf.nn.max_pool3d(x, ksize=[1,2,2,2,1], strides=[1,2,2,2,1], padding='SAME')
def convolutional_neural_network(x, keep_rate):
    x = tf.reshape(x, shape=[-1, IMG_SIZE_PX, IMG_SIZE_PX, SLICE_COUNT, 1])

    conv1 = tf.nn.relu(conv3d(x, weights['W_conv1']) + biases['b_conv1'])
    conv1 = maxpool3d(conv1)

    conv2 = tf.nn.relu(conv3d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = maxpool3d(conv2)

    # Flatten the pooled feature maps for the dense layer.
    fc = tf.reshape(conv2,[-1, 54080])
    fc = tf.nn.relu(tf.matmul(fc, weights['W_fc'])+biases['b_fc'])
    fc = tf.nn.dropout(fc, keep_rate)

    output = tf.matmul(fc, weights['out'])+biases['out']
    return output
much_data = np.load('F:/Kaggle/Data Science Bowl 2017/Script/muchdata-50-50-20.npy')
train_data = much_data[:-100]
validation_data = much_data[-100:]
def train_neural_network(x):
    prediction = convolutional_neural_network(x, keep_rate)
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y) )
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(cost)

    hm_epochs = 10
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            for data in train_data:
                X = data[0]
                Y = data[1]
                _, c = sess.run([optimizer, cost], feed_dict={x: X, y: Y, keep_rate: 0.75})
                epoch_loss += c

            print('Epoch', epoch+1, 'completed out of',hm_epochs,'loss:',epoch_loss)

            # Validation accuracy after each epoch (dropout disabled).
            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
            print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data], keep_rate: 1.}))

        print('Done. Finishing accuracy:')
        print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data], keep_rate: 1.}))

        # Predict on the evaluation set, one sample at a time.
        eval_data = np.load('F:/Kaggle/Data Science Bowl 2017/Script/eval_data-50-50-20.npy')
        probabilities = tf.nn.softmax(prediction)
        sol = []
        for data in eval_data:
            X = data[0]
            id = data[1]
            probs = probabilities.eval(feed_dict={x: X, keep_rate: 1.})
            pred = prediction.eval(feed_dict={x: X, keep_rate: 1.})
            print('Outputs: ',pred)
            print('Probs: ',probs)
            sol.append([id, probs[0,1]])
        print(sol)
I also checked the predictions during training: if I set keep_rate to 1, the predictions become almost constant towards the end. In the first epochs there is a lot of variation, but in the last epochs the network seems to predict the same thing for every image. It appears to converge to a single prediction value, regardless of which image I pass in. I have checked the code a hundred times but can't see where the mistake is.
This is an example of what I get for some images in eval_data (same behaviour when I print for train_data):
Probs: [[ 0.76099759 0.23900245]]
Outputs: [[-0.017277 -1.1754334]]
Probs: [[ 0.76099759 0.23900245]]
Outputs: [[-0.017277 -1.1754334]]
Probs: [[ 0.76099759 0.23900245]]
Outputs: [[ 117714.1953125 -47536.32421875]]
Probs: [[ 1. 0.]]
Outputs: [[-0.017277 -1.1754334]]
Probs: [[ 0.76099759 0.23900245]]
Outputs: [[-0.017277 -1.1754334]]
Probs: [[ 0.76099759 0.23900245]]
Outputs: [[-0.017277 -1.1754334]]
Probs: [[ 0.76099759 0.23900245]]
Notice that they are almost always the same, but from time to time I see some bizarre value like
Outputs: [[ 117714.1953125 -47536.32421875]]
Probs: [[ 1. 0.]]
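As a sanity check on the numbers above, here is a minimal sketch in plain Python that recomputes the probabilities from the printed logits. The `softmax` helper is hypothetical and not part of my script; it only mirrors what `tf.nn.softmax` is expected to do. It suggests that the Probs lines are just the softmax of the Outputs lines, and that the occasional huge logits saturate the softmax to exactly [1, 0]:

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating so huge logits do not overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied from the printed output above.
typical = softmax([-0.017277, -1.1754334])
outlier = softmax([117714.1953125, -47536.32421875])

print(typical)  # close to the printed [0.76099759, 0.23900245]
print(outlier)  # saturates to [1.0, 0.0]
```

So the softmax step itself looks fine; the constant logits are what need explaining.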
How can I resolve this?
Upvotes: 4
Views: 6446
Reputation: 51
I had the same problem, and it took me two weeks to find the cause, so this might help you. In my case it was caused by a noisy dataset combined with a high learning rate. A ReLU unit can "die": once its pre-activation is negative for every input, it outputs zero and receives zero gradient, so it stops learning. With a noisy dataset, most of the ReLUs can end up dead, and the network then learns only a fixed distribution over the final labels. The output is therefore the same for any input.
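One way to check whether this is happening is to look at how many units never activate. The following is a rough sketch in plain Python, not code from my project: `dead_fraction` is a hypothetical helper, and the toy numbers are made up for illustration. It counts a unit as dead when its ReLU output is zero for every input in the batch:

```python
def relu(v):
    return max(0.0, v)

def dead_fraction(pre_activations):
    """pre_activations[i][j] is the pre-activation of unit j for input i.
    A unit counts as dead if its ReLU output is zero for every input."""
    n_units = len(pre_activations[0])
    dead = 0
    for j in range(n_units):
        if all(relu(row[j]) == 0.0 for row in pre_activations):
            dead += 1
    return dead / n_units

# Toy values (made up): 3 inputs, 4 units. Only unit 1 ever fires.
toy = [
    [-2.0,  0.5, -1.0, -3.0],
    [-0.1,  1.2, -4.0, -0.5],
    [-5.0, -0.3, -2.5, -1.5],
]
print(dead_fraction(toy))  # 0.75
```

In a real graph, the values to inspect would presumably be the pre-activation tensors (for example the input to the ReLU of the fully connected layer), fetched with `sess.run` for a batch of images.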
My solution was to use tf.nn.leaky_relu(), because it does not zero out negative inputs.
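To illustrate the difference, here is a small sketch in plain Python (not TensorFlow code) comparing the two activations and their gradients. The helper names are hypothetical; `alpha=0.2` is chosen to match the default of `tf.nn.leaky_relu`. For negative inputs, ReLU passes back a gradient of zero, while the leaky version still passes back a small one, so the unit can recover:

```python
def relu(v):
    return v if v > 0 else 0.0

def relu_grad(v):
    return 1.0 if v > 0 else 0.0

def leaky_relu(v, alpha=0.2):
    return v if v > 0 else alpha * v

def leaky_relu_grad(v, alpha=0.2):
    return 1.0 if v > 0 else alpha

# Columns: input, relu, relu gradient, leaky_relu, leaky_relu gradient.
for v in (-3.0, -0.5, 2.0):
    print(v, relu(v), relu_grad(v), leaky_relu(v), leaky_relu_grad(v))
```

In the code from the question, the change would amount to replacing each tf.nn.relu(...) call with tf.nn.leaky_relu(...), for example tf.nn.leaky_relu(conv3d(x, weights['W_conv1']) + biases['b_conv1']). Whether that alone fixes the constant predictions there is something I can't confirm; it is what worked in my case.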
Upvotes: 5