Reputation: 123
I used Keras on top of Theano before, and now I want to write code in the TensorFlow style, which is new to me. I tried writing a very simple model (which I had implemented in Keras, where it worked), but the training process does not seem to work. The model always makes the same predictions no matter how many epochs I go through, which suggests the model is not being updated during training at all. I think I must have misunderstood something and made a stupid mistake, but I cannot find where it is.
I am sure the input data and labels are correct, because I have used them before. The input data, training_input[0] and training_input[1], are both 2D numpy arrays. The labels are one-hot encoded with 4 dimensions.
def model_1(features, labels):
    hl_input = features['hl_input']
    bd_input = features['bd_input']
    encoder = tf.concat([hl_input, bd_input], axis=1)
    encoder = tf.layers.dense(encoder, 128, activation=tf.nn.relu)
    decoder = tf.layers.dense(encoder, 64)
    logits = tf.layers.dense(decoder, 4, activation=tf.nn.softmax)
    predictions = tf.argmax(logits, 1, name="predictions")
    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    train_op = tf.contrib.layers.optimize_loss(loss, tf.contrib.framework.get_global_step(),
                                               optimizer='Adam', learning_rate=0.1)
    predictions = {"classes": predictions, "probabilities": logits}
    return predictions, loss, train_op
... ...
classifier = tf.contrib.learn.Estimator(model_fn=model_1)
classifier.fit(x={'hl_input':training_input[0], 'bd_input':training_input[1]}, y=training_labels, batch_size=batch_size, steps=steps)
Upvotes: 1
Views: 292
Reputation: 17201
You are applying the softmax activation on the final layer twice. The tf.losses.softmax_cross_entropy function applies softmax internally, so remove the activation on the logits by setting activation=None.
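A minimal sketch of the corrected output layer, keeping the rest of the question's model_fn unchanged (applying tf.nn.softmax once, explicitly, is an assumption here, only needed if you still want probabilities in the predictions dict):

logits = tf.layers.dense(decoder, 4, activation=None)  # raw logits, no activation
loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)  # softmax applied internally
predictions = {"classes": tf.argmax(logits, 1, name="predictions"),
               "probabilities": tf.nn.softmax(logits)}  # explicit softmax, applied exactly once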
Upvotes: 1