ymeng

Using TensorFlow layers, model not training

I used Keras on top of Theano before, and now I want to write code in TensorFlow style, which is new to me. I tried writing a very simple model (which I had implemented in Keras, where it worked), but the training process does not seem to work. The model always makes the same predictions no matter how many epochs I run, which suggests the model is not being updated during training at all. I think I must have misunderstood something and made a silly mistake, but I cannot find where it is.

I am sure the input data and labels are correct, because I have used them before. The inputs training_input[0] and training_input[1] are both 2D numpy arrays, and the labels are one-hot vectors with 4 dimensions.

import tensorflow as tf

def model_1(features, labels):
    # concatenate the two input branches along the feature axis
    hl_input = features['hl_input']
    bd_input = features['bd_input']
    encoder = tf.concat([hl_input, bd_input], axis=1)

    # two dense layers, then a 4-way output layer
    encoder = tf.layers.dense(encoder, 128, activation=tf.nn.relu)
    decoder = tf.layers.dense(encoder, 64)
    logits = tf.layers.dense(decoder, 4, activation=tf.nn.softmax)
    predictions = tf.argmax(logits, 1, name="predictions")

    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    train_op = tf.contrib.layers.optimize_loss(
        loss, tf.contrib.framework.get_global_step(),
        optimizer='Adam', learning_rate=0.1)
    predictions = {"classes": predictions, "probabilities": logits}

    return predictions, loss, train_op
... ...
classifier = tf.contrib.learn.Estimator(model_fn=model_1)
classifier.fit(x={'hl_input': training_input[0], 'bd_input': training_input[1]},
               y=training_labels, batch_size=batch_size, steps=steps)



Answers (1)

Vijay Mariappan

You are applying the softmax activation to the final layer twice. tf.losses.softmax_cross_entropy applies softmax to the logits internally, so pass the raw logits to the loss by setting activation=None on the last dense layer.
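For reference, a minimal sketch of the corrected output layer and loss, using the same TF 1.x tf.layers / tf.losses API as in the question; the explicit tf.nn.softmax call for the reported probabilities is my addition, since the original code reused the already-activated output:

    # output raw logits: no activation on the final layer
    logits = tf.layers.dense(decoder, 4, activation=None)
    predictions = tf.argmax(logits, 1, name="predictions")

    # softmax_cross_entropy applies softmax to the logits internally
    loss = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)

    # apply softmax explicitly only where probabilities are reported
    probabilities = tf.nn.softmax(logits, name="probabilities")
    predictions = {"classes": predictions, "probabilities": probabilities}

With activation=tf.nn.softmax on the last layer, the loss effectively computes softmax(softmax(x)), which squashes the outputs toward uniform and shrinks the gradients, so training can stall and the model keeps making the same prediction.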

