Reputation: 57
I'm using a very simple NN with normalized word2vec vectors as input. When I run my training (mini-batch based), the training cost starts around 1020 and decreases to around 1000, but never goes below that, and my accuracy stays around 50%.
Why doesn't the cost decrease? How can I verify that the weight matrix is updated at each run?
import tensorflow as tf

# Forward pass: linear transform, bias, sigmoid activation.
apply_weights_OP = tf.matmul(X, weights, name="apply_weights")
add_bias_OP = tf.add(apply_weights_OP, bias, name="add_bias")
activation_OP = tf.nn.sigmoid(add_bias_OP, name="activation")

# Squared-error cost between activations and targets.
cost_OP = tf.nn.l2_loss(activation_OP - yGold, name="squared_error_cost")

# Adam optimizer with a step counter.
optimizer = tf.train.AdamOptimizer(0.001)
global_step = tf.Variable(0, name='global_step', trainable=False)
training_OP = optimizer.minimize(cost_OP, global_step=global_step)

# Accuracy: compare argmax of predictions and labels.
correct_predictions_OP = tf.equal(
    tf.argmax(activation_OP, 0),
    tf.argmax(yGold, 0)
)
accuracy_OP = tf.reduce_mean(tf.cast(correct_predictions_OP, "float"))

# One mini-batch training step.
newCost, train_accuracy, _ = sess.run(
    [cost_OP, accuracy_OP, training_OP],
    feed_dict={
        X: trainX[indice_bas:indice_haut],
        yGold: trainY[indice_bas:indice_haut]
    }
)
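For instance, could I check the updates by fetching the weight matrix before and after one training step and comparing the two snapshots? Something like this (an untested sketch, reusing the names from my snippet above):

import numpy as np

# Snapshot the weights, run one training step, snapshot again.
w_before = sess.run(weights)
sess.run(training_OP, feed_dict={
    X: trainX[indice_bas:indice_haut],
    yGold: trainY[indice_bas:indice_haut]
})
w_after = sess.run(weights)

# If the optimizer is updating the variable, the snapshots must differ.
print("weights changed:", not np.allclose(w_before, w_after))
print("max abs update:", np.abs(w_after - w_before).max())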
Thanks
Upvotes: 1
Views: 1286
Reputation: 176
Try using cross entropy instead of the L2 loss; also, there is no real point in having an activation function on your output layer.
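For example, keeping your variable names and assuming yGold holds one-hot labels, you could drop the sigmoid and feed the raw logits into a softmax cross entropy (a sketch, not tested against your data):

# Raw logits: no activation on the output layer.
logits_OP = tf.add(tf.matmul(X, weights), bias, name="logits")

# Cross-entropy loss averaged over the mini-batch.
cost_OP = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=yGold, logits=logits_OP),
    name="xent_cost")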
The examples that ship with TensorFlow actually include a basic model that is very similar to what you are trying to do.
Btw: it might also be that the problem you are trying to learn is simply not solvable by a linear model (which is what you currently have), so try using a deeper model. Here is an example of a 2 layer deep multilayer perceptron.
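A minimal sketch of such a 2-layer MLP in the same style (the layer sizes n_input, n_hidden, n_classes are placeholders; X and yGold are assumed to be defined as in your snippet):

n_input = 300     # e.g. word2vec dimensionality (placeholder)
n_hidden = 128    # hidden layer width (placeholder)
n_classes = 2     # number of output classes (placeholder)

# Hidden layer with ReLU, output layer left as raw logits.
W1 = tf.Variable(tf.truncated_normal([n_input, n_hidden], stddev=0.1))
b1 = tf.Variable(tf.zeros([n_hidden]))
hidden = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.truncated_normal([n_hidden, n_classes], stddev=0.1))
b2 = tf.Variable(tf.zeros([n_classes]))
logits = tf.matmul(hidden, W2) + b2

# Cross-entropy loss on the logits, trained with Adam as before.
cost_OP = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=yGold, logits=logits))
training_OP = tf.train.AdamOptimizer(0.001).minimize(cost_OP)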
Upvotes: 1