Reputation: 383
My LSTM RNN has to predict a single letter (Y), given the preceding characters (X). For example, if "Oh, say! can you see by the dawn's early ligh" is given as X, then Y would be "t" (part of the National Anthem). Each letter is one-hot encoded, so "g" one-hot encoded would be, for example, [0,0,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0].
dataX: [batch_size, 20, num_of_classes], dataY: [batch_size, 1, num_of_classes]
In this case, what loss function would be best for prediction? Both X and Y are one-hot encoded; X is many letters and Y is just one. I rarely find loss functions that take one-hot vectors as parameters (for example, as the logits or target parameter).
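To make the setup concrete, here is a minimal NumPy sketch of how dataX/dataY with those shapes could be built from a text string. The alphabet, the one_hot helper, and the sliding-window construction are illustrative assumptions, not something stated in the question:

import numpy as np

# Assumed alphabet: lowercase letters plus a few punctuation characters.
alphabet = "abcdefghijklmnopqrstuvwxyz ,.!'"
char_to_idx = {c: i for i, c in enumerate(alphabet)}
num_of_classes = len(alphabet)
seq_len = 20                                   # 20 preceding characters, as in dataX

def one_hot(index, depth):
    vec = np.zeros(depth, dtype=np.float32)
    vec[index] = 1.0
    return vec

text = "oh, say! can you see by the dawn's early light"
dataX, dataY = [], []
for i in range(len(text) - seq_len):
    window = text[i:i + seq_len]               # 20-character input window
    target = text[i + seq_len]                 # the single character to predict
    dataX.append([one_hot(char_to_idx[c], num_of_classes) for c in window])
    dataY.append([one_hot(char_to_idx[target], num_of_classes)])

dataX = np.array(dataX)   # shape: [batch_size, 20, num_of_classes]
dataY = np.array(dataY)   # shape: [batch_size, 1, num_of_classes]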
Upvotes: 3
Views: 2708
Reputation: 3773
What you are looking for is the cross entropy between Y_ (the ground truth) and Y (the predicted probabilities).
You could use a basic hand-coded cross entropy like

y = tf.nn.softmax(logit_layer)
# sum over the classes of each example, then average over the batch
loss = -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(y), axis=1))
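Note that the hand-coded version can produce NaN if a predicted probability hits exactly 0; a common workaround (an addition here, not required by the question) is to clip the probabilities before taking the log:

loss = -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)), axis=1))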
Or you could use the built-in TensorFlow function

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logit_layer))

(softmax_cross_entropy_with_logits returns one loss value per example, so take the mean over the batch to get a scalar loss.)
Your Y output would be something like [0.01, 0.02, 0.01, 0.98, 0.02, ...], and logit_layer is just the raw output of your network before applying the softmax.
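Putting it together, a minimal end-to-end sketch in TensorFlow 1.x style might look like the following; the LSTM size, the dense projection layer, and the optimizer are assumptions, only the shapes follow the question:

import tensorflow as tf

num_of_classes = 31      # size of the one-hot alphabet (assumed)
seq_len = 20
hidden_size = 128        # assumed LSTM size

x  = tf.placeholder(tf.float32, [None, seq_len, num_of_classes])   # dataX
y_ = tf.placeholder(tf.float32, [None, num_of_classes])            # dataY with the middle axis of size 1 squeezed out

cell = tf.nn.rnn_cell.LSTMCell(hidden_size)
outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
last_output = outputs[:, -1, :]          # LSTM output after reading all 20 characters

# Project to one raw score (logit) per class; softmax is applied inside the loss.
logit_layer = tf.layers.dense(last_output, num_of_classes)

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logit_layer))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# At prediction time, the most likely next character is the argmax over the classes.
prediction = tf.argmax(tf.nn.softmax(logit_layer), axis=1)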
Upvotes: 1