Reputation: 3508
I am using TensorFlow for a classification task. My dataset has around 1160 output classes, and each class value is a six-digit number, for example 789954. After training and testing the model with TensorFlow, I got an accuracy of around 99%.
The second step is to write the predictions to a CSV file so that I can check whether the predicted outcomes (from the logits) match the original labels in the set. As I understand it, the logits correspond to the one-hot encoded vectors for my classes, so I took the following steps to decode them.
prediction=tf.argmax(logits,1)
print(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))
prediction = np.asarray(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))
prediction = np.reshape(prediction, (test_features.shape[0],1))
np.savetxt("prediction.csv", prediction, delimiter=",")
The resulting values in the CSV file are 0.00E+00 for every entry, but I expected the six-digit code for each entry. I guess I have gone wrong somewhere in my one-hot encoding.
Any help is appreciated.
Added: I one-hot encoded the labels this way.
labels = tf.one_hot(labels, n_classes)
Here n_classes = 1160, and all the label values are six-digit numbers.
Upvotes: 1
Views: 1299
Reputation: 17201
If each example has only one label, then your approach is fine. Use sklearn's LabelEncoder to convert your categories to integer labels first. Each label should be a value between 0 and 1159 (one index per class, 1160 in total), and then you apply the one-hot encoding to those indices.
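A minimal sketch of that workflow, under some assumptions: the five six-digit codes are made-up sample data, the logits are faked from the one-hot vectors rather than produced by a model, and NumPy's `np.eye` stands in for `tf.one_hot`. It may also explain the zeros in the question: `tf.one_hot` returns an all-zero row for any index outside `[0, depth)`, so passing raw six-digit codes with `n_classes = 1160` would leave every label vector empty.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical six-digit class codes standing in for the real labels.
raw_labels = np.array([789954, 123456, 789954, 555555, 123456])

# Map each distinct code to an index in [0, n_classes - 1].
encoder = LabelEncoder()
label_ids = encoder.fit_transform(raw_labels)
n_classes = len(encoder.classes_)

# One-hot encode the indices (NumPy stand-in for tf.one_hot(label_ids, n_classes)).
one_hot = np.eye(n_classes, dtype=np.float32)[label_ids]

# Fake logits for illustration only; argmax recovers the class index.
logits = one_hot * 5.0
predicted_ids = np.argmax(logits, axis=1)

# Map the indices back to the original six-digit codes.
predicted_codes = encoder.inverse_transform(predicted_ids)

# fmt="%d" writes plain integers instead of savetxt's default scientific notation.
np.savetxt("prediction.csv", predicted_codes.reshape(-1, 1), fmt="%d", delimiter=",")
```

In the real pipeline you would fit the encoder on the training labels, keep it around, and call `inverse_transform` on the output of `tf.argmax(logits, 1)` before saving.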
Upvotes: 1