Reputation: 33
Let me preface this by saying, I am very new to neural networks, and this is my first time using numpy, tensorflow, or keras.
I wrote a neural network to recognize handwritten digits, using the MNIST data set. I followed this tutorial by Sentdex and noticed he was using print(np.argmax(predictions[0])) to print the first prediction from the numpy array of predictions.
I tried running the program with that line replaced by print(predictions[i]) (i was set to 0), but the output was not a number; it was:
[2.1975785e-08 1.8658861e-08 2.8842608e-06 5.7113186e-05 1.2067199e-10
7.2511304e-09 1.6282028e-12 9.9993789e-01 1.3356166e-08 2.0409643e-06]
The code that I'm confused about is:
predictions = model.predict(x_test)
for i in range(10):
    plt.imshow(x_test[i])
    plt.show()
    print("PREDICTION: ", predictions[i])
I read the numpy documentation for the argmax() function, and from what I understand, it takes an n-dimensional array, converts it to a one-dimensional array, then returns the index of the largest value. The Keras documentation for model.predict() indicates that the function returns a numpy array of the network's predictions. So I don't understand why we have to use argmax() to print the prediction properly, because as I understand it, argmax() has a completely unrelated purpose.
Sorry for the bad code formatting, I couldn't figure out how to properly insert multi line chunks of code into my post
Upvotes: 2
Views: 3422
Reputation: 56377
A classification neural network, typically through a softmax output layer, outputs a probability distribution over the class indices, meaning that the network assigns one probability to each class. The sum of these probabilities is 1.0. The network is trained to assign the highest probability to the correct class, so to recover the class index from the probabilities you have to take the location (index) that has the maximum probability. This is done with the argmax operation. In your output, the largest value (9.9993789e-01) sits at index 7, so the predicted digit is 7.
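To make this concrete, here is a minimal sketch using only numpy. The probs array is copied from the output in the question; the two-row batch is a made-up example (not from the original code) that stands in for what model.predict() would return for several images.

```python
import numpy as np

# Probabilities copied from the question's output for the first test image.
probs = np.array([
    2.1975785e-08, 1.8658861e-08, 2.8842608e-06, 5.7113186e-05, 1.2067199e-10,
    7.2511304e-09, 1.6282028e-12, 9.9993789e-01, 1.3356166e-08, 2.0409643e-06,
])

# argmax returns the index of the largest probability, which is the class label.
predicted_digit = int(np.argmax(probs))
print("PREDICTION:", predicted_digit)          # 7
print("confidence:", probs[predicted_digit])   # the probability assigned to that class

# Hypothetical batch of two predictions (second row is the first row reversed).
# With axis=1, argmax is taken separately for each row, giving one label per image.
batch = np.stack([probs, probs[::-1]])
batch_digits = np.argmax(batch, axis=1)
print(batch_digits)                            # [7 2]
```

In the loop from the question, this would presumably amount to printing np.argmax(predictions[i]) instead of predictions[i].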
Upvotes: 2
Reputation: 2331
If I understand your question correctly, the answer is pretty simple:
I hope that's clear, haha.
Upvotes: 1