Emerson Hsieh

Reputation: 233

MultiClass Keras Classifier prediction output meaning

I have a Keras classifier built using the Keras wrapper of the Scikit-Learn API. The neural network has 10 output nodes, and the training data is all represented using one-hot encoding.

According to Tensorflow documentation, the predict function outputs a shape of (n_samples,). When I fitted 514541 samples, the function returned an array with shape (514541, ), and each entry of the array ranged from 0 to 9.

Since I have ten different outputs, does the numerical value of each entry correspond exactly to the result that I encoded in my training matrix?

i.e. if index 5 of my one-hot encoding of y_train represents "orange", does a prediction value of 5 mean that the neural network predicted "orange"?

Here is a sample of my model:

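from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dropout(0.2, input_shape=(32,)))  # dropout applied to the 32 input features

model.add(Dense(21, activation='selu'))
model.add(Dropout(0.5))

model.add(Dense(10, activation='softmax'))  # 10 output nodes, one per class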

Upvotes: 1

Views: 2937

Answers (1)

desertnaut

Reputation: 60318

There are some issues with your question.

> The neural network has 10 output nodes, and the training data is all represented using one-hot encoding.

Since your network has 10 output nodes and your labels are one-hot encoded, your model's output should also be 10-dimensional, with one column per one-hot position, i.e. of shape (n_samples, 10). Moreover, since you use a softmax activation for your final layer, each element of a 10-dimensional output lies in [0, 1], the 10 elements sum to 1, and each is interpreted as the probability that the sample belongs to the respective (one-hot encoded) class.
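
As a rough illustration of that shape and value range, here is a minimal NumPy sketch. It does not use your model: the `softmax` helper and the random `logits` array are stand-ins I made up for what a 10-node softmax layer would produce on 4 samples.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, stabilised by subtracting each row's maximum."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))  # hypothetical pre-activation values, 4 samples x 10 classes
probs = softmax(logits)

print(probs.shape)        # (4, 10): one row per sample, one column per class
print(probs.sum(axis=1))  # every row sums to 1, up to floating-point error
```

Each row of `probs` plays the role of one network output: 10 values in [0, 1], not a single integer.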

> According to Tensorflow documentation, the predict function outputs a shape of (n_samples,).

It's puzzling why you refer to the TensorFlow documentation while your model is clearly a Keras one; the relevant reference is the predict method of the Keras Sequential API.

> When I fitted 514541 samples, the function returned an array with shape (514541, ), and each entry of the array ranged from 0 to 9.

If that is what you get, some step beyond the raw network output must be responsible: either a later part of your code that you do not show here or, possibly, the Scikit-Learn wrapper you mention, whose predict returns class labels rather than probabilities. In any case, the idea is to find the index of the highest value in each 10-dimensional network output; since the values are interpreted as probabilities, the element with the highest value is the most probable class. In other words, somewhere there must be something like this:

import numpy as np

pred = model.predict(x_test)  # shape (n_samples, 10): per-class probabilities
y = np.argmax(pred, axis=1)   # shape (n_samples,): index of the most probable class

which gives an array y of shape (n_samples,), with each element an integer between 0 and 9, as you report.

> i.e. if index 5 of my one-hot encoding of y_train represents "orange", does a prediction value of 5 mean that the neural network predicted "orange"?

Provided that the above holds, yes: the index returned by argmax is the same column index used in your one-hot encoding.
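
To make the correspondence concrete, here is a small sketch with made-up data. The `class_names` list and the two probability rows are hypothetical; the only thing taken from your question is that position 5 stands for "orange".

```python
import numpy as np

# Hypothetical class list; position 5 is "orange" as in the question.
class_names = ["apple", "banana", "cherry", "grape", "lemon",
               "orange", "peach", "pear", "plum", "kiwi"]

# One-hot labels for two samples: "orange" (index 5) and "apple" (index 0).
y_onehot = np.eye(10)[[5, 0]]

# Made-up softmax outputs for the same two samples (each row sums to 1).
pred = np.array([
    [0.01, 0.02, 0.03, 0.04, 0.05, 0.70, 0.05, 0.04, 0.03, 0.03],
    [0.55, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
])

true_idx = np.argmax(y_onehot, axis=1)  # column of the 1 in each one-hot row
pred_idx = np.argmax(pred, axis=1)      # column of the largest probability
pred_labels = [class_names[i] for i in pred_idx]

print(true_idx, pred_idx, pred_labels)
```

Both argmax calls index the same 10 columns, so a predicted 5 refers to the same class as a 1 in column 5 of y_train.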

Upvotes: 5
