Reputation: 95
I am doing a simple binary text classification. The steps go roughly like this:
I am stuck on step 5 - when I print the predicted values, I get a numpy array of:
[[0.9434484 ]
 [0.3787447 ]
 ...
 [0.87870705]
 [0.7575223 ]
 [0.39714795]]
Since I am doing binary classification and my labels are 0 and 1, I expected the predictions to be 0 or 1 as well. Instead, the model seems to output a probability between 0 and 1, which is not what I wanted. Do I need to convert the prediction output somehow so that it returns the proper labels, or have I done something wrong in the earlier steps?
Upvotes: 3
Views: 2194
Reputation: 2790
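In step 5, model.predict(x_test)
can be replaced by:
model.predict_classes(x_test)
which returns class labels directly for a Sequential model. Note that predict_classes was deprecated and later removed from TensorFlow/Keras, so it only works on older versions. If you ever need this for a functional model with a softmax output (one column per class), this is the solution:
y_prob = model.predict(x_test)
y_classes = y_prob.argmax(axis=-1)
For a single sigmoid output like the one in the question, argmax would always return 0, so threshold the probabilities instead:
y_classes = (model.predict(x_test) > 0.5).astype("int32")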
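Whether argmax or a threshold is the right conversion depends on the shape of the prediction array. The sketch below is only an illustration: it uses a hard-coded NumPy array as a stand-in for the output of model.predict(x_test), with the values taken from the question, and assumes the model ends in a single sigmoid unit.

```python
import numpy as np

# Stand-in for model.predict(x_test) on a model with ONE sigmoid output unit.
# Values are copied from the question; a real run would get them from the model.
y_prob = np.array([[0.9434484],
                   [0.3787447],
                   [0.87870705],
                   [0.7575223],
                   [0.39714795]])

# argmax over a single column can only ever return index 0,
# so it cannot recover binary labels from this shape.
argmax_labels = y_prob.argmax(axis=-1)

# Comparing against a 0.5 cutoff yields the 0/1 labels instead.
y_classes = (y_prob > 0.5).astype("int32").ravel()

print(argmax_labels)  # [0 0 0 0 0]
print(y_classes)      # [1 0 1 1 0]
```

If the model instead had two softmax outputs (shape (n, 2)), argmax(axis=-1) would be the appropriate choice.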
Upvotes: 1
Reputation: 4184
One solution is to apply a simple 0.5 cutoff: everything above 0.5 is treated as 1 and everything below as 0. (np.round sends a value of exactly 0.5 to 0.)
import numpy as np
pred = np.array([[0.9434484],
                 [0.3787447],
                 [0.87870705],
                 [0.7575223],
                 [0.39714795]])
np.round(pred)
Out[37]:
array([[1.],
       [0.],
       [1.],
       [1.],
       [0.]])
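If the results are not probabilities (for example, raw logits), something like:
import numpy as np

def sigmoid(x):
    # np.exp works element-wise on arrays; math.exp only accepts scalars
    return 1 / (1 + np.exp(-x))
has to be applied first to map them onto the 0-1 range.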
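As a rough sketch of that last case, assuming the model emits raw, unbounded scores rather than probabilities: the logit values below are made up for illustration, and the variable names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    # Element-wise logistic function; maps any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical raw model outputs (logits), not taken from the question.
logits = np.array([[2.8], [-0.5], [1.1], [-0.4]])

# Squash to probabilities, then apply the 0.5 cutoff to get 0/1 labels.
probs = sigmoid(logits)
labels = (probs > 0.5).astype(int).ravel()

print(labels)  # [1 0 1 0]
```

Since the sigmoid of 0 is exactly 0.5, thresholding the probabilities at 0.5 is the same as checking whether the raw score is positive.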
Upvotes: 3