Reputation: 3
import numpy as np
from sklearn.naive_bayes import MultinomialNB
X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB()
mnb.fit(X, y)
mnb.predict([[0.11, 0.32]])
--> it predicts 0
Shouldn't it predict 1?
Upvotes: 0
Views: 841
Reputation: 33197
It depends. Here, you are using only one sample that belongs to class 1 during the fitting/training. Also, each sample has only 2 features, and there are only 4 samples in total, so again, the training will be poor.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB()
mnb.fit(X, y)
mnb.predict([[0.11, 0.32]])
array([0])
mnb.predict([[0.25, 0.73]])
array([0])
The model learns a rule that successfully predicts class 0 but never class 1. In terms of the trade-off between specificity and sensitivity, the model is specific to class 0 but has no sensitivity for class 1. We also describe this by saying that the model cannot generalize the rule.
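One concrete way to see this (my sketch, not part of the original answer): with 3 of 4 samples in class 0, the fitted class prior strongly favors class 0. If you give class 1 more weight in training, for example by repeating its single sample a few times, the prediction flips:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])

# Repeat the lone class-1 sample three extra times so the learned
# class prior no longer overwhelmingly favors class 0.
X_bal = np.vstack([X, np.tile(X[3], (3, 1))])
y_bal = np.concatenate([y, [1, 1, 1]])

mnb = MultinomialNB().fit(X_bal, y_bal)
print(mnb.predict([[0.11, 0.32]]))  # now predicts 1
```

This is only an illustration of the class-imbalance effect, not a recommendation to duplicate data; with real datasets you would collect more class-1 samples instead.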
Upvotes: 0
Reputation: 7170
Not necessarily. You can't assume that just because a model has seen an observation it will predict the corresponding label correctly. This is especially true of a high-bias algorithm like Naive Bayes. High-bias models tend to oversimplify the relationship between your X and y, and what you're seeing here is a product of that. On top of that, you only fit 4 samples, which is far too few for a model to learn a robust relationship.
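To make the imbalance concrete (my sketch, using the same data as the question): the fitted class prior is log(3/4) for class 0 versus log(1/4) for class 1, and the fractional feature values in X are all below 1, so the likelihood terms are too small to close that gap of roughly 1.1 nats:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

# Learned log priors: log(3/4) for class 0, log(1/4) for class 1.
print(mnb.class_log_prior_)  # approx. [-0.288, -1.386]
```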
If you're curious how exactly the model is creating these predictions, Multinomial Naive Bayes learns the joint log likelihoods of each class. You can actually compute those likelihoods using your fitted model:
>>> jll = mnb._joint_log_likelihood(X)
>>> jll
array([[-0.87974542, -2.02766662],
[-0.60540174, -1.73662711],
[-1.24051492, -2.36300468],
[-0.54761186, -1.66776584]])
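For reference (my sketch, not part of the original answer), the same joint log likelihoods can be reproduced from public attributes of the fitted model, since for MultinomialNB they are a linear function of X:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

# Joint log likelihood = X @ per-class feature log probabilities + class log prior
jll = X @ mnb.feature_log_prob_.T + mnb.class_log_prior_
print(jll)  # matches mnb._joint_log_likelihood(X)
```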
From there, the predict stage takes the argmax over the classes, which is where the class label prediction comes from:
>>> mnb.classes_[np.argmax(jll, axis=1)]
array([0, 0, 0, 0])
You can see that, as it currently stands, the model will predict 0 for all of the samples you've provided.
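Note that _joint_log_likelihood is a private method. A public alternative (a sketch of the same idea) is predict_log_proba, which returns the per-row normalized log posteriors; since the normalization is monotone within each row, its argmax reproduces predict:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

# Public API: joint log likelihoods normalized per row (logsumexp subtracted)
log_proba = mnb.predict_log_proba(X)
print(mnb.classes_[np.argmax(log_proba, axis=1)])  # [0 0 0 0]
```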
Upvotes: 1