Reputation: 373
Suppose I have a data set whose feature values are continuous and which has more than two possible labels (e.g. rain, sunny, windy). Which naive Bayes model should I use in sklearn?
I am considering Gaussian or multinomial. However, multinomial works for discrete features, and when I tried Gaussian, the prediction accuracy was no better than random guessing.
Upvotes: 0
Views: 567
Reputation: 1
Usually, when your data is continuous, you apply Gaussian naive Bayes, or you transform the data into a discrete format, for example by converting temperature values to low, medium, and high.
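As a rough illustration of the low/medium/high conversion mentioned above, here is a minimal sketch using NumPy. The temperature values are made up, and cutting at the 1/3 and 2/3 quantiles is an assumption; any other sensible thresholds would work the same way.

```python
import numpy as np

# Hypothetical temperature readings in degrees Celsius (made-up values).
temperature = np.array([3.0, 12.5, 18.0, 21.5, 27.0, 33.5])

# Assumed cut points: the 1/3 and 2/3 quantiles of the observed values.
edges = np.quantile(temperature, [1 / 3, 2 / 3])

# np.digitize maps each value to 0, 1 or 2 depending on which interval it falls in.
bin_index = np.digitize(temperature, edges)

# Translate the interval index into a readable category name.
names = np.array(["low", "medium", "high"])
temperature_binned = names[bin_index]

print(temperature_binned)
```

Quantile-based cut points give roughly equal-sized groups; fixed thresholds may be preferable when the categories have a physical meaning.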
The outcome of your Gaussian model should not be equivalent to random selection; there is probably something wrong with the model or the data.
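One way to check whether a Gaussian model really is no better than random selection is to compare it against a baseline that guesses labels at random. The sketch below assumes the bundled iris data set as a stand-in for continuous, three-class data; with your own data you would substitute your own `X` and `y`.

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Stand-in data (an assumption): four continuous features, three classes.
X, y = load_iris(return_X_y=True)

# Baseline that draws labels at random according to the class frequencies.
baseline = DummyClassifier(strategy="stratified", random_state=0)
model = GaussianNB()

# Mean accuracy over 5 cross-validation folds for each estimator.
baseline_acc = cross_val_score(baseline, X, y, cv=5).mean()
model_acc = cross_val_score(model, X, y, cv=5).mean()

print(f"random baseline: {baseline_acc:.2f}")
print(f"GaussianNB:      {model_acc:.2f}")
```

If the two numbers are close on your data, that points to a problem with the features, the labels, or the train/test setup rather than with the number of classes, since `GaussianNB` handles more than two labels without extra configuration.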
Some things to check before you apply a Gaussian model:
Upvotes: 0
Reputation: 81
Naive Bayes Classification (NBC) works with discrete values. That means you have to discretize all features which are continuous. For more details, this could help
Anyway, multinomial is correct because you have more than one label. But you should also keep in mind that you have to one-hot encode your labels (OneHotEncoder in sklearn).
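The discretization route described in this answer could be sketched as follows. Note the assumptions: the iris data set stands in for the real data, three equal-width bins per feature is an arbitrary choice, and `CategoricalNB` is swapped in for the multinomial variant because the bin indices are category codes rather than counts.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

# Stand-in data (an assumption): continuous features, three classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

n_bins = 3  # assumed number of bins per feature

clf = make_pipeline(
    # Replace each continuous value with the index of its equal-width bin.
    KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy="uniform"),
    # min_categories keeps the model valid even if a bin is empty in training.
    CategoricalNB(min_categories=n_bins),
)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"accuracy on held-out data: {accuracy:.2f}")
```

Putting the discretizer inside the pipeline means the bin edges are learned from the training split only, which avoids leaking information from the test split.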
Upvotes: 1