leon

Reputation: 1512

Text classification with Naive Bayes

I am learning NLP and noticed that TextBlob's Naive Bayes classifier (TextBlob is built on top of NLTK; see https://textblob.readthedocs.io/en/dev/classifiers.html) works fine when the training data is a list of sentences, but does not work at all when the training data consists of individual words (each word paired with its assigned classification).

Why?

Upvotes: 0

Views: 187

Answers (1)

Sorin

Reputation: 11968

Because you don't have single words in the training data.

Training and evaluation/test data should generally be drawn from the same distribution. Biases or skews between them are usually problematic. Only in very few cases can you train a model to do one thing and then use it to do something else.

In your case, the model likely spreads the weight over the words in the sentence, so when you pick a single word you only get a small portion of that weight.

To get it to work, add single-word examples to your training data.
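To make this concrete, here is a rough sketch using a tiny hand-written multinomial Naive Bayes with add-one smoothing. It is not TextBlob's or NLTK's implementation (their feature extraction and smoothing may differ), and the toy sentences, the function names `train` and `posterior`, and the choice to skip unseen words are all my own assumptions for illustration. It only shows the general effect: a full sentence accumulates evidence from several words, a single word carries much less, and adding single-word examples to the training data strengthens the single-word prediction.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def posterior(model, text):
    """Return P(label | text) under multinomial Naive Bayes, add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # assumption: ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    exp_scores = {lbl: math.exp(s - top) for lbl, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {lbl: v / norm for lbl, v in exp_scores.items()}


# Hypothetical toy training data: sentences only.
sentences = [
    ("I love this sandwich", "pos"),
    ("this is an amazing place", "pos"),
    ("I do not like this restaurant", "neg"),
    ("I am tired of this stuff", "neg"),
]
model = train(sentences)
p_sentence = posterior(model, "I love this amazing place")
p_word_before = posterior(model, "amazing")

# Same data plus a few single-word examples.
single_words = [("amazing", "pos"), ("love", "pos"), ("tired", "neg")]
model_with_words = train(sentences + single_words)
p_word_after = posterior(model_with_words, "amazing")

print("sentence:", p_sentence)
print("single word, sentence-only training:", p_word_before)
print("single word, with single-word examples:", p_word_after)
```

In this toy setup the single word is still classified as "pos" after sentence-only training, but with noticeably less confidence than the full sentence; the extra single-word examples raise that confidence.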

Upvotes: 1
