Pritam

Reputation: 31

Training Naive Bayes Classifier

I am developing a Naive Bayes classifier using the simple bag-of-words concept. My question: in Naive Bayes, or in any other machine learning scenario, 'training' the classifier is an important matter. But how do I train a Naive Bayes classifier when I already have a bag_of_words for various classes?

Upvotes: 1

Views: 941

Answers (2)

Logan Dillard

Reputation: 106

The Stanford IR book gives a good explanation of how Naive Bayes classifiers work, and it uses text classification as its example. The Wikipedia article also gives a detailed description of the theory and some concrete examples.

In a nutshell, you count the occurrences of each word type within each class, then normalize the counts to estimate the probability of a word given a class, p(w|c). You then use Bayes' rule to get the probability of each class given the document: p(c|doc) ∝ p(c) * p(doc|c), where the probability of the document given the class is the product of the probabilities of its words given the class, p(doc|c) = Π(w in doc) p(w|c). These products get very small before normalizing between the classes, so you may want to sum the logarithms of the factors instead to avoid underflow errors.
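
For a concrete picture, here is a minimal sketch of that scoring step in Python. The word counts, priors, and the log_posteriors helper are illustrative assumptions, not something from the question; the sketch just applies add-one smoothing to estimate p(w|c) and sums logarithms as suggested above.

    import math

    # Hypothetical bag-of-words counts per class (illustrative data only).
    counts = {
        "sports": {"ball": 30, "team": 25, "win": 20},
        "politics": {"vote": 28, "law": 22, "win": 10},
    }
    class_priors = {"sports": 0.5, "politics": 0.5}
    vocab = {w for c in counts for w in counts[c]}

    def log_posteriors(doc_words):
        """Score each class with log p(c) + sum of log p(w|c)."""
        scores = {}
        for c, word_counts in counts.items():
            total = sum(word_counts.values())
            score = math.log(class_priors[c])
            for w in doc_words:
                # Laplace (add-one) smoothing so unseen words don't zero out the product
                p_w_given_c = (word_counts.get(w, 0) + 1) / (total + len(vocab))
                score += math.log(p_w_given_c)
            scores[c] = score
        return scores

    print(log_posteriors(["team", "win", "ball"]))  # highest score -> predicted class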

Upvotes: 0

miraculixx

Reputation: 10349

how do I train a Naive Bayes classifier when I already have a bag_of_words for various classes?

In general, what you do is this:

  1. Split your bag of words into two random subsets; call one the training set and the other the test set.
  2. Train the classifier on the training subset.
  3. Validate the classifier's accuracy by running it against the test subset (a minimal sketch follows this list).
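
As a rough illustration of these three steps, here is a sketch using scikit-learn. The toy documents and labels, and the assumption that your bag of words can be re-expressed as labelled example documents, are illustrative and not part of the question.

    # Minimal sketch of steps 1-3, assuming labelled example documents are available.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB

    docs = ["the team won the game", "new law passed by vote",
            "great win for the home team", "voters reject the new law"]
    labels = ["sports", "politics", "sports", "politics"]

    # 1. split into random training and test subsets
    X_train, X_test, y_train, y_test = train_test_split(
        docs, labels, test_size=0.5, random_state=0)

    # 2. train the classifier on the training subset (bag-of-words counts -> MultinomialNB)
    vectorizer = CountVectorizer()
    clf = MultinomialNB()
    clf.fit(vectorizer.fit_transform(X_train), y_train)

    # 3. validate accuracy on the held-out test subset
    print(clf.score(vectorizer.transform(X_test), y_test))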

'training' the classifier is an important matter

Indeed -- that's how your classifier learns to distinguish words from different classes.

Upvotes: 1
