Reputation: 1000
I want to understand why it is necessary to normalize the posterior. If my understanding of how Naive Bayes uses Bayes' theorem is wrong, please correct me.
In the formula
P(B|A) = P(A|B)*P(B) / P(A)
The probabilities on the RHS are computed from the training data: P(A|B), where A is the input features and B is the target class; P(B) is the probability of the target class under consideration; and P(A) is the probability of the input features.
Once you have these probabilities computed, you take the test data and, based on its input features, compute the target class probability P(B|A) (which I guess is called the posterior probability).
Now in some videos they teach that after this you have to normalize P(B|A) to get the probability of that target class.
Why is that necessary? Isn't P(B|A) itself the probability of the target class?
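For concreteness (made-up numbers, just to illustrate what I mean): suppose for a test point I compute P(B=1|A) = 0.02 and P(B=0|A) = 0.06 from the formula above; the videos then say these values have to be normalized so they sum to 1 before being reported as class probabilities.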
Upvotes: 1
Views: 1021
Reputation: 4629
The reason is quite simple:
In Naive Bayes your objective is to find the class that maximizes the posterior probability, so basically you want the Class_j
that maximizes this formula:
P(Class_j|x) = P(x|Class_j) * P(Class_j) / P(x)
Because we have made the independence assumption, we can rewrite the P(x|Class_j)
part of the numerator this way:
P(x|Class_j) = P(x_1|Class_j) * P(x_2|Class_j) * ... * P(x_n|Class_j)
Then the numerator in the formula becomes:
P(Class_j) * P(x_1|Class_j) * P(x_2|Class_j) * ... * P(x_n|Class_j)
Because the denominator P(x) is the same for every class, you can omit this term when looking for the maximum:
Class = argmax_j P(Class_j) * P(x_1|Class_j) * ... * P(x_n|Class_j)
But because the numerator alone does not represent the actual probability (we omitted P(x)), to obtain it you need to divide by that quantity. Since P(x) is just the sum of the numerators over all classes, this division is the normalization step:
P(Class_j|x) = numerator_j / (numerator_1 + numerator_2 + ... + numerator_K)
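A minimal sketch in Python with made-up numbers (not taken from the question) of what normalization does: the predicted class is the same either way, only the scores are turned into probabilities that sum to 1.

    import numpy as np

    # Unnormalized Naive Bayes scores for three classes on one test point,
    # i.e. P(Class_j) * prod_i P(x_i | Class_j)  (hypothetical values)
    unnormalized = np.array([0.020, 0.005, 0.015])

    # The argmax (predicted class) is unchanged by normalization
    predicted_class = np.argmax(unnormalized)       # -> 0

    # Dividing by the sum over classes (which is exactly P(x)) yields
    # proper posterior probabilities
    posteriors = unnormalized / unnormalized.sum()  # -> [0.5, 0.125, 0.375]

    print(predicted_class, posteriors)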
Some used refs:
http://shatterline.com/blog/2013/09/12/not-so-naive-classification-with-the-naive-bayes-classifier/
https://www.globalsoftwaresupport.com/naive-bayes-classifier-explained-step-step/
Upvotes: 5