Reputation: 1
Suppose I have a spam / non-spam email classifier. If a new email has been classified as spam, how do I determine which words in the mail were mainly responsible for the classifier classifying it as SPAM?
For example, if a mail has the following text :
Get 10000 dollars free by clicking here.
The main words responsible for classifying the mail as SPAM are "10000 dollars free".
Upvotes: -1
Views: 205
Reputation: 1135
I'm going to answer this question assuming that you have used the Naive Bayes classifier for classification.
The Naive Bayes classifier is a rather simple algorithm that has been successfully employed in the field of spam detection.
It is based on conditional probability and makes use of Bayes' theorem:
P (a|b) = P (b|a) * P (a) / P (b)
Suppose that there are two classes into which a Naive Bayes classifier can classify a piece of text (an email): spam and not spam.
The equation mentioned above applied to the task of spam detection can be translated as follows:
P (class | text) = P (text | class) * P (class) / P (text)
Since the text is made up of words, it can be represented as a combination of words: text -> w1, w2, ..., wn
This translates to,
P (class | w1, w2, ..., wn) = P (w1, w2, ..., wn | class) * P (class) / P (w1, w2, ..., wn)
Since the Naive Bayes classifier makes the naive assumption that the words are conditionally independent of each other given the class, this translates to:
P (class | w1, w2, ..., wn) ∝ P (w1 | class) * P (w2 | class) * ... * P (wn | class) * P (class)
for all the classes ('spam' and 'not spam' in our example).
I dropped the denominator and turned the equality into a proportionality, since P (w1, w2, ..., wn) is the same for every class and therefore does not change which class gets the highest score.
Here, P (class) is the prior probability of a given class ('spam' or 'not spam'). Suppose you have 100 training examples of which 60 are spam and 40 are not spam; then the class probabilities of 'spam' and 'not spam' would be 0.6 and 0.4 respectively.
P (w | class) is the probability of a word given a class. In the Naive Bayes classifier it is estimated from the word counts in the training examples of that class.
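To make this concrete, here is a minimal sketch of how both quantities can be estimated and combined. The tiny corpus and the whitespace tokenizer are made-up assumptions, and log-probabilities are summed rather than multiplying raw probabilities, to avoid numerical underflow:

    import math
    from collections import Counter

    # Hypothetical toy training data; a real corpus would be far larger.
    spam_docs = ["get free dollars now", "win money by clicking here"]
    ham_docs = ["meeting moved to friday", "please review the attached report"]

    def count_words(docs):
        """Aggregate word counts over a list of documents."""
        counts = Counter()
        for doc in docs:
            counts.update(doc.split())
        return counts

    spam_counts, ham_counts = count_words(spam_docs), count_words(ham_docs)
    vocab = set(spam_counts) | set(ham_counts)

    # Class priors: the fraction of training examples in each class.
    p_spam = len(spam_docs) / (len(spam_docs) + len(ham_docs))
    p_ham = 1.0 - p_spam

    def word_prob(word, counts):
        """P(word | class) with Laplace (add-one) smoothing, so unseen
        words do not zero out the whole product."""
        return (counts[word] + 1) / (sum(counts.values()) + len(vocab))

    def log_score(text, counts, prior):
        """log P(class) + sum over words of log P(w | class)."""
        return math.log(prior) + sum(
            math.log(word_prob(w, counts)) for w in text.split())

    text = "get 10000 dollars free by clicking here"
    is_spam = log_score(text, spam_counts, p_spam) > log_score(text, ham_counts, p_ham)
    print("spam" if is_spam else "not spam")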
Let's consider the example that you have given,
Get 10000 dollars free by clicking here.
The Naive Bayes classifier would have already calculated the probabilities of the words Get, 10000, dollars, free, by, clicking, here in each class (spam and not spam) from the training data.
If the sentence was classified as spam, you can find the words which contributed most to that decision by comparing each word's probability in the spam class against its probability in the not-spam class, as shown below.
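For instance, with invented per-word probabilities standing in for what a trained model would have stored, the ranking can be done with the log-likelihood ratio log(P (w | spam) / P (w | not spam)); positive values favour spam, negative values favour not spam:

    import math

    # Hypothetical per-word probabilities that a trained Naive Bayes
    # model would already have computed (all values made up).
    p_word_spam = {"get": 0.05, "10000": 0.04, "dollars": 0.06, "free": 0.09,
                   "by": 0.03, "clicking": 0.05, "here": 0.03}
    p_word_ham = {"get": 0.04, "10000": 0.001, "dollars": 0.002, "free": 0.005,
                  "by": 0.03, "clicking": 0.004, "here": 0.03}

    def spamminess(word):
        """Log-likelihood ratio of a word; higher means more spam-like."""
        return math.log(p_word_spam[word] / p_word_ham[word])

    words = "get 10000 dollars free by clicking here".split()
    for w in sorted(words, key=spamminess, reverse=True):
        print(f"{w:10s} {spamminess(w):+.2f}")

With these numbers, '10000', 'dollars' and 'free' come out on top, matching the intuition in the question.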
Here you can find a Simple Naive Bayes implementation applied to the task of spam detection in emails.
Upvotes: 1
Reputation: 66850
This fully depends on your model. However, I will give you a general, mathematical approach, and then a few practical solutions.
Let us assume that your classifier is probabilistic, in the sense that it provides a support (confidence) for its decision (this includes neural networks, Naive Bayes, LDA, logistic regression, etc.):
f(x) = P(ham|x)
Then, if you want to answer "which dimension (feature) of x alters my decision the most", all you have to do is analyze the gradient (the gradient, being a vector of partial derivatives, shows which dimensions affect the output the most), thus:
most_important_feature_if_it_is_classified_as_ham = arg max_i (grad_x[f])_i
and symmetrically, if it is spam, then
most_important_feature_if_it_is_classified_as_spam = arg min_i (grad_x[f])_i
All you need is the ability to differentiate your model. This again is possible for many existing ones like neural nets, Naive Bayes, LDA or logistic regression.
I list a few more or less direct methods of computing the above for typical models.
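For instance, for a logistic-regression model f(x) = sigmoid(w . x + b) over a bag-of-words vector x, the gradient has the closed form grad_x f = f(x) * (1 - f(x)) * w, so the ranking reduces to the weights scaled by a positive constant. A minimal sketch, with an invented vocabulary and invented weights:

    import numpy as np

    # Hypothetical vocabulary and learned weights; f(x) = P(ham | x).
    vocab = ["get", "10000", "dollars", "free", "by", "clicking", "here"]
    w = np.array([-0.2, -1.5, -1.1, -1.8, 0.1, -0.9, 0.2])
    b = 0.5

    def f(x):
        """P(ham | x) under the logistic model."""
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))

    x = np.ones(len(vocab))  # each vocabulary word appears once

    # Gradient of the sigmoid model with respect to the input:
    # grad_x f = f(x) * (1 - f(x)) * w
    grad = f(x) * (1.0 - f(x)) * w

    print("most ham-indicating word: ", vocab[int(np.argmax(grad))])
    print("most spam-indicating word:", vocab[int(np.argmin(grad))])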
Upvotes: 1