Wiwik Setyaningsih

Reputation: 41

How Information Gain Works in Text Classification

I need to learn information gain for feature selection, but I don't have a clear understanding of it yet. I am a newbie and I'm confused about it.

How do I use information gain (IG) in feature selection, with a manual calculation?

This is the only clue I have. Can anyone show me how to use the formula:

*(image: the information gain formula)*

and this is the example:

*(image: a worked example)*

Upvotes: 4

Views: 6331

Answers (2)

Parag

Reputation: 680

The formula comes from mutual information. In this case, you can think of mutual information as how much information the presence of the term t gives us for guessing the class.

*(image: the mutual information formula)*

Check: https://nlp.stanford.edu/IR-book/html/htmledition/mutual-information-1.html
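For concreteness, the formula at that link can be sketched as a small function. This is a minimal sketch (the function name and the toy counts below are my own, not from the book) that computes the expected mutual information, in bits, from a 2x2 contingency table of term presence vs. class membership:

```python
import math

def mutual_information(n11, n10, n01, n00):
    """Expected mutual information I(U;C), in bits, between a term
    (present/absent) and a class (member/non-member), from a 2x2
    contingency table of document counts:
      n11: term present, doc in class    n10: term present, doc not in class
      n01: term absent,  doc in class    n00: term absent,  doc not in class
    """
    n = n11 + n10 + n01 + n00
    total = 0.0
    for n_tc, n_t, n_c in [
        (n11, n11 + n10, n11 + n01),
        (n01, n01 + n00, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n00, n01 + n00, n10 + n00),
    ]:
        if n_tc > 0:  # 0 * log(0) is taken to be 0
            total += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return total

# A term that perfectly predicts the class carries 1 bit of information;
# a term that is independent of the class carries 0 bits.
print(mutual_information(5, 0, 0, 5))      # 1.0
print(mutual_information(25, 25, 25, 25))  # 0.0
```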

Upvotes: 0

Wasi Ahmad

Reputation: 37741

How to use information gain in feature selection?

Information gain (InfoGain(t)) measures the number of bits of information obtained for prediction of a class (c) by knowing the presence or absence of a term (t) in a document.

Concisely, information gain measures the reduction in the entropy of the class variable once the value of the feature is observed. In other words, information gain for classification measures how common a feature is in a particular class compared to how common it is in all other classes.

In text classification, a feature is a term that appears in the documents (the corpus). Consider two terms in the corpus, term1 and term2. If term1 reduces the entropy of the class variable by more than term2 does, then term1 is more useful than term2 for document classification.

Example in the context of sentiment classification

A word that occurs primarily in positive movie reviews and rarely in negative reviews carries high information. For example, the presence of the word “magnificent” in a movie review is a strong indicator that the review is positive. That makes “magnificent” a highly informative word.

Compute entropy and information gain in python
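A minimal sketch of that computation (the helper names and the toy review corpus are illustrative, not from any particular library): it scores a term by how much knowing its presence or absence reduces the entropy of the class labels.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """InfoGain(term) = H(class) - H(class | term present/absent)."""
    n = len(labels)
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    conditional = (len(with_term) / n) * entropy(with_term) \
                + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional

# Toy corpus of movie-review snippets (illustrative data)
docs = [
    "magnificent acting and story".split(),
    "a magnificent film".split(),
    "boring and slow".split(),
    "slow plot bad acting".split(),
]
labels = ["pos", "pos", "neg", "neg"]

print(information_gain(docs, labels, "magnificent"))  # 1.0 - occurs only in pos
print(information_gain(docs, labels, "acting"))       # 0.0 - occurs in both classes
```

Here “magnificent” gets the maximum score of 1 bit because its presence perfectly separates the two classes, while “acting” gets 0 because it is equally common in both.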

Upvotes: 2
