Lin

Reputation: 1

How to select only the best features by setting a threshold on FSelector information gain in R?

I have run information gain feature selection in R using the FSelector package:

install.packages("RWekajars")
install.packages("FSelector")
library(FSelector)

weights <- information.gain(Classname~., df)

Attributes                                          attr_importance
X.1                                              3.6349780
X                                                3.6349780
Value_1                                          3.7128973
Value_1                                          0.9652070
Item_1                                           2.0845525

Now I need to select the best features from this output based on attr_importance. How do I select the best features in R based on a threshold value, and how do I set that threshold?

Upvotes: 0

Views: 649

Answers (1)

Dan

Reputation: 513

The FSelector package provides a family of cutoff functions that solve your problem:

  • cutoff.k chooses k best attributes
  • cutoff.k.percent chooses best k * 100% of attributes
  • cutoff.biggest.diff chooses a subset of attributes which are significantly better than the others.

E.g.: results <- cutoff.k.percent(weights, 0.9) will return the best 90% of the attributes, ranked by weight. Or: results <- cutoff.k(weights, 2) will return the 2 attributes with the most information gain. Does this solve your problem?
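
If you want a fixed threshold on attr_importance instead of a count or a percentage, plain data frame subsetting should be enough, since information.gain returns a data frame with a single attr_importance column and the attribute names as row names. Below is a minimal sketch in base R. The weights table, the attribute names f1 to f4 and the 0.1 threshold are made up for illustration; in your case weights would come from information.gain(Classname~., df). As far as I know there is no universally correct threshold: look at the sorted weights for a natural gap, or compare model performance for a few candidate values.

```r
# Hypothetical weights table with the same shape information.gain() returns:
# one column, attr_importance, with attribute names as row names.
weights <- data.frame(
  attr_importance = c(0.45, 0.02, 0.30, 0.00),
  row.names = c("f1", "f2", "f3", "f4")
)

# Keep attributes whose importance exceeds the threshold (0.1 is arbitrary).
threshold <- 0.1
selected <- rownames(weights)[weights$attr_importance > threshold]

# Order the surviving attributes from most to least informative.
selected <- selected[order(weights[selected, "attr_importance"], decreasing = TRUE)]
print(selected)

# Build a model formula from the selected attributes
# ("Classname" is the class column from the question).
f <- reformulate(selected, response = "Classname")
print(f)
```

The resulting formula can then be passed to whatever model you train on df.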

Upvotes: 0
