Gospel77

Reputation: 153

In Weka, what's the difference between the attribute filters - Discretize and NumericToNominal?

I'm using nominal independent variables such as 'gender', 'education_level', and 'marital_status', and a nominal dependent variable, 'True_or_false'.

I have created an ARFF file with attributes labelled with their datatype. In the case of nominal attributes, I have also listed the meanings of their number assignments.

I would like to know not only which filter (Discretize or NumericToNominal) is correct for such variables, but also how the two filters differ.

Upvotes: 0

Views: 172

Answers (1)

fracpete

Reputation: 2608

The Discretize filters (the supervised version also takes the class attribute into account) turn a continuous variable (e.g., distance_travelled) into bins, based on the distribution of values (check the synopsis of each filter for details).
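To make the idea of binning concrete, here is a minimal plain-Java sketch of equal-width binning, which is one common discretization strategy. It does not use Weka and is not Weka's implementation; the class name `EqualWidthBinningSketch`, the method `discretize`, and the sample values are all made up for illustration.

```java
import java.util.Arrays;

// Illustrative only: a hypothetical helper, not part of Weka.
public class EqualWidthBinningSketch {

    // Assigns each value to one of `bins` equal-width intervals spanning [min, max].
    public static int[] discretize(double[] values, int bins) {
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double width = (max - min) / bins;
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // If all values are equal, everything falls into bin 0.
            int bin = (width == 0) ? 0 : (int) ((values[i] - min) / width);
            // The maximum value would land one past the last bin, so clamp it.
            result[i] = Math.min(bin, bins - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up continuous attribute, e.g. distance_travelled.
        double[] distanceTravelled = {0.0, 2.5, 5.0, 7.5, 10.0};
        System.out.println(Arrays.toString(discretize(distanceTravelled, 2)));
    }
}
```

The point is that the output categories are ranges of the original values, so the numeric ordering and magnitude matter when the bins are formed.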

The NumericToNominal filter is for situations where a categorical variable (e.g., mode_of_transport like car/bike/bus represented by 0/1/2) got interpreted incorrectly as a numeric one (a value of 1.5 makes no sense in such cases). This can happen during CSV imports, or with similar sources where no metadata about each column's data type is available. This filter simply turns the numbers it encounters into string labels.
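For contrast, here is a minimal plain-Java sketch of what "turning numbers into string labels" amounts to. Again, this is not Weka code; the class `NumericToNominalSketch`, its methods, and the sample codes are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: a hypothetical helper, not part of Weka.
public class NumericToNominalSketch {

    // Collects the distinct numeric codes (sorted) and returns them as string labels.
    public static List<String> labels(double[] codes) {
        TreeSet<Double> distinct = new TreeSet<>();
        for (double code : codes) {
            distinct.add(code);
        }
        List<String> result = new ArrayList<>();
        for (double value : distinct) {
            result.add(format(value));
        }
        return result;
    }

    // Prints whole numbers without a trailing ".0" so 1.0 becomes the label "1".
    private static String format(double value) {
        return value == Math.rint(value) ? String.valueOf((long) value) : String.valueOf(value);
    }

    public static void main(String[] args) {
        // Made-up codes for mode_of_transport: 0 = car, 1 = bike, 2 = bus.
        double[] modeOfTransport = {0, 1, 2, 1, 0};
        System.out.println(labels(modeOfTransport));
    }
}
```

No ranges are formed here: each distinct number becomes its own category, and the labels are just the numbers written as text, not the original names such as car or bike.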

Upvotes: 1
