Minoru

Reputation: 1730

R - convert from categorical to numeric for KNN

I'm trying to use the caret package in R to apply KNN to the "abalone" dataset from the UCI Machine Learning Repository (link to the data). But it doesn't allow KNN when there are categorical values. How do I convert the categorical values (in this dataset: "M", "F", "I") to numeric values, such as 1, 2, 3, respectively?

Upvotes: 1

Views: 19650

Answers (5)

user11139847

Reputation:

Try using the knncat package in R, which converts categorical variables into their numerical counterparts.

Here's the link for the package

Upvotes: 0

Lohith Arcot

Reputation: 1186

You can simply read the file with stringsAsFactors = TRUE.

Example:

data_raw <- read.csv('...../credit-default.csv', stringsAsFactors = TRUE)

With stringsAsFactors = TRUE, character columns are read as factors, which R stores internally as integer codes.
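A minimal sketch of what this does, using a hypothetical three-row sample in place of the real file (the column names `Sex` and `Length` are assumptions for illustration). It suggests that the column becomes a factor rather than a true numeric vector, so the integer codes still have to be extracted explicitly:

```r
# Hypothetical inline sample standing in for the real CSV file.
csv_text <- "Sex,Length\nM,0.455\nF,0.530\nI,0.330"
abalone_sample <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(abalone_sample$Sex)        # "factor"
is.numeric(abalone_sample$Sex)   # FALSE: a factor is not a numeric vector
levels(abalone_sample$Sex)       # levels are sorted, here "F" "I" "M"
as.integer(abalone_sample$Sex)   # underlying codes, here 3 1 2
```

Note that the codes follow the sorted level order, not the order in which the values appear in the file.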

Upvotes: 0

Amir

Reputation: 33

One of the easiest ways to use the kNN algorithm on a dataset with a categorical feature ("M", "F" and "I", as you mentioned) is this: in the CSV or Excel file that holds your dataset, go to the relevant column and change M to 1, F to 2 and I to 3. The feature then has discrete numeric values, and you can use the kNN algorithm in R.

Upvotes: 1

topepo

Reputation: 14316

The first answer seems like a really bad idea. Coding {"M","F","I"} to {1, 2, 3} implies that Infant = 3 * Male, Male = Female/2 and so on.

KNN via caret does allow categorical values as predictors if you use the formula methods. Otherwise you need to encode them as binary dummy variables.
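As a rough illustration of the dummy-variable route, here is a sketch that uses base R's `model.matrix` rather than caret, so it runs without extra packages. The data frame and its column names are hypothetical. It also compares distances between the three levels under both codings, since distance is what KNN works with:

```r
# Hypothetical sample; the real abalone data has more columns.
sex <- factor(c("M", "F", "I", "M"), levels = c("M", "F", "I"))
df <- data.frame(Sex = sex, Length = c(0.455, 0.530, 0.330, 0.440))

# One 0/1 indicator column per level; "- 1" drops the intercept
# so that all three levels get their own column.
dummies <- model.matrix(~ Sex - 1, data = df)
colnames(dummies)                # "SexM" "SexF" "SexI"

# Numeric predictor matrix that a distance-based method could use.
encoded <- cbind(dummies, Length = df$Length)

# Distances between the three levels under each coding.
level_rows <- dummies[1:3, ]     # rows for M, F, I
dist(level_rows)                 # every pair is sqrt(2) apart
dist(c(M = 1, F = 2, I = 3))     # M and I are twice as far apart as M and F
```

With the indicator columns no level is treated as closer to one level than to another, whereas the 1, 2, 3 coding imposes an ordering that the categories do not have.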

Also, showing your code and having a reproducible example would help a lot.

Max

Upvotes: 16

Dieter Menne

Reputation: 10215

When data are read in via read.table with stringsAsFactors = TRUE (the default before R 4.0.0), the data in the first column are factors. Then

data$iGender = as.integer(data$Gender) 

would work. If they are character, a detour via factor is easiest:

data$iGender = as.integer(as.factor(data$Gender))
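One thing to watch, shown in the sketch below with a hypothetical sample vector: `as.factor` sorts the levels, so the codes come out as F = 1, I = 2, M = 3. If the mapping M = 1, F = 2, I = 3 from the question is wanted, the levels can be given explicitly with `factor`:

```r
gender <- c("M", "F", "I", "F")   # hypothetical sample values

# Default: levels are sorted, so F = 1, I = 2, M = 3.
as.integer(as.factor(gender))                            # 3 1 2 1

# Explicit level order: M = 1, F = 2, I = 3.
as.integer(factor(gender, levels = c("M", "F", "I")))    # 1 2 3 2
```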

Upvotes: 3
