noob

Reputation: 3811

For loop to convert multiple columns to factors in R

I have a few columns that I need to convert to factors:

for cols in ['col1','col2']:
  df$cols<-as.factor(as.character(df$cols))

Error

for cols in ['col1','col2']:
Error: unexpected symbol in "for cols"
>   df$cols<-as.factor(as.character(df$cols))
Error in `$<-.data.frame`(`*tmp*`, cols, value = integer(0)) : 
  replacement has 0 rows, data has 942

Upvotes: 2

Views: 1176

Answers (2)

akrun

Reputation: 887541

The syntax shown uses a Python for loop and a Python list. In R, it would instead be a vector of strings:

for (col in c('col1','col2')) {
  df[[col]] <- factor(df[[col]])
}

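NOTE: here we use [[ instead of $, and wrap the loop body in braces {}. The $ operator takes the name after it literally, so df$cols looks for a column named cols instead of using the value of the loop variable. Also, factor can be applied directly, without wrapping in as.character.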
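To see the difference between $ and [[ in isolation, here is a minimal sketch on a made-up data frame. The column values and the variable name target are assumptions for illustration; only the column names come from the question.

```r
# Hypothetical toy data; only the column names come from the question.
df <- data.frame(col1 = c("a", "b", "a"),
                 col2 = c("x", "y", "x"),
                 stringsAsFactors = FALSE)

target <- "col1"

# `$` uses the name literally, so this looks for a column called "target".
dollar_result <- df$target        # NULL, because no such column exists

# `[[` evaluates the variable, so this returns the col1 vector.
bracket_result <- df[[target]]

# The loop from the answer, applied to the toy data.
for (col in c("col1", "col2")) {
  df[[col]] <- factor(df[[col]])
}
```

Because df$target is NULL, anything built from it has length 0, which is consistent with the "replacement has 0 rows" error in the question.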


Or with lapply, where it can be done easily (without using any packages):

df[c('col1', 'col2')] <- lapply(df[c('col1', 'col2')], factor)
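As a quick check that only the selected columns change, here is a small sketch with a made-up data frame. The extra column other and all values are assumptions for illustration.

```r
# Hypothetical toy data with one column that should stay untouched.
df <- data.frame(col1  = c("a", "b", "a"),
                 col2  = c(1, 2, 1),
                 other = c(10, 20, 30),
                 stringsAsFactors = FALSE)

cols <- c("col1", "col2")

# Convert only the listed columns; lapply returns a list of factors
# that is assigned back into the same columns.
df[cols] <- lapply(df[cols], factor)

# Inspect the resulting column classes.
classes <- sapply(df, class)
print(classes)
```

Storing the column names in a vector such as cols avoids repeating them on both sides of the assignment.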

Or in dplyr, where it can be done more easily:

library(dplyr)
df <- df %>%
          mutate_at(vars(col1, col2), factor)
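In dplyr 1.0.0 and later, mutate_at is superseded by across(). A possible equivalent, assuming a recent dplyr is installed and using made-up data for illustration, might look like this:

```r
library(dplyr)  # assumes dplyr >= 1.0.0 is installed

# Hypothetical toy data; only the column names come from the question.
df <- data.frame(col1 = c("a", "b", "a"),
                 col2 = c("x", "y", "x"),
                 stringsAsFactors = FALSE)

# across() selects the columns and applies factor() to each of them.
df <- df %>%
  mutate(across(c(col1, col2), factor))
```

The mutate_at version above still works; across() is simply the form the current dplyr documentation recommends.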

Upvotes: 1

linog

Reputation: 6226

To complement @akrun's solution: with data.table, this can also be done easily:

library(data.table)
setDT(df)
df[,c("col1","col2") := lapply(.SD, function(c) as.factor(as.character(c))), .SDcols = c("col1","col2")]

Note that df is updated by reference (:=), so there is no need to reassign it.

Upvotes: 1
