Cuchulainn

Reputation: 61

Converting columns to factors across a list of data frames

I'm trying to convert several columns in each data frame of a list into factors. I've tried the loop below, but it doesn't give me data frames with those columns converted to factors:

factor_cols_REx <- c('GESLACHT','GEVKL','BEROEP')
for (i in (1:9)) {
  dataset_RE10_2014[[i]] <- lapply(dataset_RE10_2014[[i]][factor_cols_REx],factor)
  dataset_RE10_2015[[i]] <- lapply(dataset_RE10_2015[[i]][factor_cols_REx],factor)
}

Any ideas on how to fix this?

Upvotes: 0

Views: 418

Answers (3)

d.b

Reputation: 32548

Let me know if I've understood the question correctly. The example below uses mtcars as stand-in data:

#DATA
dat = list(A = mtcars, B = mtcars)
#Columns we want to convert to factor
factor_cols = c("mpg", "hp")

#Go through the list using lapply and change specific columns to factor in each sub-group
#Modified from https://stackoverflow.com/a/33180265/7128934
dat2 = lapply(dat, function(x) {
  x[factor_cols] = lapply(x[factor_cols], factor)
  x
})

#Check class in output list
lapply(dat2, function(x) sapply(x, class))
#$A
#      mpg       cyl      disp        hp      drat        wt      qsec        vs        am      gear      carb 
# "factor" "numeric" "numeric"  "factor" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 

#$B
#      mpg       cyl      disp        hp      drat        wt      qsec        vs        am      gear      carb 
# "factor" "numeric" "numeric"  "factor" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 

#Check class in input list
lapply(dat, function(x) sapply(x, class))
#$A
#      mpg       cyl      disp        hp      drat        wt      qsec        vs        am      gear      carb 
#"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 

#$B
#      mpg       cyl      disp        hp      drat        wt      qsec        vs        am      gear      carb 
#"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" 

Upvotes: 4

Jake Kaupp

Reputation: 8072

An approach using dplyr and purrr:

library(dplyr)
library(purrr) 

factor_cols_REx <- c('GESLACHT','GEVKL','BEROEP')

dataset_RE10_2014 <- map(dataset_RE10_2014, ~mutate_at(.x, factor_cols_REx, factor))

dataset_RE10_2015 <- map(dataset_RE10_2015, ~mutate_at(.x, factor_cols_REx, factor))
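In dplyr 1.0.0 and later, mutate_at() is superseded by across(), so the same idea could also be written as in the sketch below. The toy list and the extra LEEFTIJD column are made up for illustration; only the three factor column names come from the question.

```r
library(dplyr)
library(purrr)

factor_cols_REx <- c('GESLACHT', 'GEVKL', 'BEROEP')

# Hypothetical toy data standing in for dataset_RE10_2014
toy <- list(
  data.frame(GESLACHT = c(1, 2), GEVKL = c(3, 4), BEROEP = c(10, 20), LEEFTIJD = c(34, 51)),
  data.frame(GESLACHT = c(2, 2), GEVKL = c(1, 4), BEROEP = c(30, 20), LEEFTIJD = c(27, 45))
)

# Convert only the named columns in every data frame of the list
toy_fct <- map(toy, ~ mutate(.x, across(all_of(factor_cols_REx), factor)))

# Inspect the column classes of the first data frame
sapply(toy_fct[[1]], class)
```

all_of() makes the selection strict, so a missing column name raises an error instead of being silently skipped.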

Upvotes: 2

akrun

Reputation: 887193

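The same subset needs to appear on both the LHS and the RHS of <-. Otherwise the list returned by lapply() replaces the whole data frame instead of just the selected columns: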

for (i in (1:9)) {
  dataset_RE10_2014[[i]][factor_cols_REx] <- lapply(dataset_RE10_2014[[i]][factor_cols_REx],
                                                    factor)
  dataset_RE10_2015[[i]][factor_cols_REx] <- lapply(dataset_RE10_2015[[i]][factor_cols_REx],
                                                    factor)
}
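
To see why the left-hand side matters, here is a small sketch on made-up data. The toy data frames, their values, and the extra LEEFTIJD column are assumptions for illustration; only the three factor column names come from the question. It also uses seq_along() rather than a hard-coded 1:9 so the loop follows the length of the list.

```r
factor_cols_REx <- c('GESLACHT', 'GEVKL', 'BEROEP')

# Hypothetical toy data standing in for one of the question's lists
make_df <- function() {
  data.frame(GESLACHT = c(1, 2, 1),
             GEVKL    = c(3, 3, 4),
             BEROEP   = c(10, 20, 10),
             LEEFTIJD = c(34, 51, 27))  # extra column that should stay numeric
}
toy <- list(make_df(), make_df())

# Pattern from the question: the whole list element is overwritten
broken <- toy
for (i in seq_along(broken)) {
  broken[[i]] <- lapply(broken[[i]][factor_cols_REx], factor)
}
class(broken[[1]])   # a plain list, no longer a data frame
names(broken[[1]])   # only the three selected columns remain

# Corrected pattern: same subset on both sides of <-
fixed <- toy
for (i in seq_along(fixed)) {
  fixed[[i]][factor_cols_REx] <- lapply(fixed[[i]][factor_cols_REx], factor)
}
sapply(fixed[[1]], class)   # selected columns are factors, the rest untouched
```

In the first loop each element ends up as a list of three factors and every other column is dropped, while the second loop keeps the data frame intact and changes only the selected columns.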

Upvotes: 1
