mihagazvoda

Reputation: 1367

R: Automatically convert column types based on the values

I have a tibble with lots of columns, and I don't want to convert them one by one. Let's say the tibble looks like this:

df <- tibble(
  x = c(1,0,1,1,'a'), 
  y = c('A', 'B', 1, 'D', 'A'), 
  z = c(1/3, 4, 5/7, 100, 3)
)

I want to convert the column types based on the values in another tibble:

df_map <- tibble(
  col = c('x','y','z'), 
  col_type = c('integer', 'string', 'float')
)

What's the most appropriate solution?

Upvotes: 2

Views: 2554

Answers (2)

moodymudskipper

Reputation: 47300

I would use the readr package for this task; it's part of the tidyverse.

suppressPackageStartupMessages(library(tidyverse))

# rework your col types to be compatible with ?readr::cols
df_map$col_type <- recode(df_map$col_type, integer = "i", float = "d", string = "c")

# make a vector out of df_map
vec_map <- deframe(df_map)
vec_map
#>   x   y   z 
#> "i" "c" "d"

# convert according to your specs
type_convert(df, exec(cols, !!!vec_map))
#> Warning in type_convert_col(char_cols[[i]], specs$cols[[i]],
#> which(is_character)[i], : [4, 1]: expected an integer, but got 'a'
#> # A tibble: 5 x 3
#>       x y           z
#>   <int> <chr>   <dbl>
#> 1     1 A       0.333
#> 2     0 B       4    
#> 3     1 1       0.714
#> 4     1 D     100    
#> 5    NA A       3

Upvotes: 3

shs

Reputation: 3901

Try the following:

library(purrr)
map2_dfc(df, df_map$col_type, type.convert, as.is = TRUE)

This code assumes that df_map$col is in the same order as names(df) (thanks to @Moody_Mudskipper for pointing that out).

As @NelsonGon points out, the appropriate data types in R would be "integer", "character" and "double".
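Note that `type.convert()` guesses each type from the values, and the extra positional argument supplied by `map2_dfc()` lands in its `na.strings` parameter, so the declared type is not enforced by that call. If you want the types from `df_map` applied explicitly and matched by column name rather than by order, a minimal sketch could look like the following. It assumes `df_map` uses the R type names mentioned above; `converters` and `convert_by_map()` are hypothetical names introduced only for this illustration.

```r
library(tibble)

df <- tibble(
  x = c(1, 0, 1, 1, "a"),
  y = c("A", "B", 1, "D", "A"),
  z = c(1/3, 4, 5/7, 100, 3)
)

# Assumption: the map uses R's own type names instead of 'string' and 'float'.
df_map <- tibble(
  col = c("x", "y", "z"),
  col_type = c("integer", "character", "double")
)

# Hypothetical lookup table: one base R coercion function per type name.
converters <- list(
  integer = as.integer,
  character = as.character,
  double = as.double
)

# Hypothetical helper: coerce every column listed in `map` to its declared type.
convert_by_map <- function(data, map) {
  for (i in seq_len(nrow(map))) {
    col <- map$col[[i]]
    convert <- converters[[map$col_type[[i]]]]
    # Values that cannot be coerced (such as "a") become NA;
    # the coercion warning is silenced here.
    data[[col]] <- suppressWarnings(convert(data[[col]]))
  }
  data
}

df_converted <- convert_by_map(df, df_map)
```

Because the lookup goes through `map$col`, the rows of `df_map` can be in any order, and columns that are not listed are left untouched.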

Edit: as requested in the comments, this version first recodes boolean variables stored as 1/2 to 0/1 before converting the types:

library(tidyverse)
df %>% 
  mutate_if(~identical(sort(unique(.)), c(1,2)), ~{. - 1}) %>% 
  map2_dfc(df_map$col_type, type.convert, as.is = TRUE)

Upvotes: 5
