Marcelo Avila

Reputation: 2374

How to use a list as a dictionary to mutate or create a new variable while keeping its type (factor) and level ordering

I hope it is clear what I am looking for. I have a named list whose names are variable names and whose contents are the nicer-looking variable labels. I wish to replace a variable, or create a new one, with the labels while keeping the same characteristics as the original column.

library(dplyr)


list_of_dict <- list(var_x = "Label of Variable X", var_y = "Label of Variable Y")
list_of_dict
#> $var_x
#> [1] "Label of Variable X"
#> 
#> $var_y
#> [1] "Label of Variable Y"

tib <- tibble(vars = factor(c("var_x", "var_y")))
tib
#> # A tibble: 2 × 1
#>   vars 
#>   <fct>
#> 1 var_x
#> 2 var_y


extract_varlabs_from_list <- function(x, list) list[as.character(x)] %>% unlist()
# the `as.character` is necessary so the order of factor does not mess up 
# with the order of the list. 

tib %>% mutate(
  vars_labelled = extract_varlabs_from_list(vars, list_of_dict)
)
#> # A tibble: 2 × 2
#>   vars  vars_labelled      
#>   <fct> <chr>              
#> 1 var_x Label of Variable X
#> 2 var_y Label of Variable Y

Created on 2022-10-06 with reprex v2.0.2

This is nearly what I need, but I would like to keep the characteristics of vars, such as being a factor and, importantly, having the same level ordering.

Upvotes: 0

Views: 254

Answers (1)

Ottie

Reputation: 1030

Under the hood, a factor is an integer vector with labels (levels). You can rename the levels alone without touching the underlying integer codes.

levels(tib$vars) <- c("Label of Variable X", "Label of Variable Y")
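To illustrate the point about the integer codes, here is a minimal base R sketch. The factor `f` is a hypothetical stand-in for `tib$vars`, built with a non-alphabetical level order on purpose. Note that assigning to `levels()` this way is positional, so the new labels have to be given in the same order as the existing levels.

```r
# `f` is a hypothetical stand-in for tib$vars, with var_y as the first level
f <- factor(c("var_y", "var_x", "var_y"), levels = c("var_y", "var_x"))

codes_before <- as.integer(f)  # the underlying integer codes

# New labels must follow the existing level order (var_y first, then var_x)
levels(f) <- c("Label of Variable Y", "Label of Variable X")

codes_after <- as.integer(f)

# The codes are unchanged; only the labels attached to them differ
identical(codes_before, codes_after)
levels(f)
```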

If you want to retain the original column, you could also duplicate it and rename the levels of the copy. If you have a lot of levels, you can inspect the current levels first with

levels(tib$vars)
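A rough sketch of the duplicate-then-rename idea, using a plain data frame so it runs without extra packages. The column name `vars_labelled` is borrowed from the question; `df` is a hypothetical stand-in for `tib`.

```r
# `df` is a hypothetical stand-in for the tibble in the question
df <- data.frame(vars = factor(c("var_x", "var_y")))

# Copy the column first, then rename the levels of the copy only
df$vars_labelled <- df$vars
levels(df$vars_labelled) <- c("Label of Variable X", "Label of Variable Y")

levels(df$vars)           # original levels are untouched
levels(df$vars_labelled)  # the copy carries the new labels
```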

If you prefer a dictionary approach, I would recommend using a named vector rather than a list, as that allows for simpler indexing:

dict <- setNames(
    c("Label of Variable X", "Label of Variable Y"),
    c("var_x", "var_y")
)


levels(tib$vars) <- dict[levels(tib$vars)]
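If the labels already live in a named list, as in the question, one possible way to combine the pieces is sketched below. The helper `relabel_factor` is a hypothetical name, not part of any package; it assumes every level of the factor has exactly one label in the list, and it looks the labels up by name, so the level order of the input is preserved.

```r
list_of_dict <- list(var_x = "Label of Variable X", var_y = "Label of Variable Y")

# Hypothetical helper: relabel a factor from a named list or named vector
relabel_factor <- function(x, dict) {
  dict <- unlist(dict)  # named list -> named character vector
  # Assumption: every level has a label; fail early otherwise
  stopifnot(all(levels(x) %in% names(dict)))
  # Look up labels by level name, so the existing level order is kept
  levels(x) <- unname(dict[levels(x)])
  x
}

# Non-alphabetical level order, to show that it survives the relabelling
vars <- factor(c("var_x", "var_y", "var_x"), levels = c("var_y", "var_x"))
vars_labelled <- relabel_factor(vars, list_of_dict)

levels(vars_labelled)
```

Since the helper returns a factor, it should also work inside mutate(), for example mutate(vars_labelled = relabel_factor(vars, list_of_dict)).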

Upvotes: 0
