vgXhc
vgXhc

Reputation: 89

Assign variable name as value with mutate_at and case_when

I have a dataframe with some information about the values stored in some of the variable names. Basically if the value of a cell is 0, it should assign NA; otherwise I'd like to assign the variable name as the value. Probably best to explain with this example:

library(tidyverse)

df <- tibble(observation = 0:5, var1 = as.character(0:5), var2 = as.character(1:6))

df
#> # A tibble: 6 x 3
#>   observation var1  var2 
#>         <int> <chr> <chr>
#> 1           0 0     1    
#> 2           1 1     2    
#> 3           2 2     3    
#> 4           3 3     4    
#> 5           4 4     5    
#> 6           5 5     6

df %>%
  mutate_at(vars(matches("var")),
            list(~case_when(. != "0" ~ "test")))
#> # A tibble: 6 x 3
#>   observation var1  var2 
#>         <int> <chr> <chr>
#> 1           0 <NA>  test 
#> 2           1 test  test 
#> 3           2 test  test 
#> 4           3 test  test 
#> 5           4 test  test 
#> 6           5 test  test

Created on 2020-02-20 by the reprex package (v0.3.0)

What I would like the table to look like is:

#> # A tibble: 6 x 3
#>   observation var1  var2 
#>         <int> <chr> <chr>
#> 1           0 <NA>  var2 
#> 2           1 var1  var2 
#> 3           2 var1  var2 
#> 4           3 var1  var2 
#> 5           4 var1  var2 
#> 6           5 var1  var2

I can't figure out what to put into the case_when statement to access the variable name.

Thanks!

Upvotes: 1

Views: 801

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388817

We can get the data in long format, replace column name where value = 0 to NA and get data in wide format again.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = matches("var")) %>%
  mutate(value = replace(name, value == 0, NA)) %>%
  pivot_wider()

#  observation var1  var2 
#        <int> <chr> <chr>
#1           0 NA    var2 
#2           1 var1  var2 
#3           2 var1  var2 
#4           3 var1  var2 
#5           4 var1  var2 
#6           5 var1  var2 

In base R, we can do :

#get column name with "var" in it
cols  <- grep('var', names(df))
#get row/col index where value = 0
inds <- which(df[cols] == 0, arr.ind = TRUE)
#Replicate column names and assign
df[cols] <- rep(names(df[cols]), each = nrow(df))
#Replace 0 to NA 
df[cols][inds] <- NA

Upvotes: 2

akrun
akrun

Reputation: 886948

If we wrap with funs (getting deprecated), there is an option to substitute

library(dplyr)
df %>% 
   mutate_at(vars(starts_with('var')), 
         funs(case_when(.!= 0  ~  deparse(substitute(.)))))
# A tibble: 6 x 3
#  observation var1  var2 
#        <int> <chr> <chr>
#1           0 <NA>  var2 
#2           1 var1  var2 
#3           2 var1  var2 
#4           3 var1  var2 
#5           4 var1  var2 
#6           5 var1  var2 

Or another option is imap

library(purrr)
df %>% 
   select(starts_with('var')) %>% 
   imap_dfc(~ case_when(. != 0 ~ .y)) %>% 
   bind_cols(df %>%
               select(-starts_with('var')), .)
# A tibble: 6 x 3
#  observation var1  var2 
#*       <int> <chr> <chr>
#1           0 <NA>  var2 
#2           1 var1  var2 
#3           2 var1  var2 
#4           3 var1  var2 
#5           4 var1  var2 
#6           5 var1  var2 

Upvotes: 1

Related Questions