Assign variable name as value with mutate_at and case_when

Question

I have a dataframe with some information about the values stored in some of the variable names. Basically if the value of a cell is 0, it should assign NA; otherwise I'd like to assign the variable name as the value. Probably best to explain with this example:

library(tidyverse)

df <- tibble(observation = 0:5, var1 = as.character(0:5), var2 = as.character(1:6))

df
#> # A tibble: 6 x 3
#>   observation var1  var2 
#>           
#> 1           0 0     1    
#> 2           1 1     2    
#> 3           2 2     3    
#> 4           3 3     4    
#> 5           4 4     5    
#> 6           5 5     6

df %>%
  mutate_at(vars(matches("var")),
            list(~case_when(. != "0" ~ "test")))
#> # A tibble: 6 x 3
#>   observation var1  var2 
#>           
#> 1           0   test 
#> 2           1 test  test 
#> 3           2 test  test 
#> 4           3 test  test 
#> 5           4 test  test 
#> 6           5 test  test

^{Created on 2020-02-20 by the reprex package (v0.3.0)}

What I would like the table to look like is:

#> # A tibble: 6 x 3
#>   observation var1  var2 
#>           
#> 1           0   var2 
#> 2           1 var1  var2 
#> 3           2 var1  var2 
#> 4           3 var1  var2 
#> 5           4 var1  var2 
#> 6           5 var1  var2

I can't figure out what to put into the case_when statement to access the variable name.

Thanks!

Ronak Shah · Accepted Answer

We can get the data in long format, replace column name where value = 0 to NA and get data in wide format again.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = matches("var")) %>%
  mutate(value = replace(name, value == 0, NA)) %>%
  pivot_wider()

#  observation var1  var2 
#          
#1           0 NA    var2 
#2           1 var1  var2 
#3           2 var1  var2 
#4           3 var1  var2 
#5           4 var1  var2 
#6           5 var1  var2

In base R, we can do :

#get column name with "var" in it
cols  <- grep('var', names(df))
#get row/col index where value = 0
inds <- which(df[cols] == 0, arr.ind = TRUE)
#Replicate column names and assign
df[cols] <- rep(names(df[cols]), each = nrow(df))
#Replace 0 to NA 
df[cols][inds] <- NA

Assign variable name as value with mutate_at and case_when

Answers (2)

Related Questions