Cristhian

Reputation: 371

Create a new variable containing the column name in case the value isn't NA

Suppose I have the following dataset:

data = tibble::tibble(
  id = c("x", "y", "x"),
  inputA = c(1, NA, NA),
  inputB = c(2, 1, NA),
  inputC = c(3, 2, 3)
)

Which looks like this:

# A tibble: 3 x 4
  id    inputA inputB inputC
  <chr>  <dbl>  <dbl>  <dbl>
1 x          1      2      3
2 y         NA      1      2
3 x         NA     NA      3

I want to create a new variable that identifies, for each row, which inputs are present. That is, the new variable should list the names of the input columns whose values aren't missing (NA) in that row.

The desired output should look like this:

# A tibble: 3 x 5
  id    inputA inputB inputC inputs              
  <chr>  <dbl>  <dbl>  <dbl> <chr>               
1 x          1      2      3 inputA-inputB-inputC
2 y         NA      1      2 inputB-inputC       
3 x         NA     NA      3 inputC   

The variable I want to create is inputs.

Upvotes: 2

Views: 57

Answers (1)

Ronak Shah

Reputation: 388797

Using rowwise() in dplyr:

library(dplyr)

cols <- names(data)[-1]

data %>%
  rowwise() %>%
  mutate(inputs = paste0(cols[!is.na(c_across(all_of(cols)))], collapse = '-'))

#   id    inputA inputB inputC inputs              
#  <chr>  <dbl>  <dbl>  <dbl> <chr>               
#1 x          1      2      3 inputA-inputB-inputC
#2 y         NA      1      2 inputB-inputC       
#3 x         NA     NA      3 inputC              

In base R:

data$inputs <- apply(!is.na(data[cols]), 1, function(x) 
                     paste0(cols[x], collapse = '-'))
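
Both versions rely on the same per-row step: build a logical mask with !is.na(), use it to subset the column names, and collapse what is left with "-". A minimal sketch of that step on a single row, using base R only; the helper name non_na_names is hypothetical and not part of either solution above:

```r
# Hypothetical helper (not from the answer above): join the names of the
# non-NA entries of a named vector with a separator.
non_na_names <- function(values, sep = "-") {
  paste0(names(values)[!is.na(values)], collapse = sep)
}

# Second row of the example data, written as a named vector.
row2 <- c(inputA = NA, inputB = 1, inputC = 2)
non_na_names(row2)
# [1] "inputB-inputC"

# Assumed edge case (not in the question): a row where every input is NA
# collapses to an empty string rather than NA.
all_missing <- c(inputA = NA, inputB = NA, inputC = NA)
non_na_names(all_missing)
# [1] ""
```

If an empty string is not what you want for rows with no inputs, you could convert it afterwards, for example with ifelse(data$inputs == "", NA, data$inputs).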

Upvotes: 3
