Reputation: 61

Create a new variable with an existing variable name in a data frame, filling it when matching a non NA value in each of the variable lists

I want to create a column, C, in dfABy that holds the name of the existing variable (A or B) whose value in that row is not NA. For example, my df is:

The result I expect is:

Upvotes: 2

Answers (4)

Anoushiravan R

Reputation: 21938

This is just another solution; the other proposed solutions are better.

library(dplyr)
library(purrr)

df %>%
  rowwise() %>%
  mutate(C = detect_index(c(A, B), ~ !is.na(.x)), 
         C = names(.[C]))

# A tibble: 5 x 3
# Rowwise: 
      A     B C    
  <dbl> <dbl> <chr>
1    56    NA A    
2    NA    45 B    
3    NA    77 B    
4    67    NA A    
5    NA    65 B

Upvotes: 1

Ronak Shah

Reputation: 389265

You can apply max.col to the !is.na() matrix to get, for each row, the column number that holds the non-NA value. From those numbers you can get the column names.

dfABy$C <- names(dfABy)[max.col(!is.na(dfABy))] 
dfABy

#   A  B C
#1 56 NA A
#2 NA 45 B
#3 NA 77 B
#4 67 NA A
#5 NA 65 B

If there is more than one non-NA value in a row, take a look at the ties.method argument in ?max.col to see how ties are handled.
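As a minimal sketch of the tie handling, assume a hypothetical data frame dfTies (not from the question) in which one row has two non-NA values and one row has none. With ties.method = "first" the leftmost non-NA column wins. A row that is entirely NA is also a tie (every entry is FALSE), so max.col still returns a column index for it; the sketch overwrites that case with NA.

```r
# Hypothetical data: row 1 has two non-NA values, row 4 has none
dfTies <- data.frame(A = c(56L, NA, 67L, NA), B = c(12L, 45L, NA, NA))

# Logical matrix: TRUE where a value is present
present <- !is.na(dfTies[c("A", "B")])

# Take the first (leftmost) non-NA column in each row
dfTies$C <- colnames(present)[max.col(present, ties.method = "first")]

# Rows with no non-NA value at all should not get a column name
dfTies$C[rowSums(present) == 0] <- NA

dfTies
```

Using "last" instead of "first" would pick the rightmost non-NA column, and the default, "random", picks among the tied columns at random.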

data

dfABy <- structure(list(A = c(56L, NA, NA, 67L, NA), B = c(NA, 45L, 77L, 
NA, 65L)), class = "data.frame", row.names = c(NA, -5L))

Upvotes: 2

koolmees

Reputation: 2783

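Using the data.table package (after converting dfABy to a data.table with setDT), I recommend:

library(data.table)
setDT(dfABy)[, C := apply(.SD, 1, function(x) names(x[!is.na(x)]))]  # name of the non-NA column in each row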

creating the following output:

    A   B   C
1   56  NA  A
2   NA  45  B
3   NA  77  B
4   67  NA  A
5   NA  65  B

Upvotes: 1

tmfmnk

Reputation: 40171

One option using dplyr could be:

df %>%
    rowwise() %>%
    mutate(C = names(.[!is.na(c_across(everything()))]))

      A     B C    
  <int> <int> <chr>
1    56    NA A    
2    NA    45 B    
3    NA    77 B    
4    67    NA A    
5    NA    65 B

Or with the addition of purrr:

df %>%
    mutate(C = pmap_chr(across(A:B), ~ names(c(...)[!is.na(c(...))])))

Upvotes: 2
