bvowe

Reputation: 3394

New variable in R with IF statement

Say I have columns X1, X2, and X3. I want to make a new variable Z that is equal to the first available (non-NA) value from X1-X3, reading left to right. If the values are missing for all of X1-X3, I want to set Z to NA. Thanks so much!

X1  X2  X3  Z
1   2   3   1
NA  5   6   5
NA  NA  9   9
NA  NA  3   3
NA  NA  NA  NA

Upvotes: 0

Views: 68

Answers (2)

MKR

Reputation: 20095

I find dplyr::coalesce very handy for this kind of scenario. It returns, element by element, the first non-NA value among its arguments. Since the OP wants the first non-NA value across all columns, dplyr::coalesce(!!!df1) is an even simpler option.

One can try:

library(dplyr)
# Take the first non-NA value among X1, X2, X3 in each row
df1 <- df1 %>% mutate(Z = coalesce(X1, X2, X3))

# Or, more simply, splice every column of df1 into coalesce()
df1$Z <- dplyr::coalesce(!!!df1)

df1
#   X1 X2 X3  Z
# 1  1  2  3  1
# 2 NA  5  6  5
# 3 NA NA  9  9
# 4 NA NA  3  3
# 5 NA NA NA NA

Data:

df1 <- read.table(text = 
"X1  X2  X3
 1   2   3   
 NA  5   6   
 NA  NA  9  
 NA  NA  3   
 NA  NA  NA",
 header = TRUE, stringsAsFactors = FALSE)
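
For readers without dplyr installed, the element-wise logic that coalesce applies here can be sketched in base R. This is only an illustration of the idea, not dplyr's implementation; the helper name coalesce_base is hypothetical and is not part of any package.

```r
# Same data as above, built directly so the block is self-contained
df1 <- data.frame(
  X1 = c(1L, NA, NA, NA, NA),
  X2 = c(2L, 5L, NA, NA, NA),
  X3 = c(3L, 6L, 9L, 3L, NA)
)

# coalesce_base is a hypothetical helper (not a dplyr function):
# walk the vectors left to right and fill only positions that are still NA
coalesce_base <- function(...) {
  Reduce(function(acc, nxt) ifelse(is.na(acc), nxt, acc), list(...))
}

df1$Z <- coalesce_base(df1$X1, df1$X2, df1$X3)
df1$Z
# 1 5 9 3 NA
```

Rows where every input is NA stay NA, because there is never a non-NA value to fill in.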

Upvotes: 3

akrun

Reputation: 887891

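We can use max.col with ties.method = "first" to get the column index of the first non-NA element in each row, cbind it with the row index (seq_len(nrow(df1))), and use the resulting two-column matrix to extract those elements. For a row with no non-NA values, every entry ties, so the index falls back to column 1, whose value is NA, which gives the requested NA.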

j1 <- max.col(!is.na(df1), "first")
df1$z <- df1[cbind(seq_len(nrow(df1)), j1)]
df1
#  X1 X2 X3  z
#1  1  2  3  1
#2 NA  5  6  5
#3 NA NA  9  9
#4 NA NA  3  3
#5 NA NA NA NA

Data:

df1 <- structure(list(X1 = c(1L, NA, NA, NA, NA), X2 = c(2L, 5L, NA, 
 NA, NA), X3 = c(3L, 6L, 9L, 3L, NA)), row.names = c(NA, -5L), 
class = "data.frame")
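
To make the intermediate values visible, the same approach can be broken into steps. This is a sketch on the data above; the names present, idx, and z are introduced here only for illustration.

```r
df1 <- data.frame(
  X1 = c(1L, NA, NA, NA, NA),
  X2 = c(2L, 5L, NA, NA, NA),
  X3 = c(3L, 6L, 9L, 3L, NA)
)

# Logical matrix: TRUE where a value is present
present <- !is.na(df1)

# Column of the first TRUE in each row; an all-FALSE row is a full tie,
# so "first" falls back to column 1
j1 <- max.col(present, ties.method = "first")
j1
# 1 2 3 3 1

# Two-column (row, column) index matrix, one pair per row of df1
idx <- cbind(seq_len(nrow(df1)), j1)

# Matrix indexing pulls exactly one element per (row, column) pair
z <- as.matrix(df1)[idx]
z
# 1 5 9 3 NA
```

One thing to keep in mind: matrix indexing goes through a matrix, so if the columns had mixed types (say, numeric and character), the extracted values would be coerced to a common type.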

Upvotes: 2
