Mikee

Reputation: 854

ifelse not looping over rows as expected

I have data that looks like this:

 df <- read.table(tc <- textConnection("
     var1    var2    var3    var4
      1       1       7      NA
      4       4       NA      6
      2       NA      3       NA                
      4       4       4       4              
      1       3       1      1"), header = TRUE); close(tc)

I'm trying to create a new column, var5, that is 1 if any of var1 to var4 in that row equals 1, and 0 otherwise.

My non-working code looks like this:

 df$var5 = ifelse("1" %in% df$var1,1,
                ifelse("1" %in% df$var2,1,
                      ifelse("1" %in% df$var3,1,
                           ifelse("1" %in% df$var4,1,0))))

giving me a table:

 var1    var2    var3    var4    var5
    1       1       7      NA       1
    4       4      NA       6       1
    2      NA       3      NA       1
    4       4       4       4       1
    1       3       1       1       1

The table I actually want should look like

 var1    var2    var3    var4    var5
    1       1       7      NA       1
    4       4      NA       6       0
    2      NA       3      NA       0
    4       4       4       4       0
    1       3       1       1       1

I've looked at the posts:

ifelse not working as expected in R

and

Loop over rows of dataframe applying function with if-statement

but neither of them solved my problem.

Upvotes: 3

Views: 319

Answers (2)

akrun

Reputation: 887148

The correct way should be

with(df, ifelse(var1 %in% 1,1,
            ifelse(var2 %in% 1,1,
                  ifelse(var3 %in% 1,1,
                       ifelse(var4 %in% 1,1,0)))))
#[1] 1 0 0 0 1

The reason is that 1 %in% df$var1 returns a single logical value: whether 1 occurs anywhere in the column.

1 %in% df$var1
#[1] TRUE

Likewise, every column contains a 1, so each of these tests is a single TRUE. The outermost ifelse therefore returns a single 1, which is recycled into every row of var5.

whereas the opposite

df$var1 %in% 1
#[1]  TRUE FALSE FALSE FALSE  TRUE

returns a logical vector with the same length as the original column. In short, the result of %in% always has the same length as the object on its left-hand side.
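
As a small illustration of that difference, the sketch below rebuilds the question's df and compares the two directions of %in%. It also shows one reason %in% is convenient here: it never returns NA, whereas == propagates it. The names scalar_test, row_test, in_test and eq_test are introduced only for this example.

```r
# Rebuild the data frame from the question.
df <- data.frame(
  var1 = c(1, 4, 2, 4, 1),
  var2 = c(1, 4, NA, 4, 3),
  var3 = c(7, NA, 3, 4, 1),
  var4 = c(NA, 6, NA, 4, 1)
)

# Value on the left: one answer for the whole column.
scalar_test <- 1 %in% df$var1
length(scalar_test)   # 1

# Column on the left: one answer per row.
row_test <- df$var1 %in% 1
length(row_test)      # 5

# %in% treats NA as "no match"; == returns NA for it.
in_test <- df$var2 %in% 1   # TRUE FALSE FALSE FALSE FALSE
eq_test <- df$var2 == 1     # TRUE FALSE    NA FALSE FALSE
```

With == in place of %in%, the nested ifelse would produce NA for any row whose test is NA, so the NA handling would have to be done separately.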


ifelse is not actually required. A better option is to apply rowSums to the logical matrix (df == 1), check whether the result is not equal to 0, and convert to binary with as.integer.

as.integer(rowSums(df == 1, na.rm =TRUE)!=0)
#[1] 1 0 0 0 1
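
To store the result as var5, one possible sketch follows. Restricting the comparison to a named set of columns (cols and hits are names introduced here) is an assumption about intent: it keeps a var5 left over from an earlier attempt, or any other column, out of the row sums.

```r
# Rebuild the data frame from the question.
df <- data.frame(
  var1 = c(1, 4, 2, 4, 1),
  var2 = c(1, 4, NA, 4, 3),
  var3 = c(7, NA, 3, 4, 1),
  var4 = c(NA, 6, NA, 4, 1)
)

# Columns to search; adjust to the real data.
cols <- c("var1", "var2", "var3", "var4")

# Count how many of those columns equal 1 in each row, ignoring NA.
hits <- rowSums(df[cols] == 1, na.rm = TRUE)   # 2 0 0 0 3

# 1 if at least one column matched, 0 otherwise.
df$var5 <- as.integer(hits > 0)
df$var5   # 1 0 0 0 1
```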

Or another option is Reduce with |

as.integer(Reduce(`|`, lapply(replace(df, is.na(df), 0), `==`, 1)))
#[1] 1 0 0 0 1

Upvotes: 2

Ronak Shah

Reputation: 388982

Instead of using ifelse separately for every column, you can check row-wise whether 1 exists anywhere in the row and return 1 or 0 accordingly:

as.numeric(apply(df, 1, function(x) any(x == 1)) %in% TRUE) 
#[1] 1 0 0 0 1

Just to explain the steps better:

apply(df, 1, function(x) any(x == 1))
#[1]  TRUE    NA    NA FALSE  TRUE

apply(df, 1, function(x) any(x == 1)) %in% TRUE
#[1]  TRUE FALSE FALSE FALSE  TRUE

as.numeric(apply(df, 1, function(x) any(x == 1)) %in% TRUE)
#[1] 1 0 0 0 1
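
A variant worth considering, offered as a sketch rather than as part of the answer above: passing na.rm = TRUE to any() drops the NA comparisons before they can turn a row's result into NA, so the %in% TRUE step is no longer needed. The name has_one is introduced here for illustration.

```r
# Rebuild the data frame from the question.
df <- data.frame(
  var1 = c(1, 4, 2, 4, 1),
  var2 = c(1, 4, NA, 4, 3),
  var3 = c(7, NA, 3, 4, 1),
  var4 = c(NA, 6, NA, 4, 1)
)

# For each row, is any non-NA value equal to 1?
has_one <- apply(df, 1, function(x) any(x == 1, na.rm = TRUE))

# Convert TRUE/FALSE to 1/0.
as.numeric(has_one)   # 1 0 0 0 1
```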

Upvotes: 1
