tmfmnk

Reputation: 40141

ifelse() and if else giving different results in dplyr mutate() for a time variable

Let's assume a data.frame like this:

df <- read.table(text = "ID Date Condition
                1 2015/01/01  Yes
                1 2015/01/10  No        
                1 2015/01/15  Yes
                2 2015/02/10  No                                   
                2 2015/03/08  No
                3 2015/01/01  No                                     
                3 2015/04/01  Yes
                3 2015/04/10  No
                3 2015/04/01  Yes
                3 2015/04/10  No", header = TRUE)

I want to compute the number of days between a given date and the first date for every ID separately. Now, for every ID where the condition is always "No", I want to assign NA in the column with results.

This is my code:

library(dplyr)

df %>%
  mutate(Date = as.Date(Date, "%Y/%m/%d")) %>%
  group_by(ID) %>%
  mutate(Temp = Date - first(Date),  # days since the first date within each ID
         Res1 = ifelse(all(Condition == "No"), NA, Temp),
         Res2 = if (all(Condition == "No")) NA else Temp)

Results:

      ID Date       Condition Temp    Res1 Res2  
   <int> <date>     <fct>     <time> <dbl> <time>
 1     1 2015-01-01 Yes       0         0. 0     
 2     1 2015-01-10 No        9         0. 9     
 3     1 2015-01-15 Yes       14        0. 14    
 4     2 2015-02-10 No        0        NA  <NA>  
 5     2 2015-03-08 No        26       NA  <NA>  
 6     3 2015-01-01 No        0         0. 0     
 7     3 2015-04-01 Yes       90        0. 90    
 8     3 2015-04-10 No        99        0. 99    
 9     3 2015-04-01 Yes       90        0. 90    
10     3 2015-04-10 No        99        0. 99 

My question is: why does ifelse() give wrong results, while if else gives the desired results?

Upvotes: 1

Views: 618

Answers (1)

Roland

Reputation: 132874

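Apparently, you misunderstand ifelse. It is fundamentally different from if and else. The documentation clearly says "ifelse returns a value with the same shape as test", and test is a vector of length one in your example (the result of all()). So ifelse returns a single value: NA for a group that is all "No", otherwise only the first element of Temp, which is 0. mutate then recycles this single value across the whole group. ifelse also does not keep the difftime class of Temp, which is why Res1 is a plain <dbl> column.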

Here is a simple example:

all(c(TRUE, TRUE))
#[1] TRUE
ifelse(all(c(TRUE, TRUE)), 1:2, 3:4) #test is vector of length 1
#[1] 1
ifelse(c(TRUE, FALSE), 1:2, 3:4) #test is vector of length 2
#[1] 1 4
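
The same effect can be reproduced on one group from the question without dplyr. This is a minimal sketch; the vectors dates, condition and temp are illustrative names that mirror the rows of ID 1, not objects from the original code:

```r
# Illustrative values mirroring the rows of ID 1 in the question
dates <- as.Date(c("2015-01-01", "2015-01-10", "2015-01-15"))
condition <- c("Yes", "No", "Yes")
temp <- dates - dates[1]  # difftime of 0, 9 and 14 days

# test has length 1, so the result has length 1:
# only the first element of temp is used, and the difftime class is lost
res1 <- ifelse(all(condition == "No"), NA, temp)
res1
# [1] 0
class(res1)
# [1] "numeric"

# if/else evaluates one branch and returns that whole object unchanged
res2 <- if (all(condition == "No")) NA else temp
length(res2)
# [1] 3
class(res2)
# [1] "difftime"
```

Inside mutate, the length-one res1 is what gets recycled over the group, which matches the column of zeros in Res1.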

I'd encourage you to study the source code of the ifelse function, which should make it obvious why it behaves like this.
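
As a rough guide to what to look for there, here is a simplified sketch of the idea. It is not the actual source of ifelse (the real function has additional checks and shortcuts), and the name ifelse_sketch is made up for illustration:

```r
# Simplified illustration only, NOT the real base R implementation
ifelse_sketch <- function(test, yes, no) {
  ans <- test                # the result starts out as a copy of test,
  len <- length(ans)         # so its length is fixed by test
  ypos <- which(test)        # positions where test is TRUE
  npos <- which(!test)       # positions where test is FALSE
  # yes and no are recycled to the length of test, then only the
  # selected positions are copied into the result
  if (length(ypos) > 0L) ans[ypos] <- rep(yes, length.out = len)[ypos]
  if (length(npos) > 0L) ans[npos] <- rep(no, length.out = len)[npos]
  ans
}

ifelse_sketch(all(c(TRUE, TRUE)), 1:2, 3:4)  # test has length 1
# [1] 1
ifelse_sketch(c(TRUE, FALSE), 1:2, 3:4)      # test has length 2
# [1] 1 4
```

Because the result is built from test, a length-one test can never produce more than one value, no matter how long yes and no are.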

Upvotes: 9
