antares
antares

Reputation: 35

Calculate total observation days in R (lubridate + dplyr)

I am working with a dataset where each row contains a separate observation of an individual (ID). I would like R to look for the first and last observation dates and calculate the total number of "activity days", putting these values in a new column. When I run the code below, I get one column with the same number for every entry — the total number of days between the earliest and latest days of the entire data set, not per individual.

I’m using dplyr and lubridate. I used this page (difference between the first date and last date within same individual in R) for code assistance, but it isn't working for me.

Can anyone help me figure out where I’m going wrong?

data$date <- mdy(data$date)

data <- data %>% 
  group_by(ID) %>% 
  mutate(total.activity.days = max((date)) - min((date)))

   cage   date              ID   total.activity.days
1   1     2018-04-30        40   54
2   1     2018-04-30        76   54 
3   1     2018-05-02        40   54 
4   1     2018-05-02        76   54
5   1     2018-05-04        40   54
6   1     2018-05-04        76   54
7   1     2018-04-28        59   54
8   1     2018-04-29        59   54

Upvotes: 1

Views: 681

Answers (2)

Adam Warner
Adam Warner

Reputation: 1354

All that is needed is:

df %>% group_by(ID) %>% mutate(total.activity.days = max(date) - min(date) )

Which gives:

cage    date          ID total.activity.days
  <int> <date>     <int> <time>             
1     1 2018-04-30    40 4                  
2     1 2018-04-30    76 4                  
3     1 2018-05-02    40 4                  
4     1 2018-05-02    76 4                  
5     1 2018-05-04    40 4                  
6     1 2018-05-04    76 4                  
7     1 2018-04-28    59 1                  
8     1 2018-04-29    59 1 

Upvotes: 1

SmitM
SmitM

Reputation: 1376

Try this out:

data_new <- data %>% 
    group_by(ID) %>% 
    summarise(MaxDate = max(date),
              MinDate = min(date)) %>% 
    mutate(total.activity.days = MaxDate - MinDate) %>% 
    select(ID, total.activity.days)

Upvotes: 1

Related Questions