Reputation: 1597
I've been struggling to conditionally calculate a time difference across multiple rows/columns. I would like to calculate by id the time difference between an initial negative result and a subsequent positive. I've been trying to do this within dplyr but maybe I need another approach.
Code for data :
id <- c(1,1,1,1,2,2,2,2,3,3,3,3)
firstnegative <-c('T', 'F','F','F', 'F','F','F','F','T','F','F','F')
organism <- c('neg', 'COVID', 'COVID', 'neg', 'COVID', 'neg', 'neg', 'neg', 'neg', 'neg', 'COVID', 'COVID')
date <- seq(as.Date("2020/3/1"), as.Date('2020/3/12'), "days")
data <- data.frame (id,date, organism, firstnegative)
id date organism firstnegative
1 2020-03-01 neg T
1 2020-03-02 COVID F
1 2020-03-03 COVID F
1 2020-03-04 neg F
2 2020-03-05 COVID F
2 2020-03-06 neg F
2 2020-03-07 neg F
2 2020-03-08 neg F
3 2020-03-09 neg T
3 2020-03-10 neg F
3 2020-03-11 COVID F
3 2020-03-12 COVID F
Expected Result
id date organism firstnegative timediff
1 2020-03-01 neg T 1d
1 2020-03-02 COVID F
1 2020-03-03 COVID F
1 2020-03-04 neg F
2 2020-03-05 COVID F
2 2020-03-06 neg F
2 2020-03-07 neg F
2 2020-03-08 neg F
3 2020-03-09 neg T 2d
3 2020-03-10 neg F
3 2020-03-11 COVID F
3 2020-03-12 COVID F
The length between the first negative and subsequent positives is unknown and changes. I can't assume a lead of 1.
Any thoughts/approaches would be greatly appreciated.
Upvotes: 0
Views: 98
Reputation: 388982
For each id
you can find the first date of 'COVID'
and subtract it with first time 'T'
was present in firstnegative
. Since we want value only on first row we replace
rest of the values with NA
.
library(dplyr)
data %>%
group_by(id) %>%
mutate(timediff = date[match('COVID', organism)] -
date[match('T', firstnegative)],
timediff = as.numeric(replace(timediff, -1L, NA)))
# id date organism firstnegative timediff
# <dbl> <date> <chr> <chr> <dbl>
# 1 1 2020-03-01 neg T 1
# 2 1 2020-03-02 COVID F NA
# 3 1 2020-03-03 COVID F NA
# 4 1 2020-03-04 neg F NA
# 5 2 2020-03-05 COVID F NA
# 6 2 2020-03-06 neg F NA
# 7 2 2020-03-07 neg F NA
# 8 2 2020-03-08 neg F NA
# 9 3 2020-03-09 neg T 2
#10 3 2020-03-10 neg F NA
#11 3 2020-03-11 COVID F NA
#12 3 2020-03-12 COVID F NA
Upvotes: 1