Reputation: 225
I have a large clinical dataset to which I plan to add several columns. The criteria for each are almost the same, so it probably comes down to one common problem.
So far I have figured out that I first need to group my entries by patient_id, but I have been unable to proceed from there.
Below is a snapshot of the data. When copied and run in R, it creates a data.frame called myDF:
myDF <- structure(list(patient_id = c(1L, 1L, 1L, 1L, 1L), date = structure(c(17167,
17168, 17169, 17170, 17171), class = "Date"), date_recruited = c("yes",
"", "", "", ""), ill = c("no", "no", "yes", "yes", "no")), class = "data.frame", .Names = c("id",
"date", "date_recruited", "ill"), row.names = c(NA, -5L))
I want to create a new column (let's call it "drop") such that, for every id, a row is populated with "drop" if ill == "yes" on that row and its date is exactly 3 days after the recruitment date (the date of the row where date_recruited == "yes"). All other rows stay empty.
Something like this:
myDF2 <- structure(list(patient_id = c(1L, 1L, 1L, 1L, 1L), date = structure(c(17167,
17168, 17169, 17170, 17171), class = "Date"), date_recruited = c("yes",
"", "", "", ""), ill = c("no", "no", "yes", "yes", "no"), drop = c("",
"", "", "drop", "")), class = "data.frame", .Names = c("patient_id",
"date", "date_recruited", "ill", "drop"), row.names = c(NA, -5L
))
Any assistance is welcome...
Upvotes: 0
Views: 115
Reputation: 18435
In dplyr, you could do the following:
library(dplyr)
myDF2 <- myDF %>% group_by(id) %>%
  mutate(recdate = date[which(date_recruited == "yes")[1]],  # recruitment date within each id
         drop = ifelse(ill == "yes" & date == recdate + 3, "drop", ""))
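If dplyr is not available, the same idea can probably be sketched in base R with ave(). This is only an illustration under a few assumptions: the sample data is rebuilt with the grouping column named id (which is what the .Names in the question produce), the recruitment date is taken as the first row per id where date_recruited == "yes", and the helper names rec_num, recdate and is_drop are made up for this sketch.

```r
# Sample data from the question, with the grouping column named `id`.
myDF <- data.frame(
  id = c(1L, 1L, 1L, 1L, 1L),
  date = as.Date(17167:17171, origin = "1970-01-01"),
  date_recruited = c("yes", "", "", "", ""),
  ill = c("no", "no", "yes", "yes", "no"),
  stringsAsFactors = FALSE
)

# Recruitment date per id, as days since the epoch.
# Rows that are not the recruitment row contribute NA; within each id the
# first non-NA value is spread over all rows of that id.
rec_num <- ave(
  ifelse(myDF$date_recruited == "yes", as.numeric(myDF$date), NA_real_),
  myDF$id,
  FUN = function(x) x[!is.na(x)][1]
)
recdate <- as.Date(rec_num, origin = "1970-01-01")

# Flag rows where the patient is ill exactly 3 days after recruitment.
# Ids without a recruitment row (recdate is NA) are never flagged.
is_drop <- myDF$ill == "yes" & !is.na(recdate) & myDF$date == recdate + 3
myDF$drop <- ifelse(is_drop, "drop", "")
```

Unlike the dplyr version, this keeps a plain data.frame and does not add a recdate column; whether that matters depends on what the later columns need.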
Upvotes: 1