Reputation: 49
I have a longitudinal dataset with mistakes in a date variable. Here is an example: for ID 1, the visit date in the first row is 2013-07-17, and the difference to the study start (2012-08-29) is 321 days. In the next row, the visit date is 2013-02-15, and the difference to the study start (2012-08-29) is 169 days. Since the visits should be in ascending order, the visit date 2013-07-17 must be an error.
I tried:
dat$DifferenceDateerror <- "no"
i <- 1
for (i in 1:nrow(dat)) {
  if (dat[i, "DifferenceDate"] > dat[i + 1, "DifferenceDate"] & !is.na(dat$DifferenceDate)[i]) {
    dat$DifferenceDateerror[i] == "yes"
  }
}
but got the following error:
error in if (dat[i, "DifferenceDate"] > dat[i + 1, : missing value, where TRUE/FALSE is needed
I would like to find out where the Date must be wrong.
Upvotes: 3
Views: 71
Reputation: 389012
Since you want "yes"/"no" values where the current date is greater than the next date, we can use diff to compare consecutive dates and assign the values accordingly.
dat$DifferenceDateerror <- c("no", "yes")[c(diff(dat$DifferenceDate) < 0, FALSE) + 1]
Or, similarly, with head and tail (here x stands for dat$DifferenceDate):
x <- dat$DifferenceDate
dat$DifferenceDateerror <- c("no", "yes")[c(head(x, -1) > tail(x, -1), FALSE) + 1]
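As for the error in the question: on the last iteration, dat[i + 1, "DifferenceDate"] points past the final row and returns NA, so if() has nothing to test; in addition, == only compares and never assigns "yes". Because the data are longitudinal, the comparison probably should not cross from one ID to the next either. The following is a minimal sketch of a per-ID version, not a definitive solution. It assumes the ID column is called ID (the question does not show its name), the values are made up for illustration, and flag_decrease is a hypothetical helper.

```r
# Toy data: column name ID is an assumption, values are invented
dat <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  DifferenceDate = c(321, 169, 400, 50, 20)
)

# Hypothetical helper: TRUE for row i when its value exceeds the next one.
# The last row of each ID has no successor, so it gets FALSE.
flag_decrease <- function(x) c(diff(x) < 0, FALSE)

# ave() applies the helper within each ID and returns 0/1 in the original row order
is_error <- as.logical(ave(dat$DifferenceDate, dat$ID, FUN = flag_decrease))

# Comparisons involving NA are treated as "no" instead of propagating NA
is_error[is.na(is_error)] <- FALSE

dat$DifferenceDateerror <- ifelse(is_error, "yes", "no")
```

Note that this only marks the row that is followed by a smaller value; whether that row or the following one holds the wrong date still has to be checked by hand.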
Upvotes: 1