Reputation: 5
From the code below, df is a data frame that contains ID and Date variables. df1 is a fixed data frame. I wish to create a new vector with these conditions: If Date from df is between Start Date and End Date in df1 and ID from df is equal to ID1 in df1, then the code will return the corresponding Result from df1. However I got these warning messages as shown in the codes below. Please help.
> Date = as.Date(c("01/01/2012", "01/02/2015", "01/01/2018", "01/05/2019"), format = '%d/%m/%Y')
> ID = c(1,2,3,1)
> df = data.frame(ID, Date)
>
> Start_Date = as.Date(c("01/01/2011", "01/01/2011", "01/01/2019"), format = '%d/%m/%Y')
> End_Date = as.Date(c("31/12/2018", "31/12/2019", "31/12/2019"), format = '%d/%m/%Y')
> ID1 = c(1,2,3)
> Result =c("A","B","C")
> df1 = data.frame(ID1,Start_Date,End_Date, Result)
>
> for(i in 1:nrow(df1)) {
+ if(Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) {Result[i]}
+ }
Warning messages:
1: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
the condition has length > 1 and only the first element will be used
2: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
the condition has length > 1 and only the first element will be used
3: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
the condition has length > 1 and only the first element will be used
Upvotes: 0
Views: 39
Reputation: 388807
You can merge
and then filter the rows if they are in range :
subset(merge(df, df1, by.x = 'ID', by.y = 'ID1'),
Date >= Start_Date & Date <= End_Date)
# ID Date Start_Date End_Date Result
#1 1 2012-01-01 2011-01-01 2018-12-31 A
#3 2 2015-02-01 2011-01-01 2019-12-31 B
Using dplyr
this can be done as :
library(dplyr)
inner_join(df, df1, by = c('ID' = 'ID1')) %>%
filter(Date >= Start_Date & Date <= End_Date)
Or with fuzzyjoin
.
fuzzyjoin::fuzzy_inner_join(df, df1,
by = c('ID' = 'ID1', 'Date' = 'Start_Date', 'Date' = 'End_Date'),
match_fun = list(`==`, `>=`, `<=`))
Upvotes: 1