Eng Kheng
Eng Kheng

Reputation: 5

Looping in R to read data from a data frame

From the code below, df is a data frame that contains ID and Date variables. df1 is a fixed data frame. I wish to create a new vector with these conditions: If Date from df is between Start Date and End Date in df1 and ID from df is equal to ID1 in df1, then the code will return the corresponding Result from df1. However I got these warning messages as shown in the codes below. Please help.

> Date = as.Date(c("01/01/2012", "01/02/2015", "01/01/2018", "01/05/2019"), format = '%d/%m/%Y')
> ID = c(1,2,3,1)
> df = data.frame(ID, Date)
> 
> Start_Date = as.Date(c("01/01/2011", "01/01/2011", "01/01/2019"), format = '%d/%m/%Y')
> End_Date = as.Date(c("31/12/2018", "31/12/2019", "31/12/2019"), format = '%d/%m/%Y')
> ID1 = c(1,2,3)
> Result =c("A","B","C")
> df1 = data.frame(ID1,Start_Date,End_Date, Result)
> 
> for(i in 1:nrow(df1)) {
+ if(Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) {Result[i]}
+ }
Warning messages:
1: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used
2: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used
3: In if (Date >= Start_Date[i] & Date <= End_Date[i] & ID == ID1[i]) { :
  the condition has length > 1 and only the first element will be used

Upvotes: 0

Views: 39

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388807

You can merge and then filter the rows if they are in range :

subset(merge(df, df1, by.x = 'ID', by.y = 'ID1'), 
              Date >= Start_Date & Date <= End_Date)

#  ID       Date Start_Date   End_Date Result
#1  1 2012-01-01 2011-01-01 2018-12-31      A
#3  2 2015-02-01 2011-01-01 2019-12-31      B

Using dplyr this can be done as :

library(dplyr)
inner_join(df, df1, by = c('ID' = 'ID1')) %>%
   filter(Date >= Start_Date & Date <= End_Date)

Or with fuzzyjoin.

fuzzyjoin::fuzzy_inner_join(df, df1,  
      by = c('ID' = 'ID1', 'Date' = 'Start_Date', 'Date' = 'End_Date'), 
      match_fun = list(`==`, `>=`, `<=`))

Upvotes: 1

Related Questions