Reputation: 1667
I would like to find the intersection of two dataframes based on the date column.
Previously, I used this command to find the intersection of a yearly date column (where the date contained only the year):
common_rows <- as.Date(intersect(df1$Date, df2$Date), origin = "1970-01-01")
But now my date column for df1 is of type date and looks like this:
1985-01-01
1985-04-01
1985-07-01
1985-10-01
My date column for df2 is also of type date and looks like this (notice that the days are different):
1985-01-05
1985-04-03
1985-07-07
1985-10-01
The above command works fine when I keep the full format (i.e. year, month and day). But since my days are different and I am interested in the monthly intersection, I dropped the days as shown below, and that produces an error when I look for the intersection:
df1$Date <- format(as.Date(df1$Date), "%Y-%m")
common_rows <-as.Date(intersect(df1$Date, df2$Date), origin = "1970-01-01")
Error in charToDate(x) :
character string is not in a standard unambiguous format
Is there a way to find the intersection of the two datasets, based on the year and month, while ignoring the day?
Upvotes: 0
Views: 1062
Reputation: 2017
The problem is the as.Date() call wrapping your final output: a year-month string such as "1985-01" has no day, so as.Date() cannot parse it. If plain strings are fine, use:
common_rows <- intersect(df1$Date, df2$Date)
Otherwise, append a day before converting back to a date:
common_rows <- as.Date(paste(intersect(df1$Date, df2$Date), '-01', sep = ''), origin = "1970-01-01")
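For completeness, here is a minimal end-to-end sketch of that idea. The data frames are hypothetical stand-ins built from the dates in the question, and the names ym1, ym2 and common_ym are mine. It assumes both columns get reduced to "%Y-%m" keys, not just df1:

```r
# Hypothetical data frames mirroring the dates in the question
df1 <- data.frame(Date = as.Date(c("1985-01-01", "1985-04-01", "1985-07-01", "1985-10-01")))
df2 <- data.frame(Date = as.Date(c("1985-01-05", "1985-04-03", "1985-07-07", "1985-10-01")))

# Reduce BOTH columns to year-month strings, leaving the original columns untouched
ym1 <- format(df1$Date, "%Y-%m")
ym2 <- format(df2$Date, "%Y-%m")

# Year-month values present in both data frames
common_ym <- intersect(ym1, ym2)

# Append a placeholder day so the strings can be parsed as dates again
common_rows <- as.Date(paste0(common_ym, "-01"))
```

Keeping the keys in separate vectors means df1$Date and df2$Date stay of type Date, which may be convenient if they are needed later.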
Upvotes: 2
Reputation: 21739
Try this:
date1 <- c('1985-01-01','1985-04-01','1985-07-01','1985-10-01')
date2 <- c('1985-01-05','1985-04-03','1985-07-07','1985-10-01')
# keep only the year-month part (first 7 characters), dropping the day
# substr() is vectorised, so no sapply() is needed
date1 <- substr(date1, 1, 7)
date2 <- substr(date2, 1, 7)
print(intersect(date1, date2))
[1] "1985-01" "1985-04" "1985-07" "1985-10"
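If the goal is the matching rows rather than just the matching months, the same year-month key can be used to subset or join the data frames. This is only a sketch under assumptions: the columns x and y, the helper column ym, and the August date in df2 are made up for illustration so that one month does not match.

```r
# Hypothetical data: df2 has one month (August) that df1 lacks
df1 <- data.frame(Date = as.Date(c("1985-01-01", "1985-04-01", "1985-07-01", "1985-10-01")),
                  x = 1:4)
df2 <- data.frame(Date = as.Date(c("1985-01-05", "1985-04-03", "1985-08-07", "1985-10-01")),
                  y = c(10, 20, 30, 40))

# Helper column holding the year-month key
df1$ym <- format(df1$Date, "%Y-%m")
df2$ym <- format(df2$Date, "%Y-%m")

# Months present in both data frames
common <- intersect(df1$ym, df2$ym)

# Rows of each data frame that fall in a common month
df1_common <- df1[df1$ym %in% common, ]
df2_common <- df2[df2$ym %in% common, ]

# Alternatively, join the two on the key; the clashing Date columns get suffixes
merged <- merge(df1, df2, by = "ym", suffixes = c(".df1", ".df2"))
```

The %in% subsetting keeps the two data frames separate, while merge() puts matching rows side by side; which is preferable depends on what is done with the result.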
Upvotes: 1