Reputation: 1692
I have two data frames that both have a site ID and a Date column. The first data frame (df1) has continuous dates and also includes a temperature measurement (Temp.) associated with each date. The second data frame (df2) has the Date on which the maximum temperature was reached at each site ID. I would like R code that checks whether the site ID and Date in df2 match a row in df1 and, if they do, adds the associated temperature value from df1 to df2.
df1 <- data.frame(matrix(ncol = 3, nrow = 9))
x <- c("site ID", "Date", "Temp.")
colnames(df1) <- x
df1$`site ID` <- c("a","a","a",
"b","b","b",
"c","c","c")
df1$Date <- rep(seq(from = as.Date("2020-01-01"), to = as.Date("2020-01-03"), by = 1),3)
df1$Temp. <- c("10","12","11",
"20","15","10",
"2","4","6")
df2 <- data.frame(matrix(ncol = 2, nrow = 3))
y <- c("site ID", "Date")
colnames(df2) <- y
df2$`site ID` <- c("a","b","c")
df2$Date <- c(as.Date("2020-01-02"), as.Date("2020-01-01"), as.Date("2020-01-03"))
The ideal output would look like this:
  site ID       Date Temp.
1       a 2020-01-02    12
2       b 2020-01-01    20
3       c 2020-01-03     6
Upvotes: 0
Views: 459
Reputation: 18642
In base R you can use the merge function to do a left join (all.x = TRUE), which keeps every row of df2 even if its site ID and Date are not found in df1. If a site ID and Date combination is in df2 but not in df1, you will get an NA for Temp. in that row.
You can drop the all.x argument if you want an inner join instead, which keeps only the rows whose site ID and Date appear in both data frames.
merge(df2, df1, by = c("site ID", "Date"), all.x = TRUE)
  site ID       Date Temp.
1       a 2020-01-02    12
2       b 2020-01-01    20
3       c 2020-01-03     6
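The example data has a match for every row of df2, so the left join and the inner join give the same result here. To see the difference, here is a minimal sketch that rebuilds the question's data in a more compact form and adds a hypothetical site "d" to df2 that has no rows in df1. The site "d" and the names left and inner are made up for illustration only.

```r
# Same data as in the question, built in one call each.
# check.names = FALSE keeps the space in the "site ID" column name.
df1 <- data.frame(
  `site ID` = rep(c("a", "b", "c"), each = 3),
  Date = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-03"), by = 1), 3),
  Temp. = c("10", "12", "11", "20", "15", "10", "2", "4", "6"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Site "d" is a hypothetical addition that is absent from df1.
df2 <- data.frame(
  `site ID` = c("a", "b", "c", "d"),
  Date = as.Date(c("2020-01-02", "2020-01-01", "2020-01-03", "2020-01-02")),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Left join: every row of df2 is kept; unmatched rows get NA for Temp.
left <- merge(df2, df1, by = c("site ID", "Date"), all.x = TRUE)

# Inner join: only rows found in both data frames are kept.
inner <- merge(df2, df1, by = c("site ID", "Date"))

nrow(left)   # 4 rows, site "d" has NA for Temp.
nrow(inner)  # 3 rows, site "d" is dropped
```

Note that Temp. was created as character in the question, so the merged column is character too. If you need numbers, something like as.numeric(left$Temp.) should convert it.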
Upvotes: 1