Reputation: 463
Imagine two datasets. Dataset df1 is the day of the year that a certain event occurred. Df1 has multiple other columns, but for the purpose of this question, only day matters. Df2 contains two columns, the day of the year a measurement was made, and the value of that measurement. Essentially, I want to create a new column ("AVG") in df1 that is the mean value of that day in df2 plus the value from the previous day. For example, the AVG for day 2 in df1 would be 12.5 ((10+15)/2).
Some example data below.
df1 <- structure(list(day = c(2, 5, 7)), class = "data.frame", row.names = c(NA,
-3L))
df2 <- structure(list(day = c(1, 2, 3, 4, 5, 6, 7, 8), value = c(10,
15, 8, 13, 7, 20, 25, 12)), class = "data.frame", row.names = c(NA,
-8L))
Example final product below.
df3 <- structure(list(day = c(2, 5, 7), AVG = c(12.5, 10, 22.5)), class = "data.frame", row.names = c(NA,
-3L))
Upvotes: 2
Views: 291
Reputation: 35584
You can use lag()
from dplyr
to get the previous value so that you can calculate the rolling means.
library(dplyr)
df2 %>%
mutate(AVG = (value + lag(value)) / 2, .keep = "unused") %>%
right_join(df1, by = "day")
# day AVG
# 1 2 12.5
# 2 5 10.0
# 3 7 22.5
Or by zoo::rollmeanr()
:
df2 %>%
mutate(AVG = zoo::rollmeanr(value, 2, fill = NA), .keep = "unused") %>%
right_join(df1, by = "day")
# day AVG
# 1 2 12.5
# 2 5 10.0
# 3 7 22.5
Upvotes: 1
Reputation: 39603
You can play with indexes around positions like this using base R
. It is easy to compute the mean as you only want that measure for only two values. Finally, you can assign the result to your df1
:
#Detect position
i1 <- which(df2$day %in% df1$day)
#Extract values
j1 <- df2$value[i1]
j2 <- df2$value[i1-1]
#Compute mean
j3 <- (j1+j2)/2
#Assign
df1$AVG <- j3
Output:
day AVG
1 2 12.5
2 5 10.0
3 7 22.5
Upvotes: 1