Reputation: 896
I'm trying to calculate a group "weighted" rolling mean that excludes the own group's values when a group has multiple observations. This is related to my earlier question, "group 'weighted' mean with multiple grouping variables and excluding own group value". The key difference is that the method from that question is not readily applicable here, since a group now has multiple observations.
Using the dataset below, here is the operation I want to apply: for each row, take the weighted mean of the observations from the other counties in the same state and year. For instance, the new variable for the first two rows should be 19*9/18 + 48*3/18 + 6*2/18 + 31*4/18 = 25.06. For the next two rows it should be 81*1/10 + 52*3/10 + 6*2/10 + 31*4/10 = 37.3, and so on.
library(dplyr)

set.seed(57)
df <- data.frame(
  state = rep(c("AL", "CA"), each = 12),
  year = rep(2011:2012, 12),
  county = rep(letters[1:6], each = 4),
  value = sample(100, 24),
  wt = sample(10, 24, replace = TRUE)
) %>%
  arrange(state, year)
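To make the target concrete, the two figures in the worked example can be checked with base R's `weighted.mean()`. The values and weights below are copied from that example by hand, not regenerated from the seed, so this is only a sanity check of the arithmetic.

```r
# Values and weights of the four "other county" rows, copied from the example
first_pair  <- weighted.mean(c(19, 48, 6, 31), c(9, 3, 2, 4))  # 451 / 18
second_pair <- weighted.mean(c(81, 52, 6, 31), c(1, 3, 2, 4))  # 373 / 10

round(c(first_pair, second_pair), 2)
```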
If I apply the following code, the problem is that the other observation from the same county is still included in the weighted mean.
# Leave-one-row-out weighted mean within each state-year
df %>%
  group_by(state, year) %>%
  mutate(new_val = purrr::map_dbl(
    row_number(),
    ~ weighted.mean(value[-.x], wt[-.x])
  ))
As a workaround, I've tried the following (compute the weighted mean within each county-year first, then apply the code above), but the two approaches do not produce the same results, though they are somewhat similar.
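df %>%
  # Collapse to one weighted mean (wp) and total weight (wt2) per county-year
  group_by(state, county, year) %>%
  mutate(wp = weighted.mean(value, wt),
         wt2 = sum(wt)) %>%
  distinct(state, year, county, wp, wt2) %>%
  ungroup() %>%
  # Leave-one-county-out weighted mean within each state-year
  group_by(state, year) %>%
  mutate(new_val = purrr::map_dbl(
    row_number(),
    ~ weighted.mean(wp[-.x], wt2[-.x])
  ))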
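One way to check the two-step idea on its own is to run it on the six AL/2011 rows implied by the worked example (hard-coded here rather than drawn from the seed, with the county labels a, b, c assigned on the assumption that the rows come in county order). Each county's weighted mean times its total weight equals that county's sum of `value * wt`, so collapsing first and then leaving one county out should give the same number as dropping that county's rows directly. A minimal sketch under those assumptions:

```r
library(dplyr)
library(purrr)

# Six AL/2011 rows taken from the worked example (assumed county order a, b, c)
toy <- data.frame(
  county = rep(c("a", "b", "c"), each = 2),
  value  = c(81, 52, 19, 48, 6, 31),
  wt     = c(1, 3, 9, 3, 2, 4)
)

# Step 1: collapse to one row per county
by_county <- toy %>%
  group_by(county) %>%
  summarise(wp = weighted.mean(value, wt), wt2 = sum(wt))

# Step 2: leave-one-county-out weighted mean of the county means
by_county <- by_county %>%
  mutate(new_val = map_dbl(row_number(),
                           ~ weighted.mean(wp[-.x], wt2[-.x])))

by_county$new_val  # 451/18, 373/10, 552/16 for counties a, b, c
```

On this toy input the first two values match the figures in the worked example, which suggests that any gap relative to the row-wise snippet comes from that snippet still including the same-county row, not from the aggregation step.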
Thank you for taking the time to read this!
Reputation: 896
I found an answer, but I'm sure that this is not the best approach. Any other suggestions would be very helpful for future reference.
library(purrr)

# One entry per county-year-state combination
x <- c(rep(letters[1:3], 2), rep(letters[4:6], 2))
year <- rep(rep(2011:2012, each = 3), times = 2)
state <- rep(c("AL", "CA"), each = 6)

# Weighted mean of `value` over all other counties in the same state and year
get_wv <- function(x, year, state) {
  keep <- df$county != x & df$year == year & df$state == state
  weighted.mean(df$value[keep], df$wt[keep])
}

# pmap() returns a list; pmap_dbl() would return a numeric vector instead
res <- pmap(.l = list(x, year, state), .f = get_wv)
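As one possible alternative, the same leave-one-county-out mean can be written without an explicit loop by subtracting each county's weighted sum and total weight from the state-year totals. The sketch below is only illustrative: it uses a hard-coded toy data frame with the six AL/2011 rows from the question's worked example (county labels assumed), and the column names `cty_wsum`, `cty_wt` and `new_val` are chosen here for illustration.

```r
library(dplyr)

# Toy input: AL/2011 rows from the question's example (county labels assumed)
toy <- data.frame(
  state  = "AL",
  year   = 2011,
  county = rep(c("a", "b", "c"), each = 2),
  value  = c(81, 52, 19, 48, 6, 31),
  wt     = c(1, 3, 9, 3, 2, 4)
)

res_toy <- toy %>%
  # County-level weighted sum and total weight, repeated on each row
  group_by(state, year, county) %>%
  mutate(cty_wsum = sum(value * wt), cty_wt = sum(wt)) %>%
  # State-year totals minus the own county's contribution
  group_by(state, year) %>%
  mutate(new_val = (sum(value * wt) - cty_wsum) / (sum(wt) - cty_wt)) %>%
  ungroup() %>%
  select(-cty_wsum, -cty_wt)

res_toy$new_val
```

Replacing `toy` with `df` should apply the same logic to the full dataset and keep one result per original row. Note that a state-year containing only a single county would produce 0/0, i.e. `NaN`, since nothing is left after the exclusion.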