Reputation: 4917
I have a sample data set containing climate data for different seasons:
df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5),
temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2),
samples=c(22,25,24,31,31,29,28,31,30,32))
I can determine the mean of my climate variables for each season for each year simply:
aggregate(df[,c('temp','ppt')], by = list(df$season,df$year), function(x) mean(x,na.rm=T))
However, I want to determine the weighted mean of each season|year combination, using the variable samples as my weights.
Essentially, I want to replace the mean function in aggregate() with weighted.mean. That would require adding a second argument to my function, one that needs to change along with my x:
function(x,w) weighted.mean(x,w,na.rm=T)
However, I'm not sure how to let the weight argument ('w') of weighted.mean() vary with each subset of the aggregated data.
Can I do this all within a single aggregate() call?
Any advice would be great!
Upvotes: 0
Views: 1794
Reputation: 28441
Try summarise_each from dplyr. It allows for prior grouping with group_by and applies the function to multiple columns:
library(dplyr)
df %>% group_by(season, year) %>%
  summarise_each(funs(weighted.mean(., samples, na.rm = TRUE)), temp, ppt)
# Source: local data frame [10 x 5]
# Groups: season, year [10]
#
#    season  year  temp   ppt samples
#     (int) (int) (dbl) (dbl)   (dbl)
# 1       1     1     2     4      22
# 2       2     1     4     3      25
# 3       3     1     3     1      24
# 4       4     1     5     5      31
# 5       5     1     2     6      31
# 6       1     2     4     2      29
# 7       2     2     1     1      28
# 8       3     2     5     2      31
# 9       4     2     4     2      30
# 10      5     2     3     2      32
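For comparison, here is a minimal base R sketch of the same idea. It needs no packages, which may be useful because newer dplyr releases deprecate summarise_each() and funs(). The helper name weighted_by_group and its arguments are hypothetical, made up for this illustration. Note that in the sample data every season/year pair has exactly one row, so the weighted mean there equals the raw value; grouping by year alone makes the effect of the weights visible.

```r
# Sample data from the question
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Hypothetical helper (not from the original posts): split the rows by the
# grouping columns, then take the weighted mean of each value column using
# the weights that belong to that same subset of rows.
weighted_by_group <- function(data, group_cols, value_cols, weight_col) {
  pieces <- split(data, data[group_cols], drop = TRUE)
  rows <- lapply(pieces, function(d) {
    means <- lapply(d[value_cols], weighted.mean,
                    w = d[[weight_col]], na.rm = TRUE)
    # Keep the group keys from the first row and attach the weighted means
    cbind(d[1, group_cols, drop = FALSE], as.data.frame(means))
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# One row per year: the weights now change the result
by_year <- weighted_by_group(df, "year", c("temp", "ppt"), "samples")

# One row per season/year pair: with this data, identical to the input values
by_season_year <- weighted_by_group(df, c("season", "year"),
                                    c("temp", "ppt"), "samples")
print(by_year)
```

The key point is that the weights are looked up inside each subset (d[[weight_col]]), which is what aggregate() cannot do when it passes one column at a time to the function.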
Upvotes: 3