Using aggregate to compute monthly weighted average

Question

I need to compute a monthly weighted average. The data frame looks like this:

            Month Variable Weighting
460773 1998-06-01       11    153.00
337134 1998-06-01        9      0.96
473777 1998-06-01       10    264.00
358226 1998-06-01        6      0.52
414626 1998-06-01       10     34.00
341020 1998-05-01        9      1.64
453066 1998-05-01        5     26.00
183276 1998-05-01        8      0.51
403729 1998-05-01        6    123.00
203005 1998-05-01       11      0.89

When I use aggregate e.g.,

 Output <- aggregate(Variable ~ Month, df , mean )
 Output
       Month Variable
1 1998-05-01      7.8
2 1998-06-01      9.2

I get correct results, however, when I try to add weight to the aggregation e.g.,

Output <- aggregate(Variable ~ Month, df , FUN = weighted.mean, w = df$Weighting)

I get a different-vector-lenghts error:

Error in weighted.mean.default(X[[1L]], ...) : 
'x' and 'w' must have the same length

Is there a way to remedy this situation?

jogo · Accepted Answer

With aggregate() it is not possible, because your weight vector is not partitionated during aggregate(). You can use by() or split() plus sapply() or additional package data.table or function ddply() from package plyr or functions from the package dplyr

example with split() plus sapply():

sapply(split(df, df$Month), function(d) weighted.mean(d$Variable, w = d$Weighting))

result:

1998-05-01 1998-06-01 
   5.89733   10.33142

a variant with by()

by(df, df$Month, FUN=function(d) weighted.mean(d$Variable, w = d$Weighting)) # or
unclass(by(df, df$Month, FUN=function(d) weighted.mean(d$Variable, w = d$Weighting)))

with package plyr

library(plyr)
ddply(df, ~Month, summarize, weighted.mean(Variable, w=Weighting))

with data.table

library(data.table)
setDT(df)[, weighted.mean(Variable, w = Weighting), Month]

Using aggregate to compute monthly weighted average

Answers (2)

Related Questions