grouped weighted mean in r (preferably with ddply but whatever works)

Question

I have a dataset where I want to take both the normal 'mean' and the weighted mean by group. Each group can be considered a different portfolio or stock, price is the price of that portfolio or stock, size is the shares, gain is the return in percentage, and so market cap would be price*size.

Weighted mean would be the gain relative to the market cap by group. I run the code below and it obviously looks wrong but for the life of me I can't figure out what I am missing:

mydf= structure(list(group = structure(c(1L, 2L, 1L, 2L, 1L), .Label = c("a","b"), class = "factor"), 
                     price = c(15, 20, 10, 40, 20), size = c(100, 10, 50, 50, 1000), 
                     gain = c(0.03, 0.02, 0.05, 0.1, 0.01), wt = c(1500, 200, 500, 2000, 20000)), 
                .Names = c("group", "price", "size", "gain", "wt"), row.names = c(NA, -5L), 
                class = "data.frame")
mydf
library(plyr)
ddply(x, .(group), summarise,normal_mean= mean(gain), 
      wt_mean= weighted.mean(x$price*x$size,x$gain))

here I multiplied price and size together, you could also just use wt column in the weighted.mean function or supposedly...

weighted mean does not seem to be calculating over my groups but across all rows. any help?

Colonel Beauvel · Accepted Answer

An approach with data.table

library(data.table)

setDT(mydf)[,list(normalMean=mean(gain), 
             weightedMean=weighted.mean(gain, wt/sum(wt))), 
             by = group]

#   group normalMean weightedMean
#1:     a       0.03   0.01227273
#2:     b       0.06   0.09272727

grouped weighted mean in r (preferably with ddply but whatever works)

Answers (2)

Related Questions