Reputation: 6615
I have a dataset where I want to take both the normal 'mean' and the weighted mean by group. Each group can be considered a different portfolio or stock, price is the price of that portfolio or stock, size is the shares, gain is the return in percentage, and so market cap would be price*size.
Weighted mean would be the gain relative to the market cap by group. I run the code below and it obviously looks wrong but for the life of me I can't figure out what I am missing:
mydf= structure(list(group = structure(c(1L, 2L, 1L, 2L, 1L), .Label = c("a","b"), class = "factor"),
price = c(15, 20, 10, 40, 20), size = c(100, 10, 50, 50, 1000),
gain = c(0.03, 0.02, 0.05, 0.1, 0.01), wt = c(1500, 200, 500, 2000, 20000)),
.Names = c("group", "price", "size", "gain", "wt"), row.names = c(NA, -5L),
class = "data.frame")
mydf
library(plyr)
ddply(x, .(group), summarise,normal_mean= mean(gain),
wt_mean= weighted.mean(x$price*x$size,x$gain))
here I multiplied price and size together, you could also just use wt column in the weighted.mean function or supposedly...
weighted mean does not seem to be calculating over my groups but across all rows. any help?
Upvotes: 0
Views: 235
Reputation: 2989
An approach with dplyr
mydf %>% group_by(group) %>%
summarise (mean=mean(gain), avgwt = weighted.mean(gain,wt))
group mean avgwt
1 a 0.03 0.01227273
2 b 0.06 0.09272727
Upvotes: 1
Reputation: 31161
An approach with data.table
library(data.table)
setDT(mydf)[,list(normalMean=mean(gain),
weightedMean=weighted.mean(gain, wt/sum(wt))),
by = group]
# group normalMean weightedMean
#1: a 0.03 0.01227273
#2: b 0.06 0.09272727
Upvotes: 2