Alex Gray

Reputation: 45

Aggregate an entire data frame with Weighted Mean

I'm trying to aggregate a data frame using the function weighted.mean, but I keep getting an error. My data looks like this:

dat <- data.frame(date, nWords, v1, v2, v3, v4 ...)

I tried something like:

aggregate(dat, by = list(dat$date), weighted.mean, w = dat$nWords)

but got

 Error in weighted.mean.default(X[[1L]], ...) : 
  'x' and 'w' must have the same length

There is another thread that answers this question using plyr, but only for one variable. I want to aggregate all of my variables that way.

Upvotes: 1

Views: 1467

Answers (2)

WheresTheAnyKey

Reputation: 878

You can do it with data.table:

 library(data.table)

 #set up your data

 dat <- data.frame(date = c("2012-01-01","2012-01-01","2012-01-01","2013-01-01",
 "2013-01-01","2013-01-01","2014-01-01","2014-01-01","2014-01-01"), 
 nwords = 1:9, v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9))

 #make it into a data.table

 dat = data.table(dat, key = "date")

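 # grab the value column names, generalized for v1:vWhatever
 # (stored as cols rather than c, so the base function c() is not masked)

 cols = colnames(dat)[-c(1,2)]

 #get the weighted mean by date for each column
 # (n) := assigns to the column whose name is stored in n;
 # this replaces the older with = FALSE form, which is deprecated

 for(n in cols){
 dat[,
     (n) := weighted.mean(get(n), nwords),
     by = date]
 }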

 #keep only the unique dates and weighted means

 wms = unique(dat[,nwords:=NULL])
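
The loop above can probably also be collapsed into a single grouped call. This is a minimal sketch, assuming the same layout as above (grouping column date, weight column nwords, every other column numeric). The names dat2, value_cols and wms2 are made up for illustration:

```r
library(data.table)

# toy data with the same shape as the example above
dat2 <- data.table(
  date   = rep(c("2012-01-01", "2013-01-01", "2014-01-01"), each = 3),
  nwords = 1:9,
  v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9)
)

# every column except the grouping column and the weight column
value_cols <- setdiff(names(dat2), c("date", "nwords"))

# within each date, .SD holds that group's value columns and
# nwords holds that group's weights, so the lengths always match
wms2 <- dat2[, lapply(.SD, weighted.mean, w = nwords),
             by = date, .SDcols = value_cols]
```

This returns one row per date directly, so no unique() step is needed and the original table is left unmodified.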

Upvotes: 1

Davide Passaretti

Reputation: 2771

Try using by:

# your numeric data
x <- 111:120

# the weights
ww <- 10:1 

mat <- cbind(x, ww)

# the group variable (in your case is 'date')
y <- c(rep("A", 7), rep("B", 3))

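# pass the weights explicitly: each group's rows of x are weighted by that group's rows of ww
by(data.frame(mat), y, function(d) weighted.mean(d$x, d$ww))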

If you want the results in a data frame, I suggest the plyr package:

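# the grouping variable y has to be a column of the data frame passed to ddply
plyr::ddply(data.frame(mat, y), "y", function(d) data.frame(x = weighted.mean(d$x, d$ww)))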
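
For reference, the error in the question most likely arises because aggregate hands weighted.mean one group's slice of a column as x, while w = dat$nWords is still the full-length vector. Here is a base R sketch that splits whole rows by group, so values and weights are subset together. It assumes every column other than date and nWords is numeric; the toy data and the names toy, pieces and weighted_means are hypothetical:

```r
# toy data: two dates, a weight column, two value columns
toy <- data.frame(
  date   = rep(c("2012-01-01", "2013-01-01"), each = 3),
  nWords = c(1, 2, 3, 4, 5, 6),
  v1     = c(10, 20, 30, 1, 1, 1),
  v2     = c(0, 0, 6, 2, 4, 6)
)

# every column except the grouping column and the weight column
value_cols <- setdiff(names(toy), c("date", "nWords"))

# one data frame per date, each carrying its own weights
pieces <- split(toy, toy$date)

# weighted mean of every value column within each piece, one row per date
weighted_means <- do.call(rbind, lapply(pieces, function(piece) {
  data.frame(
    date = piece$date[1],
    lapply(piece[value_cols], weighted.mean, w = piece$nWords)
  )
}))
rownames(weighted_means) <- NULL
```

With real data, toy would be replaced by dat; the rest should carry over as long as the column names match.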

Upvotes: 0
