Banjo
Banjo

Reputation: 1251

Weighted mean over different rows

I have a data frame with three variables like this one:

variable <- c("A", "B", "C", "B", "B", "A", "C")
mean <- c(2,4,5,4,3,1,5)
counts <- c(100, 200, 300, 150, 400, 200,250)

df <- data.frame(variable, mean, counts)

I want to multiply the mean with the number of counts and add them per variable. Then I want to divide the result by the number of counts per variable. The result should look like this example for the variable A:

(2 * 100 + 1 * 200)/300 = 1.333333

Then I want to aggregate the result per variable like in the group_by and summarise function in dplyr. Any Idea how this works? I tried it with group_by and mutate but there was no aggregation. The result should look like this one:

Variable     Mean
A         some value
B         some value
C         some value

Upvotes: 1

Views: 125

Answers (2)

tushaR
tushaR

Reputation: 3116

library(plyr)
ddply(.data = df,.variables = c('variable'),
    function(x){sum(x$mean*x$counts)/sum(x$counts)})

Upvotes: 2

erocoar
erocoar

Reputation: 5923

This would work - you just have to specify the calculation because a call to mean will of course divide by the count (e.g. 2 for the first one) library(tidyverse)

df %>% 
  mutate(multiple = mean * counts) %>% 
  group_by(variable) %>% 
  summarise(mean = sum(multiple) / sum(counts))

Upvotes: 3

Related Questions