user

Reputation: 592

Calculate a weighted average in R based on two columns

I have a data frame as follows:

date              Rank         new_Weight       c
2019-01-01         20           2               10
2019-01-01         30           5               10 
2019-01-01         10           8               10
2019-02-02          3           10               60
2019-02-02          5            2               60
....               ...          ....

I want to calculate the weighted average of Rank, weighted by new_Weight, for each date. I tried the following code:

by(df, df$date,subset) function(x){
  x<-df$rank*df$new_weight/sum(df$new_weigth)
}

My goal is to store the result in a new column.

The following dplyr version works well:

df <- df %>% group_by(date) %>% mutate(w = weighted.mean(Rank, new_Weight))

However, I am wondering why the first approach does not work.

Upvotes: 1

Views: 1701

Answers (2)

Ronak Shah

Reputation: 388807

I think what you are trying to do with by is reference x, the per-group data frame passed to the function, rather than df. The formula for the weighted mean also needs to change: sum the products first, then divide by the sum of the weights.

by(df, df$date, function(x) sum(x$Rank * x$new_Weight)/sum(x$new_Weight))

#df$date: 2019-01-01
#[1] 18
#--------------------------------------------------------------------------------- 
#df$date: 2019-02-02
#[1] 3.333333

which is the same as applying weighted.mean:

by(df, df$date, function(x) weighted.mean(x$Rank, x$new_Weight))
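
Since by returns one value per date rather than a column, here is a minimal base R sketch of one way to put the group value back on every row. It assumes the five rows shown in the question; the names wm and w are just illustrative.

```r
# Sample data copied from the first five rows shown in the question
df <- data.frame(
  date = c("2019-01-01", "2019-01-01", "2019-01-01", "2019-02-02", "2019-02-02"),
  Rank = c(20, 30, 10, 3, 5),
  new_Weight = c(2, 5, 8, 10, 2)
)

# One weighted mean per date; the result is a numeric vector named by date
wm <- sapply(split(df, df$date), function(x) weighted.mean(x$Rank, x$new_Weight))

# Look up each row's group value by its date to build the new column
df$w <- unname(wm[as.character(df$date)])
df
```

Every row of a given date then carries that date's weighted mean, which should match what the mutate version in the question produces.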

Upvotes: 3

RPo

Reputation: 35

Does this example answer your question?

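library(dplyr)  # provides %>%, group_by() and mutate()

date <- c(2017, 2017, 2018, 2019, 2018, 2019)
rank <- c(10, 12, 13, 11, 14, 15)
weight <- c(1.5, 1.1, 1.2, 1.3, 1.4, 1.7)
df <- data.frame(date, rank, weight)
df
# Weighted mean of rank per date, repeated on every row of that date
df <- df %>% group_by(date) %>% mutate(w = weighted.mean(rank, weight))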

You don't need to write your own function to do this ;)
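
If one row per date is enough, rather than the value repeated on every row, a rough sketch with summarise() in place of mutate() might look like this. It assumes dplyr is installed and reuses the sample vectors above; the name by_date is only illustrative.

```r
library(dplyr)

# Same sample data as above
df <- data.frame(
  date   = c(2017, 2017, 2018, 2019, 2018, 2019),
  rank   = c(10, 12, 13, 11, 14, 15),
  weight = c(1.5, 1.1, 1.2, 1.3, 1.4, 1.7)
)

# summarise() collapses each date to a single row holding its weighted mean
by_date <- df %>%
  group_by(date) %>%
  summarise(w = weighted.mean(rank, weight))
by_date
```

mutate() keeps all six rows and adds w alongside them, while summarise() returns three rows, one per date.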

Upvotes: 3
