Reputation: 592
I have a data frame as follows:
date        Rank  new_Weight   c
2019-01-01    20           2  10
2019-01-01    30           5  10
2019-01-01    10           8  10
2019-02-02     3          10  60
2019-02-02     5           2  60
.... ... ....
I want to calculate the weighted average of Rank, weighted by new_Weight, for each date. I tried the following code:
by(df, df$date,subset) function(x){
x<-df$rank*df$new_weight/sum(df$new_weigth)
}
The goal is to store the result in a new column.
I also wrote the following dplyr pipeline, and it works well:
df<- df %>% group_by(date) %>% mutate(w=weighted.mean(rank,new_weight))
However, I am wondering why the first approach does not work.
Upvotes: 1
Views: 1701
Reputation: 388807
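I think that with by, what you are trying to do is reference x as the data frame and not df: x is the subset of rows for one date that by passes to the function, while df is always the whole data frame. The formula for the weighted mean also needs to change, so that the sum of Rank * new_Weight is divided by the sum of new_Weight: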
by(df, df$date, function(x) sum(x$Rank * x$new_Weight)/sum(x$new_Weight))
#df$date: 2019-01-01
#[1] 18
#---------------------------------------------------------------------------------
#df$date: 2019-02-02
#[1] 3.333333
which is the same as applying weighted.mean:
by(df, df$date, function(x) weighted.mean(x$Rank, x$new_Weight))
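Note that by returns one value per date, not a column. If a new column is what you are after, a minimal base R sketch could look like the following. It assumes only the five rows shown in the question; the names pieces, df_w and w are made up for illustration.

```r
# Sample rows copied from the question (the elided rows are left out)
df <- data.frame(
  date = c("2019-01-01", "2019-01-01", "2019-01-01", "2019-02-02", "2019-02-02"),
  Rank = c(20, 30, 10, 3, 5),
  new_Weight = c(2, 5, 8, 10, 2)
)

# Split into one data frame per date and add that group's weighted mean
# as a column w (hypothetical name); x is the per-date subset, not df.
pieces <- lapply(split(df, df$date), function(x) {
  x$w <- weighted.mean(x$Rank, x$new_Weight)
  x
})

# Reassemble the pieces in the original row order
df_w <- unsplit(pieces, df$date)
df_w
```

Every row of a given date then carries that date's weighted mean, which is the same result the group_by/mutate pipeline produces.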
Upvotes: 3
Reputation: 35
Does this example answer your question?
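library(dplyr)  # provides %>%, group_by() and mutate()
date <- c(2017, 2017, 2018, 2019, 2018, 2019)
rank <- c(10, 12, 13, 11, 14, 15)
weight <- c(1.5, 1.1, 1.2, 1.3, 1.4, 1.7)
df <- data.frame(date, rank, weight)
df
# weighted mean of rank per date, using the weight column defined above
df <- df %>% group_by(date) %>% mutate(w = weighted.mean(rank, weight))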
You don't need to write your own function to do this ;)
Upvotes: 3