Reputation: 579
Here is an example of my data:
id score
1 82 0.50000
2 82 0.39286
3 82 0.56250
4 328 0.50000
5 328 0.67647
6 328 0.93750
7 328 0.91667
I want to make a column of moving average's of scores for each id.
So I need to somehow group the data by id then apply a MA function to that grouped data and then have the output as another column "MA_score"
I would like my output to look like this:
id score MA_score
1 82 0.50000 NULL
2 82 0.39286 0.xxxx
3 82 0.56250 NULL
4 328 0.50000 NULL
5 328 0.67647 0.yyyy
6 328 0.93750 0.qqqq
7 328 0.91667 NULL
Upvotes: 1
Views: 1573
Reputation: 5308
You could split your data by unique ID values, calculate the rolling mean (from 'zoo' package) for each of these unique IDs and append the results to your initial dataframe:
# Required packages
library(zoo)
# Data setup
df <- data.frame(id = c(82, 82, 82, 328, 328, 328, 328),
score = c(0.5, 0.39286, 0.5625, 0.5, 0.67647, 0.9375, 0.91667))
# Split data by unique IDs
df.sp <- split(df, df$id)
# Calculate rolling mean for each unique ID
df.ma <- lapply(seq(df.sp), function(i) {
rollmean(df.sp[[i]]$score, k = 3, na.pad = TRUE)
})
# Append column 'MA_score' to dataframe
for (i in seq(names(df.sp))) {
df[which(df$id == names(df.sp)[i]), "MA_score"] <- df.ma[[i]]
}
df
id score MA_score
1 82 0.50000 NA
2 82 0.39286 0.4851200
3 82 0.56250 NA
4 328 0.50000 NA
5 328 0.67647 0.7046567
6 328 0.93750 0.8435467
7 328 0.91667 NA
Upvotes: 2
Reputation: 59970
You could use split and rollapply from the zoo package as one of many ways to approach this. Note that in the example below I set the width of the rollapply function to 1 so it just returns each value. For widths greater than one it will take the mean of that number of values.
require(zoo)
sapply( split( df , df$id) , function(x) rollapply( x , width = 1 , align = 'left' , mean) )
#Note that by setting width = 1 we just return the value
$`82`
id score
[1,] 82 0.50000
[2,] 82 0.39286
[3,] 82 0.56250
$`328`
id score
[1,] 328 0.50000
[2,] 328 0.67647
[3,] 328 0.93750
[4,] 328 0.91667
If we were to set width = 3
you would get:
$`82`
id score
[1,] 82 0.48512
$`328`
id score
[1,] 328 0.7046567
[2,] 328 0.8435467
Or you could use aggregate in base
R:
aggregate( score ~ id , data = df , function(x) rollapply( x , width = 1 , align = 'left' , mean) )
id score
1 82 0.50000, 0.39286, 0.56250
2 328 0.50000, 0.67647, 0.93750, 0.91667
There are quite a few ways to do this. I would precisely define your moving average function though, because there are many ways to calculate it (check out for example TTR:::SMA
)
Or more straightforward using ave
:
within(df, { MA_score <- ave(score, id, FUN=function(x)
rollmean(x, k=3, na.pad = TRUE))})
Upvotes: 4