Reputation: 958
I am trying to run some moving averages over a dataframe with multiple groups. I am interested in the last SMA over a series of 20 for each group. The second example below crashes because one series (C) only has 10 values. What do I need to do to make this not crash? C needs to be kept in the result. I'm happy for C to be NA in the result.
library(plyr) # ddply()
library(TTR)  # SMA()
df <- data.frame(x=c(rep("A", 30), rep("B", 30),rep("C", 10)), y=rnorm(n = 70, 100, 20))
df
ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=10), 1)) # Works because all groups have at least 10 values
ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=20), 1)) # Does not work
Error in runSum(x, n) : n = 20 is outside valid range: [1, 10]
Upvotes: 2
Views: 90
Reputation: 2867
What you want is possibly() from the purrr library.
library(purrr)
ddply(df, .(x), summarise, SMA10= tail(possibly(SMA, otherwise = NA)(y, n=20), 1))
x SMA10
1 A 101.7075
2 B 91.9557
3 C NA
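To see what possibly() is doing here, a minimal sketch outside of ddply() may help. The name safe_sma and the two toy series are only for illustration and are not part of the question's data. possibly() wraps SMA() so that an error is replaced by the otherwise value instead of stopping the whole call.

```r
library(purrr)
library(TTR)

# Wrap SMA() so that any error yields NA instead of aborting
safe_sma <- possibly(SMA, otherwise = NA)

# Hypothetical toy series: one too short for n = 20, one long enough
short_series <- as.numeric(1:10)
long_series  <- as.numeric(1:30)

# SMA() would raise an error here (n = 20 > 10 values); the wrapper returns NA
short_result <- tail(safe_sma(short_series, n = 20), 1)

# Long enough: the last value is the mean of the final 20 observations
long_result <- tail(safe_sma(long_series, n = 20), 1)

print(short_result)
print(long_result)
```

Note that possibly() swallows every error, not just the "outside valid range" one, so a typo in the arguments would also silently give NA.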
Upvotes: 1
Reputation: 3914
This happens because of the SMA() function you are using:
library(TTR)
df <- data.frame(x=c(rep("A", 30), rep("B", 30),rep("C", 10)), y=rnorm(n = 70, 100, 20))
SMA(df$y[df$x=="C"], n=20)
#Error in runSum(x, n) : n = 20 is outside valid range: [1, 10]
If you look at the documentation of the SMA() function, you will see:
x: Price, volume, etc. series that is coercible to xts or matrix.
n: Number of periods to average over. Must be between 1 and nrow(x), inclusive.
So you first need to make sure each of your groups has at least n (n=20 in your case) elements.
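One way to do that check explicitly, if you want NA for groups that are too short, is sketched below. The helper name last_sma and the column name SMA20 are made up for this example, and set.seed() is only there to make the run repeatable.

```r
library(plyr)
library(TTR)

# Hypothetical helper: last SMA value, or NA when the series is shorter than n
last_sma <- function(y, n) {
  if (length(y) < n) {
    return(NA_real_)
  }
  tail(SMA(y, n = n), 1)
}

set.seed(1)
df <- data.frame(x = c(rep("A", 30), rep("B", 30), rep("C", 10)),
                 y = rnorm(n = 70, 100, 20))

# Apply the helper per group; group C has only 10 rows, so it gets NA
res <- ddply(df, .(x), function(d) data.frame(SMA20 = last_sma(d$y, n = 20)))
print(res)
```

This keeps C in the result without hiding other kinds of errors that SMA() might raise.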
Depending on what you need, you can use the min() function to cap the n argument of SMA() at the group size, i.e.:
ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=min(20,length(y))), 1))
x SMA10
#1 A 92.03348
#2 B 99.68643
#3 C 89.62087
Whether this gives you the correct result depends on what you are looking for: for group C, the value is an average over only its 10 observations, not a 20-period SMA.
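A quick check of what capping n at the group size means for a short group, as a sketch with a made-up 10-value series (y_c and n_eff are illustrative names): the last SMA value should simply equal the mean of all available values.

```r
library(TTR)

set.seed(42)
y_c <- rnorm(10, 100, 20)      # stand-in for a short group like C

n_eff <- min(20, length(y_c))  # window is capped at the 10 available values
last_value <- tail(SMA(y_c, n = n_eff), 1)

# With the window equal to the series length, the last SMA is the plain mean
same_as_mean <- isTRUE(all.equal(as.numeric(last_value), mean(y_c)))
print(same_as_mean)
```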
Upvotes: 0