Luc

Reputation: 958

R ddply catch errors

I am trying to run some moving averages over a dataframe with multiple groups. I am interested in the last value of a 20-period simple moving average (SMA) for each group. The second example below crashes because one series (C) only has 10 values. What do I need to do to make this not crash? C needs to be kept in the result. I'm happy for C to be NA in the result.

library(plyr) # provides ddply()
library(TTR)  # provides SMA()

df <- data.frame(x=c(rep("A", 30), rep("B", 30),rep("C", 10)), y=rnorm(n = 70, 100, 20))
df

ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=10), 1)) # Works because all groups have at least 10 values

ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=20), 1)) # Does not work
Error in runSum(x, n) : n = 20 is outside valid range: [1, 10]

Upvotes: 2

Views: 90

Answers (2)

DSGym

Reputation: 2867

What you want is possibly() from the purrr package. It wraps a function so that it returns a default value instead of throwing an error.

library(purrr)

ddply(df, .(x), summarise, SMA10= tail(possibly(SMA, otherwise = NA)(y, n=20), 1))


  x    SMA10
1 A 101.7075
2 B  91.9557
3 C       NA
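
For readers who would rather not add a dependency, here is a rough base-R sketch of the same idea using tryCatch(). The names last_sma() and or_default() are hypothetical helpers, not part of purrr, plyr or TTR; last_sma() is only a stand-in that assumes the last SMA value equals the mean of the final n observations and fails in the same situation as SMA().

```r
# Hypothetical stand-in for tail(SMA(y, n), 1): the mean of the last n values.
# Like SMA(), it throws an error when n exceeds the series length.
last_sma <- function(y, n) {
  if (n > length(y)) {
    stop("n = ", n, " is outside valid range: [1, ", length(y), "]")
  }
  mean(tail(y, n))
}

# Hypothetical base-R analogue of purrr::possibly():
# returns `otherwise` whenever f() throws an error.
or_default <- function(f, otherwise = NA_real_) {
  function(...) tryCatch(f(...), error = function(e) otherwise)
}

set.seed(1)
df <- data.frame(x = c(rep("A", 30), rep("B", 30), rep("C", 10)),
                 y = rnorm(n = 70, 100, 20))

# One value per group; group C is too short, so it falls back to NA.
res <- sapply(split(df$y, df$x), or_default(last_sma), n = 20)
res
```

Catching the error keeps every group in the result, but it also hides any other failure inside the wrapped function, so it is worth checking that NA really means "series too short".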

Upvotes: 1

Katia

Reputation: 3914

This happens because of the SMA() function that you are using:

library(TTR)

df <- data.frame(x=c(rep("A", 30), rep("B", 30),rep("C", 10)), y=rnorm(n = 70, 100, 20))
SMA(df$y[df$x=="C"], n=20)
#Error in runSum(x, n) : n = 20 is outside valid range: [1, 10]

If you look at the documentation of the SMA() function, you will see:

x: Price, volume, etc. series that is coercible to xts or matrix.

n: Number of periods to average over. Must be between 1 and nrow(x), inclusive.

So you first need to make sure each of your groups has at least n elements (n=20 in your case).
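
One way to make that check explicit is sketched below in base R. The helper last_sma_or_na() is a hypothetical name, and the sketch assumes the last SMA value is simply the mean of the final n observations, so it does not call SMA() itself.

```r
set.seed(1)
df <- data.frame(x = c(rep("A", 30), rep("B", 30), rep("C", 10)),
                 y = rnorm(n = 70, 100, 20))

# Hypothetical helper: last n-period simple moving average,
# or NA when the group has fewer than n values.
last_sma_or_na <- function(y, n) {
  if (length(y) >= n) mean(tail(y, n)) else NA_real_
}

# Apply the helper per group; every group is kept in the result.
res <- tapply(df$y, df$x, last_sma_or_na, n = 20)
res
```

Checking the length up front avoids the error entirely, so nothing has to be caught, and short groups such as C come back as NA.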

Depending on what you are doing, you can use the min() function to cap n within the SMA() call, i.e.:

ddply(df, .(x), summarise, SMA10= tail(SMA(y, n=min(20,length(y))), 1))
#  x    SMA10
#1 A 92.03348
#2 B 99.68643
#3 C 89.62087

Whether this gives you the correct result depends on what you are looking for: for group C, the value is a 10-period average rather than a 20-period one.

Upvotes: 0
