Reputation:
I have a data frame DF which looks like this:
ID Area time
1 1 182.685 1
2 2 182.714 1
3 3 182.275 1
4 4 211.928 1
5 5 218.804 1
6 6 183.445 1
...
1 1 184.334 2
2 2 196.765 2
3 3 186.435 2
4 4 213.322 2
5 5 214.766 2
6 6 172.667 2
... and so on, up to ID = 6. I want to apply an autocorrelation function to each ID, i.e. compare ID = 1 at time 1 with ID = 1 at time 2, and so on.
What is the most straightforward way to apply e.g. acf() to my data frame?
When I try to use
autocorr = aggregate(x = DF$Area, by = list(DF$ID), FUN = acf)
I get a weird object.
Thanks in advance!
Upvotes: 0
Views: 1803
Reputation: 73265
I want to apply an autocorrelation function on each ID
OK, good, so you don't want any cross-correlation, which makes things much easier.
I get a weird object
acf returns a list with several components (an object of class "acf"), not just the autocorrelation values. You are probably only interested in the ACF values, so you need:
FUN = function (u) c(acf(u, plot = FALSE)$acf)
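To see why the result looked odd, it can help to inspect what acf returns for a single series. This is only a sketch on a made-up vector u (not data from the question):

```r
## hypothetical series, for illustration only
set.seed(1)
u <- rnorm(20)

a <- acf(u, plot = FALSE)
class(a)      # "acf"
names(a)      # list components, including "acf" and "lag"
dim(a$acf)    # a 3-d array: (number of lags) x 1 x 1 for a single series

## c() drops the dimensions and leaves a plain numeric vector
v <- c(a$acf)
v[1]          # the lag 0 autocorrelation, which is always 1
```

The c() wrapper matters because sapply can only bind plain vectors of equal length into a matrix.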
Also, using aggregate is not a good idea. You may want split and sapply:
## suppose your data frame is called `x` (`DF` in the question)
oo <- sapply(split(x$Area, x$ID), FUN = function (u) c(acf(u, plot = FALSE)$acf) )
If you have balanced data, i.e., an equal number of observations for each ID, oo will be simplified into a matrix. If you do not have balanced data, you may want to explicitly control the lag.max argument of acf. By default, acf chooses this value from the number of observations (10 * log10(N) for a single series of length N, limited to N - 1), so IDs with different numbers of observations can give ACF vectors of different lengths.
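The difference between the balanced and unbalanced case can be sketched on simulated data. The data frames bal and unbal and the helper f below are made up for illustration; only the column names follow the question:

```r
set.seed(42)
## hypothetical balanced data: 3 IDs with 12 time points each
bal <- data.frame(ID   = rep(1:3, times = 12),
                  Area = rnorm(36, mean = 190, sd = 10),
                  time = rep(1:12, each = 3))

f <- function (u) c(acf(u, plot = FALSE)$acf)

oo_bal <- sapply(split(bal$Area, bal$ID), FUN = f)
is.matrix(oo_bal)    # TRUE: every ID yields the same number of lags

## drop the later time points of ID 3 to make the data unbalanced
unbal <- bal[!(bal$ID == 3 & bal$time > 5), ]

oo_unbal <- sapply(split(unbal$Area, unbal$ID), FUN = f)
is.list(oo_unbal)    # TRUE: vectors of unequal length are not simplified
lengths(oo_unbal)    # number of lags returned per ID
```

Note that split keeps the original row order within each ID, so the rows should already be sorted by time before splitting.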
Now suppose we want lag 0 to lag 7. We can set:
oo <- sapply(split(x$Area, x$ID),
             FUN = function (u) c(acf(u, plot = FALSE, lag.max = 7)$acf))
The result oo is then a matrix with 8 rows (one row per lag, one column per ID). I don't see any benefit in using a data frame to hold this result, but in case you want one, simply do:
data.frame(oo)
With data either in a matrix or a data frame, it is easy for you to do further analysis.
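Putting the pieces together, here is a minimal end-to-end sketch. The data are simulated (the values are not from the question), and the row labels are an optional addition:

```r
set.seed(42)
## hypothetical balanced data: 3 IDs, 12 time points each
x <- data.frame(ID   = rep(1:3, times = 12),
                Area = rnorm(36, mean = 190, sd = 10),
                time = rep(1:12, each = 3))

oo <- sapply(split(x$Area, x$ID),
             FUN = function (u) c(acf(u, plot = FALSE, lag.max = 7)$acf))

rownames(oo) <- paste0("lag", 0:7)   # optional: label rows by lag
dim(oo)                              # 8 rows (lags 0 to 7), 3 columns (IDs)

df <- data.frame(oo)
names(df)                            # data.frame() turns "1", "2", "3" into "X1", "X2", "X3"
```

If the original ID labels should be kept as column names, data.frame(oo, check.names = FALSE) avoids the renaming.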
For a complete description of acf, please read "Produce a boxplot for multiple ACFs".
Upvotes: 2