user7295926

Apply autocorrelation function acf() to elements of set of vectors by group in a data frame

I have a data frame DF which looks like this:

 ID      Area time
1  1 182.685    1
2  2 182.714    1
3  3 182.275    1
4  4 211.928    1
5  5 218.804    1
6  6 183.445    1
...
1  1 184.334    2
2  2 196.765    2
3  3 186.435    2
4  4 213.322    2
5  5 214.766    2
6  6 172.667    2

... and so on, up to ID = 6. I want to apply an autocorrelation function to each ID separately, i.e. compare ID = 1 at time 1 with ID = 1 at time 2, and so on.

What is the most straightforward way to apply e.g. acf() to my data frame?

When I try to use

autocorr = aggregate(x = DF$Area, by = list(DF$ID), FUN = acf)

I get a weird object.

Thanks in advance!

Upvotes: 0

Views: 1803

Answers (1)

Zheyuan Li

Reputation: 73265

I want to apply an autocorrelation function on each ID

OK, good, so you don't want any cross-correlation, which makes things much easier.

I get a weird object

acf does not return a plain vector; it returns an object of class "acf", which is a list with several components. I think you will only be interested in the ACF values, so you need:

FUN = function (u) c(acf(u, plot = FALSE)$acf)
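To see why the c() and $acf are needed, here is a small sketch on a made-up series (the vector u and the seed are assumptions for illustration, not the asker's data):

```r
set.seed(1)
u <- rnorm(20)               # hypothetical series for a single ID

a <- acf(u, plot = FALSE)    # compute without drawing the plot
class(a)                     # an object of class "acf"
names(a)                     # list components, including "acf" and "lag"
dim(a$acf)                   # a 3-d array: (lags) x 1 x 1 for one series

v <- c(a$acf)                # c() drops the dimensions, leaving a numeric vector
v[1]                         # the lag 0 autocorrelation, which is always 1
```

The $acf component is stored as an array so that it can also hold multivariate results; for a single series, wrapping it in c() is a convenient way to flatten it.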

Also, using aggregate is not a good idea. You may want split and sapply:

## assuming your data frame is called `x` (it is `DF` in the question)
oo <- sapply(split(x$Area, x$ID), FUN = function (u) c(acf(u, plot = FALSE)$acf) )

If you have balanced data, i.e., the same number of observations for each ID, sapply will simplify oo into a matrix. If your data are not balanced, you may want to set the lag.max argument of acf explicitly. By default, acf chooses this value based on the number of observations, so groups of different sizes can return different numbers of lags.
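As a rough illustration of the unbalanced case, the sketch below uses simulated data (the data frame, group sizes and the helper f are made up for the example). With the default lag.max the two groups return different numbers of lags, so sapply cannot simplify and gives back a list; fixing lag.max restores the matrix:

```r
set.seed(1)
# hypothetical unbalanced data: ID 1 has 30 observations, ID 2 has 12
x <- data.frame(ID = rep(1:2, times = c(30, 12)), Area = rnorm(42))

# default lag.max depends on the group size, so the lengths differ
f <- function (u) c(acf(u, plot = FALSE)$acf)
oo_default <- sapply(split(x$Area, x$ID), FUN = f)
is.list(oo_default)          # a list, not a matrix
lengths(oo_default)          # number of lags returned per ID

# same lag.max for every group: equal lengths, so sapply returns a matrix
oo_fixed <- sapply(split(x$Area, x$ID),
                   FUN = function (u) c(acf(u, plot = FALSE, lag.max = 5)$acf))
dim(oo_fixed)                # 6 rows (lag 0 to lag 5), 2 columns (IDs)
```

Note that acf limits lag.max to one less than the number of observations, so the value you choose should not exceed the size of the smallest group minus one.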

Now suppose we want lag 0 to lag 7. We can set:

oo <- sapply(split(x$Area, x$ID),
             FUN = function (u) c(acf(u, plot = FALSE, lag.max = 7)$acf) )

The result oo is then a matrix with 8 rows (one row per lag, one column per ID). I don't see much benefit in using a data frame to hold this result, but in case you want one, simply do:

data.frame(oo)

With data either in a matrix or a data frame, it is easy for you to do further analysis.
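As one possible follow-up, here is a sketch on simulated balanced data (three IDs with 20 observations each, all made up) that labels the rows by lag, keeps the ID values as column names, and reshapes the result to long format. The names oo_df and long are just for illustration:

```r
set.seed(1)
# hypothetical balanced data: 3 IDs with 20 observations each
x <- data.frame(ID = rep(1:3, each = 20), Area = rnorm(60))

oo <- sapply(split(x$Area, x$ID),
             FUN = function (u) c(acf(u, plot = FALSE, lag.max = 7)$acf))

# name both dimensions: rows are lags 0..7, columns are IDs
dimnames(oo) <- list(lag = 0:7, ID = colnames(oo))

# wide data frame; check.names = FALSE keeps "1", "2", "3" as column names
oo_df <- data.frame(oo, check.names = FALSE)

# long format: one row per (lag, ID) pair
long <- as.data.frame(as.table(oo), responseName = "acf")
head(long)

# example of further analysis: average ACF at each lag across IDs
rowMeans(oo)
```

Without check.names = FALSE, data.frame would rename the columns to X1, X2, X3, because names that start with a digit are not syntactically valid in R.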

-----------

For a complete description of acf, please read the Q&A "Produce a boxplot for multiple ACFs".

Upvotes: 2
