zazizoma

Reputation: 567

R grouped time series correlations with tidyverse

I want time series correlations in a grouped data frame. Here's a sample dataset:

library(dplyr)
x <- cbind(expand.grid(type = letters[1:4], time = 1:4, kind = letters[5:8]), value = rnorm(64)) %>% arrange(type, time, kind)

which produces 64 rows of the variables type, time, kind and value.

I want a time series correlation of the values for each kind grouped by type. Think of each type and time combination as an ordered vector of 4 values, one per kind. I group by type, arrange by type, time and kind so the values are in a consistent order, then remove kind.

y <- x %>% group_by(type) %>% arrange(type, time, kind) %>% select(-kind)

I can then group y by type and time and nest it so that the 4 values for each combination sit together in the data list-column, then regroup by type only and create a new variable, ahead, which is the lead of data (the values for the next time point).

library(tidyr)
z <- y %>% group_by(type, time) %>% nest(data = value) %>% group_by(type) %>% mutate(ahead = lead(data))

Now I want to run mutate(R = cor(data, ahead)), but I can't seem to get the syntax correct.

I've also tried mutate(R = cor(data$value, ahead$value)) and mutate(R = cor(data[1]$value, ahead[1]$value)), to no avail.

The error I get from cor is: supply both 'x' and 'y' or a matrix-like 'x'.

How do I reference the data and ahead variables as vectors to run with cor?

Ultimately, I'm looking for a 16 row data frame with columns type, time, and R where R is a single correlation value.

Thank you for your attention.

Upvotes: 0

Views: 349

Answers (1)

Ronak Shah

Reputation: 388862

We can use map2_dbl from purrr to pass data and ahead to the cor function in parallel, one pair of list elements at a time.

library(dplyr)

z %>%
  mutate(R = purrr::map2_dbl(data, ahead, cor)) %>%
  select(-data, -ahead)

#  type   time     R
#  <fct> <int>   <dbl>
# 1 a         1  0.358 
# 2 a         2 -0.0498
# 3 a         3 -0.654 
# 4 a         4  1     
# 5 b         1 -0.730 
# 6 b         2  0.200 
# 7 b         3 -0.928 
# 8 b         4  1     
# 9 c         1  0.358 
#10 c         2  0.485 
#11 c         3 -0.417 
#12 c         4  1     
#13 d         1  0.140 
#14 d         2 -0.448 
#15 d         3 -0.511 
#16 d         4  1     

In base R, we can use mapply, which walks over z$data and z$ahead in parallel in the same way:

z$R <- mapply(cor, z$data, z$ahead)
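The mapply call still relies on z having been built with the tidyverse pipeline. As a rough cross-check, here is a minimal sketch that gets the same 16-row shape using base R only. It assumes the sample data from the question; the seed and the names vals and res are added here for illustration, and it returns NA where a type has no later time point.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
x <- cbind(expand.grid(type = letters[1:4], time = 1:4, kind = letters[5:8]),
           value = rnorm(64))
x <- x[order(x$type, x$time, x$kind), ]

# One list element per type/time pair, each holding 4 values ordered by kind.
# split() varies the first factor (time) fastest: 1.a, 2.a, 3.a, 4.a, 1.b, ...
vals <- split(x$value, list(x$time, x$type))

# expand.grid() also varies its first argument fastest, so rows match `vals`
res <- expand.grid(time = 1:4, type = letters[1:4])[, c("type", "time")]

res$R <- vapply(seq_len(nrow(res)), function(i) {
  if (res$time[i] == 4) return(NA_real_)  # no later time point in this type
  cor(vals[[i]], vals[[i + 1]])           # this time point vs. the next one
}, numeric(1))

res  # 16 rows: type, time, R
```

Working on plain numeric vectors sidesteps the question of how cor treats one-column data frames, at the cost of hard-coding the number of time points.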

Upvotes: 1
