Reputation: 27
I'm trying to correlate time series data between pairs of sites. I'm working with a data set that looks like this,
where A = Site, B = Year and C = Abundance:
      A    B  C
 [1,] 1 2002 21
 [2,] 1 2004 25
 [3,] 1 2005 26
 [4,] 2 3003 24
 [5,] 2 2004 20
 [6,] 2 2005 20
 [7,] 3 2002 21
 [8,] 3 2003 22
 [9,] 3 2004 23
[10,] 3 2005 25
I want to split the data by column A and test the correlation between each pair of sites (site 1 with 2 and 3, site 2 with 1 and 3, etc.).
I've tried: mydata.cor = cor(dat, method = c("spearman"))
But this just correlates the columns:
           A          B          C
A  1.0000000 -0.1287697 -0.1684834
B -0.1287697  1.0000000  0.4151682
C -0.1684834  0.4151682  1.0000000
Is there a way to specify a grouping variable, in this case the site category?
Upvotes: 1
Views: 254
Reputation: 333
One way to do it is to transform your data to wide format (one column per site) and then use cor() as you intended, on the relevant columns.
Here's my reproducible code. I assumed that the 3003 in row 4 was a typo for 2003.
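library(data.table)

# Rebuild the example data: site (A), year (B), abundance (C)
mat <- matrix(c(
  1, 2002, 21,
  1, 2004, 25,
  1, 2005, 26,
  2, 2003, 24,
  2, 2004, 20,
  2, 2005, 20,
  3, 2002, 21,
  3, 2003, 22,
  3, 2004, 23,
  3, 2005, 25), nrow = 10, ncol = 3, byrow = TRUE)
DT <- data.table(mat)
setnames(DT, c("A", "B", "C"))

# Long to wide: one row per year (B), one abundance column per site (A)
DT_wide <- dcast(DT, B ~ A, value.var = "C")

# Drop the year column and correlate the site columns pairwise
cor(DT_wide[, -1], method = "spearman", use = "pairwise.complete.obs")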
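If you'd rather not depend on data.table, the same idea can be sketched in base R with reshape(). This is only a rough equivalent, under the same assumption that 3003 was meant to be 2003. The names dat, wide and site_cor are just illustrative.

```r
# Same data as above, with 3003 treated as a typo for 2003 (assumption).
dat <- data.frame(
  A = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),                               # site
  B = c(2002, 2004, 2005, 2003, 2004, 2005, 2002, 2003, 2004, 2005), # year
  C = c(21, 25, 26, 24, 20, 20, 21, 22, 23, 25)                      # abundance
)

# One row per year (B), one abundance column per site (C.1, C.2, C.3).
# Years in which a site was not sampled become NA.
wide <- reshape(dat, idvar = "B", timevar = "A", direction = "wide")

# Spearman correlation for every site pair, using only the years
# that both sites of a pair have in common. The year column is dropped by name.
site_cor <- cor(wide[, names(wide) != "B"],
                method = "spearman", use = "pairwise.complete.obs")
print(site_cor)
```

Note that with this small example, sites 1 and 2 only share 2004 and 2005, and site 2 has the same abundance in both years, so cor() returns NA for that pair and warns that the standard deviation is zero. With longer series that shouldn't be an issue.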
Upvotes: 2