Corr between every nth variable

Question

I have a data frame with many columns representing different variables measured over time. I am trying to compute the correlations between the three time points for each variable, e.g.

cor(df1[,c(7,36,65)], use = "p")
cor(df1[,c(8,37,66)], use = "p")
cor(df1[,c(9,38,67)], use = "p")

This is time consuming and I want to be able to run this if I add / remove columns in the near future. As you can see, it obviously follows a pattern and I have tried achieving this using apply:

apply(df1[,c(7:93)], 2, function(x) corr(df1[,c(x, x+29, x+58)], use = "p"))

I've also tried a for loop:

for (i in 7:93) {
  cor(df1[, c(i,i+29,i+58)], use = "p")
}

Obviously I am making mistakes in my writing of both of these. I know there has to be an easy way to do this that I am missing!

Ronak Shah · Accepted Answer

We can use mapply for selecting the columns in parallel.

mapply(function(x, y, z) cor(df1[,c(x, y, z)], use = "p"), 7:35, 36:64, 65:93)
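
Note that mapply simplifies its result by default, so each 3 x 3 correlation matrix is flattened into a column of 9 values. If you would rather keep the matrices intact, a minimal sketch along these lines may help. It uses a made-up data frame `df_demo` in place of `df1`, with column positions adjusted to match, and passes SIMPLIFY = FALSE:

```r
# Toy data standing in for df1: 3 variables measured at 3 time points
# (columns 1:3 = time 1, 4:6 = time 2, 7:9 = time 3). All names are made up.
set.seed(1)
df_demo <- as.data.frame(matrix(rnorm(20 * 9), nrow = 20))

# SIMPLIFY = FALSE returns a list with one 3 x 3 correlation matrix per variable
cors <- mapply(
  function(x, y, z) cor(df_demo[, c(x, y, z)], use = "p"),
  1:3, 4:6, 7:9,
  SIMPLIFY = FALSE
)

cors[[1]]  # correlations of the first variable across the three time points
```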

Or, building on your own attempt, another solution (similar to @akrun's) could be:

sapply(7:35, function(x) cor(df1[,c(x, x+29, x+58)], use = "p"))
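
As for why the attempts in the question did not work: apply(..., 2, ...) passes each column's values to the function, not its position, so `x + 29` is arithmetic on the data; `corr` is not a base R function (the function is `cor`); and a for loop neither prints nor stores the value of `cor()` unless you ask it to. Looping over 7:93 would also push `i + 58` past column 93. To cope with columns being added or removed, one option is to compute the positions from the layout instead of hard-coding them. This is only a sketch under assumed names: `first`, `n_vars`, `n_times`, `cols_for` and `df_demo` are all hypothetical.

```r
# Hypothetical layout: n_vars variables, repeated at n_times time points,
# starting at column `first`. Toy data; names are illustrative only.
set.seed(42)
first   <- 2
n_vars  <- 4
n_times <- 3
df_demo <- data.frame(id = 1:30, matrix(rnorm(30 * n_vars * n_times), nrow = 30))

# Column positions of variable i at every time point
cols_for <- function(i) first + (i - 1) + n_vars * (seq_len(n_times) - 1)

# lapply keeps one correlation matrix per variable in a list
cors <- lapply(seq_len(n_vars), function(i) cor(df_demo[, cols_for(i)], use = "p"))

# Label each matrix with the name of the variable's first-time-point column
names(cors) <- names(df_demo)[first + seq_len(n_vars) - 1]
```

With the layout in the question this would correspond to first = 7, n_vars = 29 and n_times = 3, so only those values need changing when columns are added or removed.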
