bluepole

Reputation: 345

Compute the correlation by looping through factor levels in a dataframe

Here is the data structure in R:

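library(MASS)          # for mvrnorm()
ns <- 10; nt <- 20     # number of subjects; number of levels of T
dat <- data.frame(
          Subj  = rep(c(paste0('S',1:ns), paste0('S',1:ns)), nt),
          F     = rep(c(rep('f1', ns), rep('f2',ns)), nt),
          T     = rep(paste0('t', 1:nt), each=2*ns),
          # subject-level effects (correlation 0.7 between f1 and f2),
          # recycled over the levels of T, plus independent noise
          y     = c(mvrnorm(n=ns, mu=c(0, 0), Sigma=matrix(c(1,0.7,0.7,1), nrow=2,ncol=2)))
                  +rnorm(2*ns*nt, 0, 1) )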

I want to compute the correlation of the variable y between the two levels (f1 and f2) of the factor F, separately for each level of the factor Subj. This should yield 10 correlations in this example. One more condition: within each subject, the two vectors passed to the correlation must be arranged in the same order with respect to the levels of the factor T, so that the values are paired by T.

How can I achieve this? Thanks!

Upvotes: 0

Views: 49

Answers (1)

ekoam

Reputation: 8844

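You can use by() in base R: sort the rows by T so that the f1 and f2 values of each subject line up, then compute the correlation within each level of Subj.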

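# Sort by T so the f1 and f2 values of each subject share the same T order
subdat <- dat[order(dat$T), c("y", "F", "Subj")]
# Per subject, correlate the f1 values with the f2 values
by(subdat, subdat$Subj, function(x) with(x, cor(y[F == "f1"], y[F == "f2"])))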

Output

subdat$Subj: S1
[1] -0.03755675
--------------------------------------------------------------------------------- 
subdat$Subj: S10
[1] -0.05481364
--------------------------------------------------------------------------------- 
subdat$Subj: S2
[1] 0.2822211
--------------------------------------------------------------------------------- 
subdat$Subj: S3
[1] 0.2671967
--------------------------------------------------------------------------------- 
subdat$Subj: S4
[1] 0.1268404
--------------------------------------------------------------------------------- 
subdat$Subj: S5
[1] 0.0374699
--------------------------------------------------------------------------------- 
subdat$Subj: S6
[1] 0.5655247
--------------------------------------------------------------------------------- 
subdat$Subj: S7
[1] 0.2141196
--------------------------------------------------------------------------------- 
subdat$Subj: S8
[1] 0.250178
--------------------------------------------------------------------------------- 
subdat$Subj: S9
[1] 0.1370734
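
If you would rather not rely on row order, one possible variant is to pair the f1 and f2 values explicitly on T with merge() and collect the results in a named vector with sapply(). This is only a sketch: the toy data frame and the name cor_by_subj are made up for illustration and are not part of the question's simulation. The toy values are chosen so that the expected correlations are obvious (f2 rises with f1 for S1 and falls with f1 for S2).

```r
# Toy data (hypothetical values): 2 subjects x 2 levels of F x 4 levels of T
toy <- data.frame(
  Subj = rep(c("S1", "S2"), each = 8),
  F    = rep(rep(c("f1", "f2"), each = 4), times = 2),
  T    = rep(paste0("t", 1:4), times = 4),
  y    = c(1, 2, 3, 4,   2, 4, 6, 8,    # S1: f2 = 2 * f1
           1, 2, 3, 4,   4, 3, 2, 1),   # S2: f2 = 5 - f1
  stringsAsFactors = FALSE
)
toy <- toy[16:1, ]   # reverse the rows so nothing depends on row order

# Within each subject, pair the f1 and f2 values by T, then correlate
cor_by_subj <- sapply(split(toy, toy$Subj), function(d) {
  f1 <- d[d$F == "f1", c("T", "y")]
  f2 <- d[d$F == "f2", c("T", "y")]
  m  <- merge(f1, f2, by = "T", suffixes = c(".f1", ".f2"))
  cor(m$y.f1, m$y.f2)
})
cor_by_subj   # named numeric vector, one correlation per subject
```

Applied to dat from the question (split(dat, dat$Subj) in place of the toy data), the same function should return one correlation per subject, 10 in total. A side effect of merge() is that a level of T missing for either f1 or f2 is dropped from the pairing instead of silently misaligning the two vectors.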

Upvotes: 1
