statler

Reputation: 1381

R - correlation between column subsets - reference current row

I have a set of data such as:

       name      Exp1Res1   Exp1Res2   Exp1Res3   Exp2Res1   Exp2Res2   Exp2Res3
[1]     ID1         5          7          9          7          9          2
[2]     ID2         6          4          2          9          5          1
[3]     ID3         4          9          9          9          11         2

I need to determine the correlation between experiments 1 and 2 for each row. As there are actually 37 columns and 100,000 rows in my dataset (FullSet), my original solution of looping through the rows (see below) is far too slow, so I would like to optimize it.

My original solution was:

# One row per ID, with three columns: ID, pearson, spearman
df <- data.frame(matrix(ncol = 3, nrow = nrow(FullSet)))
names(df) <- c("ID", "pearson", "spearman")
for (i in seq_len(nrow(FullSet)))
{
    # Columns 2:19 hold experiment 1, columns 20:37 hold experiment 2
    pears <- cor(as.numeric(t(FullSet[i,2:19])), as.numeric(t(FullSet[i,20:37])), method="pearson")
    spear <- cor(as.numeric(t(FullSet[i,2:19])), as.numeric(t(FullSet[i,20:37])), method="spearman")
    df[i,] <- c(FullSet[i,1], pears, spear)
}

I feel something like this should work:

FullSet$pearson<-cor(as.numeric(t(FullSet[,2:19])),as.numeric(t(FullSet[,20:37])), method="pearson")

but I don't know whether, or how, I can reference just the current row in the transpose. That is, t(FullSet[,2:19]) should read something like t(FullSet[<currow>,2:19]).

Help would be appreciated. I don't know if my approach is even correct.

Output should look like this (the values are not correct, they are for illustration only):

       name      Pearson     Spearman
[1]     ID1         0.8          0.75
[2]     ID2         0.9          0.8
[3]     ID3         0.85         0.7

Upvotes: 1

Views: 538

Answers (1)

Seb

Reputation: 5497

What about bringing it into this format:

ID  EXP  Res
1    1    .
1    1    .
1    2    .
1    2    .

by using reshape and then letting plyr do the work:

require(plyr)
ddply(df, .(ID, EXP), summarize, cor(...))

Would that be a possibility, if you do it for Spearman and for Pearson separately?
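A minimal sketch of the long-format idea in base R (split/lapply instead of plyr, so that it runs without extra packages). It assumes the columns are ordered as in the question, with all results for experiment 1 first and the same number of results for experiment 2 after them. The toy data, the column ranges 2:4 and 5:7, and the names `long` and `result` are illustrative assumptions; for the real FullSet the ranges would presumably be 2:19 and 20:37. Note that it groups by ID only, since the correlation needs both experiments of an ID at the same time.

```r
# Toy data in the shape described in the question (values taken from its example).
FullSet <- data.frame(
  name     = c("ID1", "ID2", "ID3"),
  Exp1Res1 = c(5, 6, 4), Exp1Res2 = c(7, 4, 9),  Exp1Res3 = c(9, 2, 9),
  Exp2Res1 = c(7, 9, 9), Exp2Res2 = c(9, 5, 11), Exp2Res3 = c(2, 1, 2)
)

exp1_cols <- 2:4  # assumption: experiment 1 columns (2:19 in the real data)
exp2_cols <- 5:7  # assumption: experiment 2 columns (20:37 in the real data)
cols <- c(exp1_cols, exp2_cols)

# Long format: one row per (ID, EXP, Res). unlist() walks the data frame
# column by column, so the IDs repeat once per column and the first half
# of the values belongs to experiment 1.
long <- data.frame(
  ID  = rep(as.character(FullSet$name), times = length(cols)),
  EXP = rep(c(1, 2), each = nrow(FullSet) * length(exp1_cols)),
  Res = unlist(FullSet[, cols], use.names = FALSE)
)

# One group per ID; within a group, correlate experiment 1 with experiment 2.
result <- do.call(rbind, lapply(split(long, long$ID), function(d) {
  x <- d$Res[d$EXP == 1]
  y <- d$Res[d$EXP == 2]
  data.frame(
    name     = d$ID[1],
    pearson  = cor(x, y, method = "pearson"),
    spearman = cor(x, y, method = "spearman")
  )
}))
rownames(result) <- NULL
print(result)
```

cor() is still called once per ID here, so this is tidier than the loop but not necessarily faster. I have not timed it on 100,000 rows.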

Upvotes: 4
