Tom Wenseleers

Reputation: 8019

R: construct data frame with all pairwise correlations & significance levels between numeric variables

To get the pairwise correlations and significance levels between all numeric variables in a data frame, I wrote the following little function:

corwithsign=function(df,type="pearson") {
  df=df[,sapply(df,is.numeric),drop=FALSE] # only keep numeric variables in data frame
  vars=names(df)
  nvars=length(vars)
  nvals=(nvars*nvars-nvars)/2 # number of pairwise correlations between the variables
  vars1=vars2=vector("character",nvals) # empty vectors to store the variable names
  cors=pvals=n=vector("numeric",nvals) # empty vectors to store the numeric results
  row=1 # row of output table
  for (v1 in seq_len(max(nvars-1,0))) {
    for (v2 in ((v1+1):nvars)) {
      var1=vars[[v1]]; var2=vars[[v2]]
      vars1[[row]]=var1; vars2[[row]]=var2
      # cor.test has no "use" argument; it drops incomplete pairs by itself
      out=cor.test(df[,var1],df[,var2],method=type)
      cors[[row]]=out$estimate
      pvals[[row]]=out$p.value
      # count complete pairs directly; out$parameter (df) only exists for Pearson
      n[[row]]=sum(complete.cases(df[,var1],df[,var2]))
      row=row+1
    }
  }
  # build the columns directly; cbind() would coerce everything to character
  data.frame(var1=vars1,var2=vars2,r=cors,p=pvals,n=n,stringsAsFactors=FALSE)
}

corwithsign(mtcars,type="pearson")
   var1 var2                   r                    p  n
1   mpg  cyl  -0.852161959426613 6.11268714258096e-10 32
2   mpg disp  -0.847551379262479  9.3803265373813e-10 32
3   mpg   hp  -0.776168371826586 1.78783525412106e-07 32
4   mpg drat   0.681171907806749 1.77623992874132e-05 32
5   mpg   wt  -0.867659376517228 1.29395870135052e-10 32
6   mpg qsec   0.418684033921778   0.0170819884965197 32
7   mpg   vs   0.664038919127593 3.41593725443623e-05 32
8   mpg   am   0.599832429454648 0.000285020743935105 32
9   mpg gear   0.480284757338842  0.00540094822470749 32
10  mpg carb  -0.550925073902459  0.00108444622049168 32
...

Is there a shorter, more elegant way to do this, or is this functionality already implemented in some package? (I saw some references to rcorr in Hmisc, but that returns two matrices, which is no good for me, as I just want a data frame as output.)

Any thoughts?

cheers, Tom

Upvotes: 0

Views: 1209

Answers (1)

Tom Wenseleers

Reputation: 8019

The psych package has a nice corr.test function that reports more than base cor.test does. In particular,

library(psych)
corr.test(mtcars)$ci

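comes pretty close to what corwithsign does: it returns a data frame with one row per pair of variables, holding the correlation, its confidence bounds, and the p-value.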
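If you would rather stay in base R and keep the var1/var2 columns separate, the loop in the question can probably be shortened with combn. This is only a minimal sketch: pairwise_cor is a made-up name (not a function from any package), and it assumes the data frame has at least two numeric columns.

```r
# Sketch: all pairwise correlations as one data frame, base R only.
# pairwise_cor is a hypothetical helper name, not part of any package.
# Assumes df contains at least two numeric columns (combn fails otherwise).
pairwise_cor <- function(df, method = "pearson") {
  df <- df[, sapply(df, is.numeric), drop = FALSE]  # keep numeric columns only
  pairs <- combn(names(df), 2, simplify = FALSE)    # every unordered pair of column names
  rows <- lapply(pairs, function(p) {
    x <- df[[p[1]]]
    y <- df[[p[2]]]
    out <- cor.test(x, y, method = method)          # incomplete pairs are dropped internally
    data.frame(var1 = p[1], var2 = p[2],
               r = unname(out$estimate),            # correlation estimate
               p = out$p.value,                     # unadjusted p-value
               n = sum(complete.cases(x, y)),       # number of complete pairs used
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)                              # stack the one-row data frames
}

res <- pairwise_cor(mtcars)
head(res, 3)
```

Building each row as its own data frame keeps r, p and n numeric, so the result can be sorted or filtered directly, e.g. res[res$p < 0.05, ].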

Upvotes: 2
