takeITeasy

Reputation: 360

Correlation between groups in R

df <- data.frame(row.names = c('S.5.0U0','S.6.0U1','S.7.0U2','S.8.0U3'),vara=c(-1.2,15,8.5,0),varb=c(-29,29,2.6,5),var1=c(-0.5,1.5,58,0),var2=c(-2.09,-12,2.6,-0.75),var3 = c(0,0.056,-12,5.5))
> df
        vara  varb var1   var2    var3
S.5.0U0 -1.2 -29.0 -0.5  -2.09   0.000
S.6.0U1 15.0  29.0  1.5 -12.00   0.056
S.7.0U2  8.5   2.6 58.0   2.60 -12.000
S.8.0U3  0.0   5.0  0.0  -0.75   5.500

I want to correlate vara with each of var1, var2 and var3, and likewise varb with each of var1, var2 and var3.

I tried this...

ab <- subset(df,select = c(`vara`,`varb`))
other <- subset(df,select = c(`var1`,`var2`,`var3`))

for(n in 1/length(other)){
  n
  for(t in 1/length(ab)){
    t
    corr <- broom::tidy(cor.test(n,t))
  }
}

Error in cor.test.default(nutrient, taxa) : 
  not enough finite observations

...and that

apply(df[ ,c(1:2)],2, function(x) cor.test(x, df[ ,c(3:5)]) )

Neither attempt works. I have seen "Correlation between multiple variables of a data frame". The problem is that in my real data the two groups I want to correlate are large, so I really need something like a loop or apply.

Thank you

EDIT: specification of the problem:

I want to use cor.test because I want to obtain both the correlation coefficient and the p-value, in a list. When I just use cor.test(df), this error occurs: 'x' and 'y' must have the same length

Upvotes: 1

Views: 674

Answers (2)

pieterbons

Reputation: 1724

You can use pivot_longer() to make a long list of all the combinations of variables that you want to correlate with each other. Then you can use group_by() to calculate the p-value and estimate of the correlation between all combinations:

library(tidyr)
library(dplyr) 

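df %>%
  # Stack var1..var3, then vara..varb, so each row pairs one "left" value with one "right" value
  pivot_longer(cols = var1:var3, names_to = "variable_right", values_to = "value_right") %>%
  pivot_longer(cols = vara:varb, names_to = "variable_left", values_to = "value_left") %>%
  group_by(variable_left, variable_right) %>%
  # One row per pair of variables, with p-value and coefficient from cor.test()
  summarise(p.value = cor.test(value_left, value_right)$p.value,
            estimate = cor.test(value_left, value_right)$estimate,
            .groups = "drop")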
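
For comparison, the loop from the question can probably be made to work by iterating over column names instead of `1/length(...)`, which evaluates to a single number rather than a sequence of columns. Below is a minimal base-R sketch, assuming the `df` from the question; the names `left`, `right`, `ct` and `results` are illustrative and not from the original post.

```r
# Sample data from the question
df <- data.frame(
  row.names = c("S.5.0U0", "S.6.0U1", "S.7.0U2", "S.8.0U3"),
  vara = c(-1.2, 15, 8.5, 0),
  varb = c(-29, 29, 2.6, 5),
  var1 = c(-0.5, 1.5, 58, 0),
  var2 = c(-2.09, -12, 2.6, -0.75),
  var3 = c(0, 0.056, -12, 5.5)
)

# Column names of the two groups (illustrative variable names)
left  <- c("vara", "varb")
right <- c("var1", "var2", "var3")

results <- list()
for (l in left) {
  for (r in right) {
    # cor.test() needs two numeric vectors of equal length
    ct <- cor.test(df[[l]], df[[r]])
    results[[paste(l, r, sep = "_")]] <- data.frame(
      variable_left  = l,
      variable_right = r,
      estimate       = unname(ct$estimate),
      p.value        = ct$p.value
    )
  }
}

# Combine the per-pair rows into one data frame
results <- do.call(rbind, results)
results
```

Each row of `results` holds the Pearson coefficient and p-value for one left/right pair, which should match what the `summarise()` call produces.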

Upvotes: 2

Yuriy Saraykin

Reputation: 8880

Try it this way:

base

cor(df)[1, 3:5]
cor(df)[2, 3:5]

or

cor(df)[1:2, 3:5]

tidyverse

library(tidyverse)
map_dbl(df[3:5], ~ cor(df$vara, .x))
map_dbl(df[3:5], ~ cor(df$varb, .x))
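
The `cor()` calls above return coefficients only. If p-values are needed as well, one possible sketch is to pass both groups to `cor()` as `x` and `y`, so that only the cross-correlations are computed, and then derive two-sided Pearson p-values from the t statistic. This assumes the `df` from the question and no missing values (so every pair uses the same `n`); `r_mat`, `t_mat` and `p_mat` are illustrative names.

```r
# Sample data from the question
df <- data.frame(
  row.names = c("S.5.0U0", "S.6.0U1", "S.7.0U2", "S.8.0U3"),
  vara = c(-1.2, 15, 8.5, 0),
  varb = c(-29, 29, 2.6, 5),
  var1 = c(-0.5, 1.5, 58, 0),
  var2 = c(-2.09, -12, 2.6, -0.75),
  var3 = c(0, 0.056, -12, 5.5)
)

# Correlate columns of the first group with columns of the second group only
r_mat <- cor(df[c("vara", "varb")], df[c("var1", "var2", "var3")])

# Two-sided p-values for Pearson's r via the t distribution
# (assumes complete data, so n is identical for every pair)
n     <- nrow(df)
t_mat <- r_mat * sqrt((n - 2) / (1 - r_mat^2))
p_mat <- 2 * pt(-abs(t_mat), df = n - 2)

r_mat
p_mat
```

Both results are 2 x 3 matrices with `vara`/`varb` as rows and `var1`..`var3` as columns, so an entry such as `p_mat["vara", "var1"]` can be compared against `cor.test(df$vara, df$var1)$p.value`.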

Upvotes: 2
