mountain_view

Reputation: 11

R for loop to compute correlation between one x and multiple y variables

I am trying to compute the correlation of one x variable with multiple y variables. I am using a for loop similar to this code:

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

for (y in grep("var", colnames(df), value = TRUE)) {
  summarise(df, cor(x, y))
}

I receive the following error message:

Error in `summarise()`:
! Problem while computing `..1 = cor(x, y)`.
Caused by error in `cor()`:
! 'y' must be numeric

My guess is that the "y" in the call to cor() is not being interpreted as a variable name. Does anyone have a hint on how to fix this?

Upvotes: 0

Views: 98

Answers (1)

stefan

Reputation: 123783

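Your guess is right: inside the loop, y holds the column name as a character string (e.g. "var1"), not the column itself, so cor() receives a string and complains that 'y' must be numeric. Using dplyr::across you can skip the loop and compute all the correlations in one call: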

set.seed(123)

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

library(dplyr)

summarise(df, across(!x, ~ cor(x, .x)))
#>          var1      var2      var3
#> 1 -0.04953215 -0.129176 -0.044079
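
If you would rather keep the loop, here is a minimal base R sketch of one way to do it. The names y_cols, cors, cors_vec and cors_mat are just illustrative. The key point is that df[[y]] looks up the column whose name is stored in y. Inside dplyr verbs the equivalent is the .data pronoun, i.e. summarise(df, cor(x, .data[[y]])). Note also that a for loop does not auto-print its results, so you need to store them or wrap the call in print().

```r
set.seed(123)

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

# Names of the y columns, as character strings
y_cols <- grep("^var", colnames(df), value = TRUE)

# Loop version: df[[y]] fetches the column named by the string in y,
# and the result is stored in a named numeric vector
cors <- numeric(0)
for (y in y_cols) {
  cors[y] <- cor(df$x, df[[y]])
}

# Loop-free alternative: apply cor() to each selected column
cors_vec <- sapply(df[y_cols], function(col) cor(df$x, col))

# cor() also accepts a data frame as its second argument and
# returns a 1 x 3 matrix with one column per y variable
cors_mat <- cor(df$x, df[y_cols])

print(cors)
```

All three variants should give the same values. Which one to use is mostly a matter of taste, though the across() approach above fits best if the rest of your pipeline is already in dplyr.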

Upvotes: 1
