CTD

Reputation: 43

How to write a function that calculates the correlation between variables

I need to write a function that takes three arguments: dat, the name of the data frame; mainVar, a character vector naming the variable to correlate with each variable in the third argument; and varlist, a character vector containing one or more variable names.

The function will return a data frame that contains the correlation coefficient and the corresponding p-value between each pair.

An example of what I'm looking to achieve:

 myCortest (chol, "wt", "age")
     var1  var2          R            p
 age   wt   age  0.6660014 5.631448e-26

What I have so far:

myCortest <- function(dat, mainVar, varlist){
result <- data.frame()
for (i in 1:length(mainVar)){
foo <- cor.test(dat$mainvar, dat$varlist)
r <- data.frame(Varname = mainVar[i],
R <- as.vector(foo$estimate[1]),
P <- foo$p.value)
result <- rbind(result, r)
}
return(result)
}

My code won't run so I know I'm doing something wrong. How can I achieve my desired output?

Upvotes: 2

Views: 639

Answers (1)

NelsonGon

Reputation: 13319

If I understood the aim correctly, here is a simple sample function:

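myCortest <- function(dat, mainVar, varlist){
  # Run one cor.test() per variable in varlist against mainVar;
  # the result is a list with one single-row data frame per variable
  lapply(varlist, function(x){
    # [[ ]] looks up a column by the name stored in the string
    foo1 <- cor.test(dat[[mainVar]], dat[[x]])
    # Note: R.Sq holds the correlation coefficient r, not r squared
    data.frame(Var1 = mainVar, Var2 = x,
               p.value = foo1$p.value, R.Sq = foo1$estimate)
  })
}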

Test it:

myCortest (iris, "Sepal.Length", c("Petal.Length","Sepal.Width"))

Output:

[[1]]
            Var1         Var2      p.value      R.Sq
cor Sepal.Length Petal.Length 1.038667e-47 0.8717538

[[2]]
            Var1        Var2   p.value       R.Sq
cor Sepal.Length Sepal.Width 0.1518983 -0.1175698
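
Since the question asks for a single data frame rather than a list, one possible variation is to bind the per-variable rows together with do.call(rbind, ...). This is only a sketch: the name myCortestDF and the column names var1, var2, R and p are assumptions chosen to mirror the desired output in the question, not part of the function above. It also illustrates why dat$mainvar in the question's code cannot work: $ looks for a column literally called "mainvar", whereas [[ ]] uses the string stored in the argument.

```r
# Hypothetical variant (myCortestDF is an illustrative name):
# same idea as myCortest, but returns one data frame.
myCortestDF <- function(dat, mainVar, varlist) {
  rows <- lapply(varlist, function(x) {
    # dat[[mainVar]] selects the column whose name is stored in mainVar
    ct <- cor.test(dat[[mainVar]], dat[[x]])
    data.frame(var1 = mainVar, var2 = x,
               R = unname(ct$estimate),  # correlation coefficient
               p = ct$p.value,           # corresponding p-value
               row.names = x)
  })
  # Stack the single-row data frames into one data frame
  do.call(rbind, rows)
}

res <- myCortestDF(iris, "Sepal.Length", c("Petal.Length", "Sepal.Width"))
print(res)
```

The R column should hold the same coefficients as the R.Sq column in the list output above, just collected into one table with one row per variable in varlist.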

Upvotes: 1
