Reputation: 357
I have created a matrix in R and I want to investigate the correlation between two columns. My_matrix is:
         speed motor rpm acceleration age
cadillac     3        42           67  22
porche       5        40           68  21
ferrari      7        37           69  20
peugeot     10        32           70  19
kia         12        28           71  18
When I try cor(speed~age, data=My_matrix), I get the following error:
Error in cor(speed ~ age, data = a) : unused argument (data = My_matrix)
Any idea how I can address this? Thanks.
Upvotes: 1
Views: 353
Reputation: 437
There are some great base R solutions on here already (hats off to @akrun & @Debutant, base R is great!). I would like to add a few alternatives for future viewers who prefer a different coding style.
If you don't like typing quote marks and the dataset is small enough, column numbers can be faster to type, although quoted column names are safer (especially if the columns are ever reordered).
@mikey offered a column-number solution in the comments. Here is an alternate version:
cor(My_matrix[,c(1,4)])
If your data is a data frame instead of a matrix, you might enjoy a tidyverse approach, which also does not require quotation marks (although pesky variables with spaces in their names may need to be wrapped in backticks):
library(dplyr)
My_dataframe %>% select(speed, age) %>% cor()
@Debutant only asked for the correlation between 2 variables, but if we want to go all out and get the full correlation matrix, here are additional options:
# assuming all your columns are numeric as they are here
cor(My_matrix)
# if you have a dataframe with different data types, select only the numeric ones
library(dplyr)
My_dataframe %>% select_if(is.numeric) %>% cor()
# if you don't like the long decimals, toss in a round() for good measure
My_dataframe %>% select_if(is.numeric) %>% cor() %>% round(3)
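For anyone who would rather not load dplyr, the same numeric-only selection can be sketched in base R. This is only an illustration under assumptions: the `My_dataframe` below is a hypothetical data frame built from the question's values, with an invented `brand` column and the `motor_rpm` spelling chosen to avoid a space in the name.

```r
# Hypothetical data frame mirroring the question's data,
# plus one non-numeric column (brand) that must be excluded
My_dataframe <- data.frame(
  brand        = c("cadillac", "porche", "ferrari", "peugeot", "kia"),
  speed        = c(3, 5, 7, 10, 12),
  motor_rpm    = c(42, 40, 37, 32, 28),
  acceleration = c(67, 68, 69, 70, 71),
  age          = c(22, 21, 20, 19, 18)
)

# TRUE for each numeric column, FALSE otherwise
is_num <- vapply(My_dataframe, is.numeric, logical(1))

# Correlation matrix of the numeric columns only, rounded to 3 decimals
cor_mat <- round(cor(My_dataframe[, is_num]), 3)
cor_mat
```

`vapply()` is used rather than `sapply()` so the result is guaranteed to be a logical vector.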
Hope you find this useful. :)
Upvotes: 0
Reputation: 357
I also tried this and it worked. I converted the matrix to a data frame "b":
b=as.data.frame(My_matrix)
Then I ran
cor(b$speed, b$age)
and got the correlation.
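A self-contained sketch of that approach, in case it helps others reproduce it. The matrix is rebuilt from the values in the question; the column name `motor_rpm` is an assumption made here to avoid a space in the name.

```r
# Rebuild the matrix from the question (motor_rpm spelling is an assumption)
My_matrix <- matrix(
  c( 3, 42, 67, 22,
     5, 40, 68, 21,
     7, 37, 69, 20,
    10, 32, 70, 19,
    12, 28, 71, 18),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("cadillac", "porche", "ferrari", "peugeot", "kia"),
    c("speed", "motor_rpm", "acceleration", "age")
  )
)

# The $ operator does not work on a matrix, so convert to a data frame first
b <- as.data.frame(My_matrix)
r_df <- cor(b$speed, b$age)

# The same value can be obtained from the matrix by indexing columns by name
r_mat <- cor(My_matrix[, "speed"], My_matrix[, "age"])

r_df
r_mat
```

Both calls pass two numeric vectors as `x` and `y`, so `cor` returns a single number rather than a 2 x 2 matrix.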
Upvotes: 1
Reputation: 886938
We can subset the columns and apply cor directly. The usage of cor is
cor(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))
There is no formula method, so the data argument is rejected as unused. Subsetting by column name works instead:
cor(My_matrix[,c("speed", "age")])
#            speed        age
# speed  1.0000000 -0.9971765
# age   -0.9971765  1.0000000
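If a formula-style call is preferred, one option worth knowing is `cor.test` from the stats package, which does accept a one-sided formula of the form `~ u + v` together with a `data` argument. A minimal sketch, assuming the two columns are copied into a small data frame `b` (a name chosen here for illustration):

```r
# The two columns of interest, taken from the question's data
b <- data.frame(
  speed = c(3, 5, 7, 10, 12),
  age   = c(22, 21, 20, 19, 18)
)

# cor.test accepts a one-sided formula, unlike cor
ct <- cor.test(~ speed + age, data = b)

ct$estimate  # Pearson correlation coefficient
ct$p.value   # p-value for the test of zero correlation
```

Note that `cor.test` also runs a significance test, so it returns more than the coefficient alone; `ct$estimate` holds the same value that `cor` reports.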
Upvotes: 1