Reputation: 145
I would like to correlate two variables and have the output reported separately for levels of a third variable.
My data are similar to this example:
var1 <- c(7, 8, 9, 10, 11, 12)
var2 <- c(18, 17, 16, 15, 14, 13)
categories <- c(1, 2, 3, 1, 2, 3)
And I want to correlate var1 with var2 within the categories, such that the results would show the correlation of the values of var1 and var2 for category 1 separately from category 2 and category 3.
In SAS, I would do:
PROC CORR DATA=x;
BY CATEGORY;
VAR VAR1;
WITH VAR2;
RUN;
Upvotes: 2
Views: 1256
Reputation: 887118
You could also use by():
sapply(by(cbind(var1, var2), categories, FUN=cor),`[`,2)
#1 2 3
#-1 -1 -1
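To see why indexing with `[` and 2 works, it may help to look at the intermediate result. The following is a minimal sketch using the question's data; the names mats and res are only illustrative. by() returns one 2 x 2 correlation matrix per category, and element 2 (in column-major order) is the off-diagonal entry, that is, the var1-var2 correlation.

```r
var1 <- c(7, 8, 9, 10, 11, 12)
var2 <- c(18, 17, 16, 15, 14, 13)
categories <- c(1, 2, 3, 1, 2, 3)

# by() splits the rows by category and applies cor() to each piece,
# producing one 2 x 2 correlation matrix per category.
mats <- by(cbind(var1, var2), categories, FUN = cor)

# Pick the off-diagonal entry explicitly instead of by linear index 2.
# The matrix is symmetric, so [1, 2] and [2, 1] hold the same value.
res <- sapply(mats, function(m) m[1, 2])
print(res)
```

Indexing with m[1, 2] is a little more verbose than `[`, 2, but it makes it clearer which entry of the matrix is being extracted.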
Upvotes: 0
Reputation: 206232
You can put your records into a data.frame, split it by the categories, and then run the correlation for each category.
sapply(
split(data.frame(var1, var2), categories),
function(x) cor(x[[1]],x[[2]])
)
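If you would rather get a small table back than a named vector, the same split() approach can be wrapped in a data.frame. This is only a sketch in base R using the question's data; the names df, pieces and res are made up for the example.

```r
var1 <- c(7, 8, 9, 10, 11, 12)
var2 <- c(18, 17, 16, 15, 14, 13)
categories <- c(1, 2, 3, 1, 2, 3)

df <- data.frame(var1, var2, categories)

# One data.frame per category level.
pieces <- split(df, df$categories)

# One row per category: the level name and the var1-var2 correlation.
res <- data.frame(
  categories = names(pieces),
  cor = sapply(pieces, function(x) cor(x$var1, x$var2)),
  row.names = NULL
)
print(res)
```

Referring to the columns by name (x$var1, x$var2) instead of by position avoids surprises if the column order of the data.frame changes.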
This can look prettier with the dplyr library:
library(dplyr)
data.frame(var1=var1, var2=var2, categories=categories) %>%
group_by(categories) %>%
summarize(cor= cor(var1, var2))
Upvotes: 1