Reputation: 33
I have a data set and want to run a correlation between X and Y. However, I only want to look at X values that are greater than 1.
cor(Data$X, Data$Y, use = "complete.obs")
What argument do I add to run a correlation between X and Y only for the X values that are greater than 1?
Upvotes: 0
Views: 55
Reputation: 300
You can subset using the [ operator.
Try this:
# Generate Example Data
Data <- data.frame(X = seq(-5, 10, 1),
                   Y = sample(1:100, 16))
with(data = Data[Data$X > 1, ], cor(X, Y, use = "complete.obs"))
The [ operator lets us specify rows and columns in the style my.data.frame[rows, columns]. Here we are specifying that we want only rows where X > 1, but all columns. We could also do the following to ask for each column individually by name:
cor(Data[Data$X > 1, "X"], Data[Data$X > 1, "Y"], use = "complete.obs")
Or even the following to subset the column vectors:
cor(Data$X[Data$X > 1], Data$Y[Data$X > 1], use = "complete.obs")
Of course, these variants are only here to illustrate the flexibility of [. It's best to subset the whole data set once, so that X and Y are guaranteed to come from the same rows.
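As a rough sketch of the "subset once" advice, something like the following should work. The names Sub and keep, and the set.seed() call, are my own additions for illustration; the example data otherwise mirrors the one above. Wrapping the condition in which() is an optional extra that drops rows where X is NA rather than carrying them along as NA rows.

```r
# Example data as above; set.seed() is added only so the run is reproducible
set.seed(42)
Data <- data.frame(X = seq(-5, 10, 1),
                   Y = sample(1:100, 16))

# Subset the whole data frame once: keep rows where X > 1, all columns.
# which() turns the logical test into row indices and skips NA comparisons.
Sub <- Data[which(Data$X > 1), ]

# Correlation computed on the subset
r_once <- cor(Sub$X, Sub$Y, use = "complete.obs")

# The vector-level variant, with the condition stored once and reused
keep <- Data$X > 1
r_vec <- cor(Data$X[keep], Data$Y[keep], use = "complete.obs")

# Both routes should agree
all.equal(r_once, r_vec)
```

Storing the subset (or the logical condition) in one place means a later change to the threshold only has to be made once, which is the discrepancy the advice is guarding against.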
Upvotes: 2