Reputation: 641
I'm a beginner to R and am having trouble indexing into a dataframe using a vector of column values.
I want to select all the rows from 2 participants.
data is the data frame and participant is one of its columns.
data[data$participant == c(8, 10),]
I thought this should give me all the rows from both participants 8 and 10, but instead it is giving me half of the rows from participant 8 and half from participant 10. In other words,
dim(data[data$participant == c(8, 10),])
is the same as dim(data[data$participant == 8,])
or dim(data[data$participant == 10,])
rather than double.
The problem seems to be with the syntax for matching a column against multiple values:
data$participant == c(8, 10)
I'd be grateful for any tips on how to do this (without doing each participant separately)! Thank you!
Upvotes: 4
Views: 5377
Reputation: 887901
For multiple values, use %in% to get a logical vector.
data[data$participant %in% c(8, 10),]
When we use == with c(8, 10), R recycles the 8 and 10 (i.e. 8, 10, 8, 10, 8, 10, and so on) to the length of the 'participant' column and compares element by element. So, if the 1st value in participant is 8, it will return TRUE, but if the 2nd value is also 8, it will return FALSE, because the corresponding recycled element is 10.
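To make the recycling concrete, here is a minimal sketch on a made-up data frame. The values in participant and the score column are assumptions for illustration only, not the asker's data.

```r
# Hypothetical toy data: 4 rows each for participants 8 and 10 (values made up).
data <- data.frame(
  participant = c(8, 8, 8, 8, 10, 10, 10, 10),
  score       = 1:8
)

# What == actually compares against: c(8, 10) recycled to the column length.
recycled <- rep_len(c(8, 10), nrow(data))   # 8 10 8 10 8 10 8 10

# Element-wise comparison: only rows that happen to line up with the pattern match.
eq_mask <- data$participant == c(8, 10)

# Set membership: TRUE for every row whose participant is 8 or 10.
in_mask <- data$participant %in% c(8, 10)

identical(eq_mask, data$participant == recycled)  # TRUE: same comparison
nrow(data[eq_mask, ])  # 4, half of the rows
nrow(data[in_mask, ])  # 8, all rows for both participants
```

On this toy data, == keeps rows 1, 3, 6 and 8, which matches the "half of each participant" behaviour described in the question, while %in% keeps all eight rows.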
Upvotes: 5