Reputation: 119
This is probably a simple question, but I could not find a solution even after searching Q&A sites for quite a long time and reading all the cheat sheets I could find.
Let's say I have the following dataset:
participant <- c(1, 1, 2, 2, 3, 3, 4, 4)
trial <- c(1, 2, 2, 3, 4, 2, 3, 4)
page <- c(1, 2, 2, 5, 6, 2, 1, 2)
test <- data.frame(participant, trial, page)
I want to remove from my dataset specific trials and/or pages within trials, for specific participants.
So, for example, let's assume I want to remove from my dataset Trial 2 and Page 2 for Participant 1 only.
I tried this, but it removes the participant completely:
test <- dplyr::filter(test, participant != "1" & trial != "2" & page != "2")
How can I remove only values in relation to another value? Thanks!
Upvotes: 0
Views: 29
Reputation: 12155
dplyr::filter only keeps rows for which the provided condition is true. Your thinking was correct that a simple way to do this is to make a conditional statement that matches the row you want to remove, and then invert it to select the other rows. The problem is the way inverting == to != interacts with the AND operator &.
You give the condition participant != "1" & trial != "2" & page != "2", which is true only if ALL of the following conditions are true (since you used &):
participant != "1"
trial != "2"
page != "2"
So if a row fails even one of those criteria (for example, every row where participant == 1), it will be removed.
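To see this concretely, here is a minimal sketch in base R that evaluates the attempted condition on the data from the question. The names keep_attempt and kept_by_attempt are only illustrative and are not part of the original post.

```r
# Rebuild the example data from the question
participant <- c(1, 1, 2, 2, 3, 3, 4, 4)
trial <- c(1, 2, 2, 3, 4, 2, 3, 4)
page <- c(1, 2, 2, 5, 6, 2, 1, 2)
test <- data.frame(participant, trial, page)

# The attempted condition: a row is kept only if all three parts are TRUE
keep_attempt <- test$participant != 1 & test$trial != 2 & test$page != 2

# Rows that survive the attempt; any row with participant 1, trial 2,
# or page 2 is dropped, not just the one combination of all three
kept_by_attempt <- test[keep_attempt, ]
print(kept_by_attempt)
```

Only the rows that have none of the three values remain, which is why participant 1 disappears entirely.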
What you want to do is make a conditional statement that matches the rows you want to remove, and then invert it by wrapping the entire statement in parentheses and putting the NOT operator ! in front:
dplyr::filter(test, !(participant == 1 & trial == 2 & page == 2))
  participant trial page
1           1     1    1
2           2     2    2
3           2     3    5
4           3     4    6
5           3     2    2
6           4     3    1
7           4     4    2
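If you prefer to keep the != form, De Morgan's law says that negating an AND is the same as OR-ing the negated parts. The sketch below, assuming dplyr is installed, compares the two forms on the data from the question; the names negated and or_form are illustrative only.

```r
library(dplyr)

# Rebuild the example data from the question
participant <- c(1, 1, 2, 2, 3, 3, 4, 4)
trial <- c(1, 2, 2, 3, 4, 2, 3, 4)
page <- c(1, 2, 2, 5, 6, 2, 1, 2)
test <- data.frame(participant, trial, page)

# Form 1: match the row to remove, then negate the whole condition
negated <- filter(test, !(participant == 1 & trial == 2 & page == 2))

# Form 2: De Morgan's law, negate each part and join with OR (|)
or_form <- filter(test, participant != 1 | trial != 2 | page != 2)

# Check that both forms keep the same rows
print(all.equal(negated, or_form))
```

The negated form is usually easier to adapt: for example, !(participant == 1 & trial == 2) removes every page of trial 2 for participant 1.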
Upvotes: 1