Reputation: 11
In R, I would like to remove all rows that are followed by a row with the same entry. I have the following column in a data frame:
"FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "FAILURE" "HIT" "FAILURE" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "FAILURE" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "FAILURE" "FAILURE" "HIT"
I would like to delete all rows in which the "FAILURE" entry is followed by another "FAILURE" entry. So I would like to get the following column of the data frame back:
"FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT"
How can I check if the next row meets that condition and remove those rows?
Upvotes: 0
Views: 61
Reputation: 39647
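Every element that equals "FAILURE" and whose diff against the previous element is 0 (meaning the previous element is also "FAILURE") is removed. This keeps the first "FAILURE" of each run: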
. <- x == "FAILURE"
(. <- x[!(. & diff(c(FALSE, .)) == 0)])
# [1] "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE"
# [8] "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT"
#[15] "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE"
#[22] "HIT" "FAILURE" "HIT" "FAILURE" "HIT" "FAILURE" "HIT"
identical(., y)
#[1] TRUE
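Since the question is about rows of a data frame, here is a minimal sketch of the same idea applied with row indexing. The data frame `df` and its columns `id` and `status` are made up for illustration and are not from the question. For a bare vector it makes no difference which "FAILURE" of a run survives, but for a data frame with other columns it does, so both variants are shown.

```r
# Hypothetical data frame; `id` and `status` are illustrative names only.
df <- data.frame(
  id = 1:8,
  status = c("FAILURE", "HIT", "FAILURE", "FAILURE", "HIT",
             "FAILURE", "FAILURE", "FAILURE"),
  stringsAsFactors = FALSE
)

is_fail <- df$status == "FAILURE"

# TRUE where the previous row is also a FAILURE (row 1 has no predecessor).
prev_fail <- c(FALSE, head(is_fail, -1))
# TRUE where the next row is also a FAILURE (the last row has no successor).
next_fail <- c(tail(is_fail, -1), FALSE)

# Keep the first FAILURE of each run (the same rows the diff() approach keeps).
keep_first <- df[!(is_fail & prev_fail), ]

# Keep the last FAILURE of each run (drops rows that are followed by a FAILURE).
keep_last <- df[!(is_fail & next_fail), ]

keep_first$id
keep_last$id
```

Both results have the same `status` column; they only differ in which row of each run is retained, so pick the one that matches what the other columns should look like.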
Data:
x <- c("FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "FAILURE", "HIT", "FAILURE", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "FAILURE", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "FAILURE", "FAILURE", "HIT")
y <- c("FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT", "FAILURE", "HIT")
Upvotes: 1