Reputation: 1
I have a dataset:
x y
A1 start
A2 ID
A3 delete
A4 start
A5 ID
A6 delete
A7 ID
A8 delete
A9 start
A10 ID
A11 delete
A12 delete
A13 start
A14 ID
A15 start
A16 delete
A17 ID
A18 delete
A19 delete
As you see there is a connection in y column: "delete" comes after "ID". However there are exceptions: in A12 "delete" comes after "delete" and in A16 "delete" comes after "start" and in A19 "delete" comes after "delete". How could i subset only these"deletes" which don't come after "ID". So desired result is:
x y
A12 delete
A16 delete
A19 delete
Upvotes: 1
Views: 48
Reputation: 39657
In base you can subset with a boolean vector indicating when a delete did not come after ID.
z[c(FALSE, z$y[-nrow(z)] != "ID" & z$y[-1] == "delete"),]
# x y
#12 A12 delete
#16 A16 delete
#19 A19 delete
Upvotes: 3
Reputation: 388982
You can use lag
in dplyr
:
library(dplyr)
df %>% filter(y == 'delete' & lag(y) != 'ID')
# x y
#1 A12 delete
#2 A16 delete
#3 A19 delete
and equivalent in data.table
:
library(data.table)
setDT(df)[y == 'delete' & shift(y) != 'ID']
Upvotes: 2