Comparing Current Against Previous Values

Question

Back again with an issue that is simple on paper, but I am struggling with the implementation. For context, this looks at suspects and victims. What we want to check is whether a suspect's current victim is the same as their last victim. If the suspect's latest victim is different from the last, that is an entry we would want to flag and keep. If they are the same, we would remove the record.

So the comparison is:

Suspect A on Date 1 relates to Victim 1 = Suspect A on Date 2 relates to Victim 1 = Drop

Suspect B on Date 1 relates to Victim 2 = Suspect B on Date 2 relates to Victim 3 = Keep

Date	Suspect	Victim
15/01/2022	A	1
12/03/2022	A	1
19/02/2022	B	2
16/01/2022	B	3
08/03/2022	B	4
20/03/2022	B	5
25/01/2022	C	5
21/02/2022	D	6
10/01/2022	D	7

Assume this is my current data set. In this context, 'Suspect' should end up with only two entries, B and D, while A and C are removed.

I was thinking of doing an arrange on Date and Suspect, then lagging the comparison. But how does lag work when jumping between suspects? Can that be solved with a group variable? This is where I am stuck conceptualising it, and I fear removing things that should be included.

Any help, as always, is greatly appreciated.

r2evans · Accepted Answer

Try this:

dat %>%
  group_by(Suspect) %>%
  filter(n() > 1 & Victim != last(Victim))
# # A tibble: 4 x 3
# # Groups:   Suspect [2]
#   Date       Suspect Victim
# 1 19/02/2022 B            2
# 2 16/01/2022 B            3
# 3 08/03/2022 B            4
# 4 21/02/2022 D            6
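
On the lag question: a rough sketch, assuming the sample data from the question is rebuilt as a data frame with `Date` parsed as a real date (the `prev_victim` and `flagged` names are only for illustration). Inside `group_by(Suspect)`, `lag()` restarts for each suspect, so the first row of every group gets `NA` and is never compared against the previous suspect's victim. Note this is a slightly different rule from the filter above: it compares each row with the chronologically previous row for the same suspect, rather than with the group's last row, so the kept rows can differ even though only B and D remain in both cases.

```r
library(dplyr)

# Sample data from the question, with Date parsed so it can be sorted in time
dat <- data.frame(
  Date    = as.Date(c("15/01/2022", "12/03/2022", "19/02/2022", "16/01/2022",
                      "08/03/2022", "20/03/2022", "25/01/2022", "21/02/2022",
                      "10/01/2022"), format = "%d/%m/%Y"),
  Suspect = c("A", "A", "B", "B", "B", "B", "C", "D", "D"),
  Victim  = c(1L, 1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L)
)

flagged <- dat %>%
  arrange(Suspect, Date) %>%               # chronological order within suspect
  group_by(Suspect) %>%
  mutate(prev_victim = lag(Victim)) %>%    # NA on each suspect's first row
  filter(!is.na(prev_victim), Victim != prev_victim) %>%
  ungroup()

flagged
```

The `last(Victim)` version above depends on the row order of the data, so it may be worth running `arrange(Suspect, Date)` first if "last" is meant chronologically.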
