Reputation: 89
I have data where the 'Law' variable indicates changes in legislation, in different places ('Place'):
Person Place Year Law
1 A 1990 0
2 A 1991 1
3 A 1992 1
4 B 1990 0
5 B 1991 0
6 B 1992 1
7 B 1993 1
8 B 1993 1
9 B 1993 1
10 B 1992 1
Basically the law was implemented in place A in 1991 and remained in force for all subsequent time periods. It was implemented in place B in 1992 and remained in force, & so on.
I would like to create a new variable that takes on a value of 0 for the year the law was implemented, 1 for 1 year after, 2 for 2 years after, -1 for the year before, -2 for 2 years before, and so on.
I need the final dataframe to look like:
Person Place Year Law timeline
1 A 1990 0 -1
2 A 1991 1 0
3 A 1992 1 1
4 B 1990 0 -2
5 B 1991 0 -1
6 B 1992 1 0
7 B 1993 1 1
8 B 1993 1 2
9 B 1993 1 2
10 B 1992 1 1
I have tried:
library(dplyr)
df %>%
group_by(Place) %>%
arrange(Year) %>%
mutate(timeline = rank(Law))
but it's not working like I need. What am I doing wrong? Can I do this in dplyr or do I need to create a complex for loop?
Upvotes: 3
Views: 171
Reputation: 79288
using data.table
library(data.table)
setDT(dat)[,timeline:=sequence(.N)-which.min(!Law),by=Place]
dat
Person Place Year Law timeline
1: 1 A 1990 0 -1
2: 2 A 1991 1 0
3: 3 A 1992 1 1
4: 4 B 1990 0 -2
5: 5 B 1991 0 -1
6: 6 B 1992 1 0
7: 7 B 1993 1 1
Using base r:
transform(dat,timeline=ave(Law,Place,FUN=function(x)1:length(x)-which.min(!x)))
Person Place Year Law timeline
1 1 A 1990 0 -1
2 2 A 1991 1 0
3 3 A 1992 1 1
4 4 B 1990 0 -2
5 5 B 1991 0 -1
6 6 B 1992 1 0
7 7 B 1993 1 1
Upvotes: 0
Reputation: 215047
You can subtract the row_numer
by the index where the Law
is implemented:
df %>%
arrange(Year) %>%
group_by(Place) %>%
mutate(timeline = row_number() - which(diff(Law) == 1) - 1) %>%
arrange(Place)
# A tibble: 7 x 5
# Groups: Place [2]
# Person Place Year Law timeline
# <int> <fct> <int> <int> <dbl>
#1 1 A 1990 0 -1.
#2 2 A 1991 1 0.
#3 3 A 1992 1 1.
#4 4 B 1990 0 -2.
#5 5 B 1991 0 -1.
#6 6 B 1992 1 0.
#7 7 B 1993 1 1.
Upvotes: 1