AntVal

Reputation: 665

Mutate column based on any lagged value of other column in R

I'm having trouble with what I thought would be a pretty straightforward data transformation task. I have a data frame that looks something like this:

df
  council_name year treat
1    Southwark 2008     1
2    Southwark 2009     0
3    Southwark 2010     1
4      Lambeth 2006     0
5      Lambeth 2007     1
6      Lambeth 2008     0
7    Yorkshire 2006     0
8    Yorkshire 2007     0
9    Yorkshire 2008     0

I'm trying to create a new variable, say pre.post, that takes the value 1 if the council has treat == 1 in that year or in any earlier year, and 0 otherwise.

This is what I'm looking for:

df.desired
  council_name year treat pre.post
1    Southwark 2008     1        1
2    Southwark 2009     0        1
3    Southwark 2010     1        1
4      Lambeth 2006     0        0
5      Lambeth 2007     1        1
6      Lambeth 2008     0        1
7    Yorkshire 2006     0        0
8    Yorkshire 2007     0        0
9    Yorkshire 2008     0        0

That is, every council that had treat == 1 at any earlier point gets pre.post == 1 from then on. I tried different things, such as:

library(dplyr)

df%>%
group_by(council_name)%>%
arrange(year)%>%
mutate(pre.post = ifelse(any(lag(year) = 1), 1, 0))

But nothing seems to get exactly what I'm looking for. Thanks!

Upvotes: 0

Views: 217

Answers (1)

ekoam

Reputation: 8844

Equivalently, find the first treatment year within each council and assign 1 to that year and every year after it.

df %>% group_by(council_name) %>% mutate(pre.post = +(year >= min(year[treat == 1])))

Output

# A tibble: 9 x 4
# Groups:   council_name [3]
  council_name  year treat pre.post
  <chr>        <int> <int>    <int>
1 Southwark     2008     1        1
2 Southwark     2009     0        1
3 Southwark     2010     1        1
4 Lambeth       2006     0        0
5 Lambeth       2007     1        1
6 Lambeth       2008     0        1
7 Yorkshire     2006     0        0
8 Yorkshire     2007     0        0
9 Yorkshire     2008     0        0
Warning messages:
1: Problem with `mutate()` input `pre.post`.
i no non-missing arguments to min; returning Inf
i Input `pre.post` is `+(year >= min(year[treat == 1]))`.
i The error occurred in group 3: council_name = "Yorkshire". 
2: In min(year[treat == 1]) :
  no non-missing arguments to min; returning Inf

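We get this warning because Yorkshire has no rows with treat == 1, so year[treat == 1] is empty and min() of an empty vector returns Inf. IMO, you can just ignore it: year >= Inf is FALSE for every row, so pre.post is 0 for councils that were never treated, which is exactly what we want.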
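If you would rather avoid the warning altogether, one possible alternative is a running maximum of treat within each council. This is only a sketch, not the approach shown above: it assumes treat is coded 0/1 with no missing values, and it rebuilds df from the question's data. Sorting by year inside each group matters, because cummax() works in row order.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  council_name = rep(c("Southwark", "Lambeth", "Yorkshire"), each = 3),
  year  = c(2008, 2009, 2010, 2006, 2007, 2008, 2006, 2007, 2008),
  treat = c(1, 0, 1, 0, 1, 0, 0, 0, 0)
)

res <- df %>%
  group_by(council_name) %>%
  arrange(year, .by_group = TRUE) %>%  # sort years within each council
  mutate(pre.post = cummax(treat)) %>% # stays at 1 once treat has been 1
  ungroup()

print(res)
```

Note that arrange(..., .by_group = TRUE) also reorders the councils alphabetically, so the rows come out in a different order than in the original df.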

Upvotes: 2
