Reputation: 1424
Title might be misleading.
I have a longitudinal dataset with a dummy variable (dummy1) indicating whether a condition is met in a certain year for a given category. I want this event to be taken into account for the next twenty years as well. Hence, I want to create a new dummy (dummy2), which takes the value 1 for the observation where dummy1 was 1 and for the 19 observations following it (example below).
I tried to write a loop with lag operators, but have not gotten it to work so far.
Upvotes: 0
Views: 696
Reputation: 37208
Even code that failed might be close to a good solution. Not giving code that failed means that we can't explain your mistakes. Furthermore, questions focusing on how to use software to do something are widely considered marginal or off-topic on SO.
One approach is
bysort category (year) : gen previous = year if dummy1
by category : replace previous = previous[_n-1] if missing(previous)
gen byte dummy2 = (year - previous) < 20
The trick here is to create a variable holding the last year that the dummy (indicator) was 1, and the trick in that is spelled out in the Stata FAQ "How can I replace missing values with previous or following nonmissing values or within sequences?"
Note that this works independently of:
whether the panel identifier is numeric (it could be string here, on the evidence given)
whether you have tsset or xtset the data
what happens before the first event; for such years, previous is born missing and remains missing, so the comparison (year - previous) < 20 is false and dummy2 is 0 (however, in general, watch for problems with code at the ends of time series).
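As a quick check of the points above, here is a minimal sketch on made-up data. The single category "A", the years 1990 to 2029 and the event in 1995 are assumptions for illustration only; the variable names follow the question. The identifier is deliberately a string and the data are never tsset or xtset.

```stata
* Toy data (hypothetical): one string-identified category, years 1990-2029
clear
set obs 40
gen str1 category = "A"
gen year = 1989 + _n
gen byte dummy1 = (year == 1995)   // the event happens once, in 1995

* Same three steps as in the answer
bysort category (year) : gen previous = year if dummy1
by category : replace previous = previous[_n-1] if missing(previous)
gen byte dummy2 = (year - previous) < 20

* dummy2 should be 1 for 1995-2014 only, i.e. 20 observations
count if dummy2
assert r(N) == 20
assert dummy2 == 0 if year < 1995 | year > 2014

* Inspect both edges of the window
list year dummy1 previous dummy2 if inrange(year, 1993, 1996) | inrange(year, 2013, 2016), noobs
```

In the years before 1995, previous stays missing and dummy2 is 0. If dummy1 is 1 again within a category, previous is reset to that later year, so the 20-year window restarts from the most recent event.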
Upvotes: 2