Reputation: 915
I have variables id and D. D is either 1 or 0.
I want to calculate "Consecutive Ones" (CO) for each id.
Consecutive Ones is the running count of consecutive 1s in D within each id; it resets to 0 whenever D is 0. The CO column below shows the result I want.
id year D CO
1 1990 1 1
1 1991 1 2
1 1992 0 0
1 1993 0 0
1 1994 1 1
1 1995 0 0
1 1996 1 1
1 1997 1 2
2 1990 1 1
2 1991 0 0
2 1992 0 0
2 1993 1 1
2 1994 1 2
2 1995 1 3
I made a running sum in the hope that this might be a stepping stone.
bysort id (year): gen runningsumD=sum(D)
Then I also tried
bysort id (year): replace CO=D[_n-1]+D if D!=0
But this again didn't give me what I wanted.
Upvotes: 0
Views: 972
Reputation: 37358
There is now substantial discussion of similar problems in Stata, both on Statalist and in the Stata Journal. It helps to know a few keywords for search, such as spells or runs: in your case the spells or runs of interest are defined by consecutive values of 1.
The condition for a spell to start in this question is thus that the value of interest is 1 and that either the previous value was 0 or it's the start of a panel. (The second possibility is easy to overlook in coding.) That joint condition gives you an indicator variable which is 1 at the start of a spell and 0 otherwise. Then what you want is to bump up that indicator while observations are in the same spell.
Here's sample code and results with your data example:
clear
input id year D CO
1 1990 1 1
1 1991 1 2
1 1992 0 0
1 1993 0 0
1 1994 1 1
1 1995 0 0
1 1996 1 1
1 1997 1 2
2 1990 1 1
2 1991 0 0
2 1992 0 0
2 1993 1 1
2 1994 1 2
2 1995 1 3
end
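* Flag the start of each spell of 1s: D is 1 and either this is the first
* observation in the panel or the previous value of D was 0
bysort id (year) : gen wanted = D == 1 & (_n == 1 | D[_n-1] == 0)
* Later observations in a spell are one more than the observation before
by id: replace wanted = wanted[_n-1] + 1 if D == 1 & wanted == 0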
list, sepby(id)
+-----------------------------+
| id year D CO wanted |
|-----------------------------|
1. | 1 1990 1 1 1 |
2. | 1 1991 1 2 2 |
3. | 1 1992 0 0 0 |
4. | 1 1993 0 0 0 |
5. | 1 1994 1 1 1 |
6. | 1 1995 0 0 0 |
7. | 1 1996 1 1 1 |
8. | 1 1997 1 2 2 |
|-----------------------------|
9. | 2 1990 1 1 1 |
10. | 2 1991 0 0 0 |
11. | 2 1992 0 0 0 |
12. | 2 1993 1 1 1 |
13. | 2 1994 1 2 2 |
14. | 2 1995 1 3 3 |
+-----------------------------+
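The running sum tried in the question can also be turned into a solution. What follows is only a sketch, assuming D takes just the values 0 and 1 with no missing values; the variable names block and wanted2 are made up for illustration. The idea is that a running count of zeros changes only when D is 0, so it stays constant across each run of 1s, and a running sum of D within each such block restarts the count after every 0.

```stata
clear
input id year D CO
1 1990 1 1
1 1991 1 2
1 1992 0 0
1 1993 0 0
1 1994 1 1
1 1995 0 0
1 1996 1 1
1 1997 1 2
2 1990 1 1
2 1991 0 0
2 1992 0 0
2 1993 1 1
2 1994 1 2
2 1995 1 3
end

* block (hypothetical name): number of zeros seen so far within each id
bysort id (year) : gen block = sum(D == 0)

* within each id and block, the running sum of D counts consecutive 1s;
* the 0 that opens a block contributes nothing to the sum
bysort id block (year) : gen wanted2 = sum(D)

* check against the desired result given in the question
assert wanted2 == CO
```

This needs no lagged references, but it leans on D being strictly 0 or 1; the indicator approach above generalises more easily to other spell definitions.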
A reading and program list might include
SJ-15-1 dm0079 . . . . . . . . . . . . . . . Stata tip 123: Spell boundaries
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q1/15 SJ 15(1):319--323 (no commands)
shows how to identify spells
SJ-7-2 dm0029 . . . . . . . . . . . . . . Speaking Stata: Identifying spells
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q2/07 SJ 7(2):249--265 (no commands)
shows how to handle spells with complete control over
spell specification
A .pdf of the last-mentioned paper is freely available at http://www.stata-journal.com/sjpdf.html?articlenum=dm0029
A .pdf of the first-mentioned paper will become freely available on publication of Stata Journal 18(1).
tsspell (SSC) is a basic tool using the principles described in the 2007 paper just cited. tsspell thus gives you an otherwise unpredictable search term for searching Statalist discussions.
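As a rough sketch of how tsspell might be applied to this question: it expects tsset data and, given a condition defining a spell, creates by default the variables _spell, _seq and _end. Here _seq, the position of each observation within its spell (0 outside spells), should match the desired CO. This assumes tsspell has been installed from SSC and that there are no gaps in year within panels; check the help file for details before relying on it.

```stata
* one-time installation from SSC (uncomment if needed)
* ssc install tsspell

clear
input id year D CO
1 1990 1 1
1 1991 1 2
1 1992 0 0
1 1993 0 0
1 1994 1 1
1 1995 0 0
1 1996 1 1
1 1997 1 2
2 1990 1 1
2 1991 0 0
2 1992 0 0
2 1993 1 1
2 1994 1 2
2 1995 1 3
end

* declare the panel structure, which tsspell requires
tsset id year

* a spell is any run of observations for which D == 1
tsspell, cond(D == 1)

* _seq numbers observations within each spell and is 0 outside spells
assert _seq == CO

list id year D CO _spell _seq _end, sepby(id)
```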
https://www.stata.com/support/faqs/data-management/identifying-runs-of-consecutive-observations/ is also relevant for related problems.
Upvotes: 2