Reputation: 1
How do I create a new custom week variable in Stata for a panel dataset? My date needs to start from week2 of 2019 (i.e. 2019w2) and end in week 27 for 2022 (i.e. 2022w27).
For e.g. 2019w2 to 2022w27 for id1 and 2019w2 to 2022w27 for id2 ...etc.
I tried using the following code, but this created weekyr
variable that ran continuously from 2019w2 onwards for id1 and id2. Therefore, what should be 2019w2 for id2 is 2022w28 instead.
gen weekyr=tw(2019w2)+_n-1
format weekyr %tw
Upvotes: 0
Views: 316
Reputation: 37278
ORIGINAL ANSWER, NOW SUPERSEDED
If (and only if) there were 52 weeks in 2019 and 2020 and 2021 by the definition of week you are using AND the data are balanced AND the data are already in the right sort order then you can ride on the back of Stata's definition of week.
gen long obsno = _n
bysort id (obsno) : gen week = tw(2019w1) + _n
format week %tw
If any of those conditions does not hold, then you need to say what definition of a week you are using, for example does it begin or end or a certain day of the week.
EDIT Demonstration that code produces an increasing sequence of dates:
clear
set obs 30
egen id = seq(), block(10)
gen long obsno = _n
bysort id (obsno) : gen week = tw(2019w1) + _n
format week %tw
list
+----------------------+
| id obsno week |
|----------------------|
1. | 1 1 2019w2 |
2. | 1 2 2019w3 |
3. | 1 3 2019w4 |
4. | 1 4 2019w5 |
5. | 1 5 2019w6 |
|----------------------|
6. | 1 6 2019w7 |
7. | 1 7 2019w8 |
8. | 1 8 2019w9 |
9. | 1 9 2019w10 |
10. | 1 10 2019w11 |
|----------------------|
11. | 2 11 2019w2 |
12. | 2 12 2019w3 |
13. | 2 13 2019w4 |
14. | 2 14 2019w5 |
15. | 2 15 2019w6 |
|----------------------|
16. | 2 16 2019w7 |
17. | 2 17 2019w8 |
18. | 2 18 2019w9 |
19. | 2 19 2019w10 |
20. | 2 20 2019w11 |
|----------------------|
21. | 3 21 2019w2 |
22. | 3 22 2019w3 |
23. | 3 23 2019w4 |
24. | 3 24 2019w5 |
25. | 3 25 2019w6 |
|----------------------|
26. | 3 26 2019w7 |
27. | 3 27 2019w8 |
28. | 3 28 2019w9 |
29. | 3 29 2019w10 |
30. | 3 30 2019w11 |
+----------------------+
EDIT 2 No definition of week has been supplied, so we need to make this a puzzle to be solved. A little exploration shows that there were 53 Wednesdays and Thursdays in 2020, and either each Wednesday or each Thursday fits as defining a week.
clear
set obs 181
gen date = mdy(1,2,2019) + 7 * _n
gen year = year(date)
gen work = 1
expand 3
sort work year date
egen id = seq(), to(3)
drop work
bysort id year (date) : gen week = cond(year == 2019, _n + 1, _n)
egen wdate = group(year week), label
l if id == 1 & (inrange(week, 1, 3) | inrange(week, 51, 53)), sepby(year)
+------------------------------------+
| date year id week wdate |
|------------------------------------|
1. | 21558 2019 1 2 2019 2 |
2. | 21565 2019 1 3 2019 3 |
50. | 21901 2019 1 51 2019 51 |
51. | 21908 2019 1 52 2019 52 |
|------------------------------------|
52. | 21915 2020 1 1 2020 1 |
53. | 21922 2020 1 2 2020 2 |
54. | 21929 2020 1 3 2020 3 |
102. | 22265 2020 1 51 2020 51 |
103. | 22272 2020 1 52 2020 52 |
104. | 22279 2020 1 53 2020 53 |
|------------------------------------|
105. | 22286 2021 1 1 2021 1 |
106. | 22293 2021 1 2 2021 2 |
107. | 22300 2021 1 3 2021 3 |
155. | 22636 2021 1 51 2021 51 |
156. | 22643 2021 1 52 2021 52 |
|------------------------------------|
157. | 22650 2022 1 1 2022 1 |
158. | 22657 2022 1 2 2022 2 |
159. | 22664 2022 1 3 2022 3 |
+------------------------------------+
This suggests a route for the OP:
gen long obsno = _n
bysort id (obsno) : gen date = mdy(1,2,2019) + 7 * _n
gen year = year(date)
bysort id year (date) : gen week = cond(year == 2019, _n + 1, _n)
egen wdate = group(year week), label
Upvotes: 0