k119919
k119919

Reputation: 1

Time variable for a panel dataset in Stata

How do I create a new custom week variable in Stata for a panel dataset? My date needs to start from week2 of 2019 (i.e. 2019w2) and end in week 27 for 2022 (i.e. 2022w27).

For e.g. 2019w2 to 2022w27 for id1 and 2019w2 to 2022w27 for id2 ...etc.

I tried using the following code, but this created weekyr variable that ran continuously from 2019w2 onwards for id1 and id2. Therefore, what should be 2019w2 for id2 is 2022w28 instead.

gen weekyr=tw(2019w2)+_n-1
format weekyr %tw

Upvotes: 0

Views: 316

Answers (1)

Nick Cox
Nick Cox

Reputation: 37278

ORIGINAL ANSWER, NOW SUPERSEDED

If (and only if) there were 52 weeks in 2019 and 2020 and 2021 by the definition of week you are using AND the data are balanced AND the data are already in the right sort order then you can ride on the back of Stata's definition of week.

gen long obsno = _n 

bysort id (obsno) : gen week = tw(2019w1) + _n 

format week %tw

If any of those conditions does not hold, then you need to say what definition of a week you are using, for example does it begin or end or a certain day of the week.

EDIT Demonstration that code produces an increasing sequence of dates:

clear 
set obs 30
egen id = seq(), block(10)
gen long obsno = _n
bysort id (obsno) : gen week = tw(2019w1) + _n 

format week %tw 

list 
     +----------------------+
     | id   obsno      week |
     |----------------------|
  1. |  1       1    2019w2 |
  2. |  1       2    2019w3 |
  3. |  1       3    2019w4 |
  4. |  1       4    2019w5 |
  5. |  1       5    2019w6 |
     |----------------------|
  6. |  1       6    2019w7 |
  7. |  1       7    2019w8 |
  8. |  1       8    2019w9 |
  9. |  1       9   2019w10 |
 10. |  1      10   2019w11 |
     |----------------------|
 11. |  2      11    2019w2 |
 12. |  2      12    2019w3 |
 13. |  2      13    2019w4 |
 14. |  2      14    2019w5 |
 15. |  2      15    2019w6 |
     |----------------------|
 16. |  2      16    2019w7 |
 17. |  2      17    2019w8 |
 18. |  2      18    2019w9 |
 19. |  2      19   2019w10 |
 20. |  2      20   2019w11 |
     |----------------------|
 21. |  3      21    2019w2 |
 22. |  3      22    2019w3 |
 23. |  3      23    2019w4 |
 24. |  3      24    2019w5 |
 25. |  3      25    2019w6 |
     |----------------------|
 26. |  3      26    2019w7 |
 27. |  3      27    2019w8 |
 28. |  3      28    2019w9 |
 29. |  3      29   2019w10 |
 30. |  3      30   2019w11 |
     +----------------------+

EDIT 2 No definition of week has been supplied, so we need to make this a puzzle to be solved. A little exploration shows that there were 53 Wednesdays and Thursdays in 2020, and either each Wednesday or each Thursday fits as defining a week.

clear 
set obs 181
gen date = mdy(1,2,2019) + 7 * _n
gen year = year(date)
gen work = 1
expand 3 
sort work year date 
egen id = seq(), to(3)
drop work 
bysort id year (date) : gen week = cond(year == 2019, _n + 1, _n) 
egen wdate = group(year week), label

l if id == 1 & (inrange(week, 1, 3) | inrange(week, 51, 53)), sepby(year)
     +------------------------------------+
     |  date   year   id   week     wdate |
     |------------------------------------|
  1. | 21558   2019    1      2    2019 2 |
  2. | 21565   2019    1      3    2019 3 |
 50. | 21901   2019    1     51   2019 51 |
 51. | 21908   2019    1     52   2019 52 |
     |------------------------------------|
 52. | 21915   2020    1      1    2020 1 |
 53. | 21922   2020    1      2    2020 2 |
 54. | 21929   2020    1      3    2020 3 |
102. | 22265   2020    1     51   2020 51 |
103. | 22272   2020    1     52   2020 52 |
104. | 22279   2020    1     53   2020 53 |
     |------------------------------------|
105. | 22286   2021    1      1    2021 1 |
106. | 22293   2021    1      2    2021 2 |
107. | 22300   2021    1      3    2021 3 |
155. | 22636   2021    1     51   2021 51 |
156. | 22643   2021    1     52   2021 52 |
     |------------------------------------|
157. | 22650   2022    1      1    2022 1 |
158. | 22657   2022    1      2    2022 2 |
159. | 22664   2022    1      3    2022 3 |
     +------------------------------------+

This suggests a route for the OP:

gen long obsno = _n 
bysort id (obsno) : gen date = mdy(1,2,2019) + 7 * _n
gen year = year(date)
bysort id year (date) : gen week = cond(year == 2019, _n + 1, _n) 
egen wdate = group(year week), label

Upvotes: 0

Related Questions