Reputation: 103
I have a data set that that has a date variable that is now in the form of a two digit alphabetic abbreviation for the month followed by the two digit numerical year, such as JA11 for January 2011. How can I convert this to something Stata can recognize as a month/year date?
Data Structure example:
. list id1 ins HPR if _n<20
+--------------------+
| id1 ins HPR |
|--------------------|
1. | 1 AP11 1 YES |
2. | 1 AU11 1 YES |
3. | 1 DE11 1 YES |
4. | 1 FE11 1 YES |
5. | 1 JA11 1 YES |
|--------------------|
6. | 1 JL11 1 YES |
7. | 1 JU11 1 YES |
8. | 1 MA11 1 YES |
9. | 1 MY11 1 YES |
10. | 1 NO11 1 YES |
|--------------------|
11. | 1 OC11 1 YES |
12. | 1 SE11 1 YES |
13. | 2 AP11 2 NO |
14. | 2 AU11 2 NO |
15. | 2 DE11 2 NO |
|--------------------|
16. | 2 FE11 2 NO |
17. | 2 JA11 2 NO |
18. | 2 JL11 2 NO |
19. | 2 JU11 2 NO |
+--------------------+
.
Thank you for your help.
Upvotes: 2
Views: 646
Reputation: 15458
Updated for your sample data:
gen month=substr(hp,1,2)
gen year=substr(hp,3,4)
replace year="2011" if year=="11" **you can omit this in your real data
This part is not required for the data you provide. But you can use in real data
local yr "03 04 05 06 07 08 09 10 11"
local newyr "2003 2004 2004 2006 2007 2008 2009 2010 2011"
local n: word count `yr'
forvalues i=1/`n'{
local yr`i' : word `i' of `yr'
local newyr`i' : word `i' of `newyr'
bys ids: replace year="`newyr`i''" if year=="`yr`i''"
local i=`i'+1
}
local mon "JA FE MA AP MY JU JL AU SE OC NO DE"
local newmon "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC"
local n: word count `mon'
forvalues i=1/`n'{
local mon`i' : word `i' of `mon'
local newmon`i' : word `i' of `newmon'
bys ids: replace month="`newmon`i''" if month=="`mon`i''"
local i=`i'+1
}
egen datevar=concat(month year),punct(" ")
generate datevar1=date(datevar,"MY")
format datevar1 %d
ids hp month year datevar datevar1
1 AP11 APR 2011 APR 2011 01apr2011
1 AU11 AUG 2011 AUG 2011 01aug2011
1 DE11 DEC 2011 DEC 2011 01dec2011
1 FE11 FEB 2011 FEB 2011 01feb2011
1 JA11 JAN 2011 JAN 2011 01jan2011
1 JL11 JUL 2011 JUL 2011 01jul2011
1 JU11 JUN 2011 JUN 2011 01jun2011
1 MA11 MAR 2011 MAR 2011 01mar2011
1 MY11 MAY 2011 MAY 2011 01may2011
1 NO11 NOV 2011 NOV 2011 01nov2011
1 OC11 OCT 2011 OCT 2011 01oct2011
1 SE11 SEP 2011 SEP 2011 01sep2011
2 AP11 APR 2011 APR 2011 01apr2011
2 AU11 AUG 2011 AUG 2011 01aug2011
2 DE11 DEC 2011 DEC 2011 01dec2011
2 FE11 FEB 2011 FEB 2011 01feb2011
2 JA11 JAN 2011 JAN 2011 01jan2011
2 JL11 JUL 2011 JUL 2011 01jul2011
2 JU11 JUN 2011 JUN 2011 01jun2011
Upvotes: 1
Reputation:
The date() function can convert virtually any date format into elapsed dates, which is the format Stata uses to manipulate date information. Elapsed dates are calculated as the number of days from January 1, 1960. This format is useful for adding or subtracting dates and changing the format of date variables.
. tostring datevar, replace format(%20.0f)
datevar was float now str8
. replace datevar = "0" + datevar if length(datevar) == 7
(2 real changes made)
. list
+----------+
| datevar |
|----------|
1. | 12031999 |
2. | 02081998 |
3. | 04071997 |
+----------+
. gen edatevar = date(datevar,"MDY")
. format edatevar %td
. list
+----------------------+
| datevar edatevar |
|----------------------|
1. | 12031999 03dec1999 |
2. | 02081998 08feb1998 |
3. | 04071997 07apr1997 |
+----------------------+
For more information go to: http://www.stata.com/support/faqs/data-management/handling-date-information/
Upvotes: 2