Reputation: 13
I would like to create, from start and end dates, dummy variables that take the value 1 if a given day falls within the range. For example, from
id start end
1 01072014 05072014
1 05012014 06012015
I would like to get
id start end d_01012014 d_02012014 d_03012014 ... d_01052014 ... d_31122014
1 01012014 02012014 1 1 0 0 0
1 01052014 02052015 0 0 0 1 0
so that I can eventually reshape my data to long format, dropping all observations outside the date range. My idea was to use a loop over Stata date values, something like this:
foreach i in *stataformat startdate*/*stataformat enddate* {
generate d_`i'=1 if `i'>=start & `i'<=end
}
But the problem with this method is that my variables would all have incomprehensible names. So can you either suggest another approach, or do you have an idea how to rename variables containing Stata date codes to 'understandable' names? Thanks a lot!
Upvotes: 0
Views: 2297
Reputation: 2694
If I wanted to do this from first principles, I would start with long-format data:
clear
input id spell str10 start str10 end
1 1 "01-07-2014" "05-07-2014"
1 2 "06-08-2014" "06-01-2015"
end
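* convert the day-month-year strings to Stata daily dates
gen start2 = date(start, "DMY")
gen end2 = date(end, "DMY")
format start2 end2 %td
* earliest start, and the number of days up to the latest end
summarize start2, meanonly
local min = r(min)
summarize end2, meanonly
local range = r(max) - `min' + 1
* one copy of each spell per calendar day in the overall range
expand `range'
bys id spell : gen date = `min' + _n - 1
format date %td
* keep only the days that fall inside each spell
keep if date >= start2 & date <= end2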
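If the wide layout with readable names is still wanted, one possible sketch is to loop over the numeric daily dates and build each variable name from the date's display format. The variable names start2 and end2 and the local macro names below are only illustrative, and the dates are assumed to be day-month-year as in the question. Note that this version sets the dummy to 0 outside the range rather than to missing.

```stata
* Sketch only: one dummy per calendar day, named d_DDMMYYYY
clear
input id spell str10 start str10 end
1 1 "01-07-2014" "05-07-2014"
1 2 "06-08-2014" "06-01-2015"
end

* numeric daily dates, assuming day-month-year strings
gen start2 = date(start, "DMY")
gen end2   = date(end, "DMY")
format start2 end2 %td

* first and last day covered by any spell
summarize start2, meanonly
local first = r(min)
summarize end2, meanonly
local last = r(max)

forvalues d = `first'/`last' {
    * turn the numeric date into text such as 01072014
    local name = string(`d', "%tdDDNNCCYY")
    * 1 if day `d' lies inside the spell, 0 otherwise
    generate byte d_`name' = inrange(`d', start2, end2)
}
```

This creates one variable per day between the earliest start and the latest end, so a long date range produces a very wide dataset; that is the main reason to prefer the long format.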
However, since this is probably survival analysis data, and you have already stset the dataset (or are going to), you can just use stsplit.
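An untested sketch of the stsplit route, under several assumptions: each spell is treated as its own subject via id(spell), analysis time is measured in days from start2, and no failure variable is supplied (so stset treats every record as ending in a failure). The new variable name day is made up for illustration; adapt the stset line to however your data are actually declared.

```stata
* Sketch only: split each spell into one-day episodes with stsplit
clear
input id spell str10 start str10 end
1 1 "01-07-2014" "05-07-2014"
1 2 "06-08-2014" "06-01-2015"
end
gen start2 = date(start, "DMY")
gen end2   = date(end, "DMY")
format start2 end2 %td

* declare survival-time data; analysis time runs from each spell's start
stset end2, id(spell) origin(time start2)

* cut every record at each whole unit (day) of analysis time
stsplit day, every(1)

* assuming day holds the lower bound of each interval in days since
* start2, this recovers the calendar date at which the interval opens
generate date = start2 + day
format date %td
```

Because stsplit works with intervals of analysis time rather than inclusive day counts, the number of records per spell may differ by one from the expand approach, so check the result against a spell whose length you know.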
Upvotes: 5