Reputation: 495
I am looking to create a variable (COUNT_WITHIN_2_DAYS
) that is the count of observations in a date range. The range would be the stated date +/- 2 days.
ID# DATE COUNT_WITHIN_2_DAYS
1 1/1/2000 3
2 1/2/2000 4
5 1/2/2000 4
6 1/4/2000 3
9 1/12/2000 1
11 3/1/2001 2
12 3/3/2000 4
16 3/4/2000 3
18 3/5/2000 3
21 3/8/2000 1
I tried an egen
with an if
command but that did not work
Upvotes: 0
Views: 1107
Reputation: 11102
This solution involves looping over all observations and uses the fact that you can compute with dates as you would with integers, after giving the date variable a date format. (These dates are just integers. See e.g. help datetime
for details).
Every observation is compared with the one observation that is fixed by the loop. ind == 1
for those observations in which the absolute difference is <= 2
. Afterwards, I use summarize
to add-up
the number of such observations. Finally, the result of the sum is assigned to the fixed observation using the variable counter
. The procedure is repeated until each observation has been fixed.
clear all
set more off
*-------------- example data ----------------
input ///
id str15 date
1 "1/1/2000"
2 "1/2/2000"
5 "1/2/2000"
6 "1/4/2000"
9 "1/12/2000"
11 "3/1/2001"
12 "3/3/2000"
16 "3/4/2000"
18 "3/5/2000"
21 "3/8/2000"
end
gen date2 = date(date, "MDY")
format date2 %td
drop date
rename date2 date
sort date
list
*------------ what you want ------------------
gen counter = .
gen ind = .
forvalues i = 1/`=_N' {
replace ind = abs(date - date[`i']) <= 2
summarize ind, meanonly
replace counter = r(sum) in `i'
}
drop ind
list
Upvotes: 2