Reputation: 529
I would like to randomly pick 3 years in a subset of my data, say from 2007 to 2016, excluding 2008, 2012 and 2014. I want to repeat this process 500 times.
How can I do this simulation while meeting the required conditions?
Note that this is a follow-up question to a previous post of mine, where I was offered a solution for the unconditional case.
Upvotes: 1
Views: 164
Reputation: 599
The sample command might be a good choice for this issue:
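// flag the years that may be drawn, then keep only those
generate tag = !inlist(year, 2008, 2012, 2014)
keep if tag
// tag is now constant, so the data form one cluster; make 500 copies, indexed by ex
expandcl 500, generate(ex) cluster(tag)
set seed 582019
// with the count option, draw 3 observations (not 3 percent) within each copy
sample 3, by(ex) count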
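Two caveats may matter here: sample draws observations rather than distinct years, and the snippet above does not itself restrict the data to the year window from the question. The following is a minimal sketch of one way to handle both, not a definitive implementation. It assumes the uslifeexp example dataset and the 1946 to 1957 window as stand-ins for the real data; the names one and rep are hypothetical.

```stata
// Sketch under stated assumptions: uslifeexp stands in for the real data,
// and 1946-1957 (minus 1948, 1952, 1954) stands in for the real window.
sysuse uslifeexp, clear
keep if inrange(year, 1946, 1957) & !inlist(year, 1948, 1952, 1954)

// Reduce to one row per distinct year, so that drawing 3 observations
// is the same as drawing 3 different years.
keep year
duplicates drop

// Treat all remaining rows as a single cluster and make 500 copies of it;
// rep (hypothetical name) identifies the copy.
generate byte one = 1
expandcl 500, generate(rep) cluster(one)

set seed 582019
sample 3, by(rep) count

// Sanity checks: exactly 3 rows per replication, no excluded years.
bysort rep: assert _N == 3
assert !inlist(year, 1948, 1952, 1954)
```

If the real data already have exactly one row per year, the keep year and duplicates drop steps change nothing and can be omitted.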
Upvotes: 0
Reputation:
The easiest way is to first subset your data:
sysuse uslifeexp, clear
set seed 12345
// preserve
keep if year >= 1946 & year <= 1957
drop if inlist(year, 1948, 1952, 1954)
tempname sim
postfile `sim' id year1 year2 year3 using results, replace
forvalues i = 1/500 {
    generate random = runiform()
    sort random
    post `sim' (`i') (year[1]) (year[2]) (year[3])
    drop random
}
postclose `sim'
// restore
Note the commented-out preserve / restore commands: uncomment them to get your original data back after the simulation instead of being left with only the reduced dataset.
As before, the results are stored in a new dataset, results:
use results, clear
list in 1/10
     +----------------------------+
     | id   year1   year2   year3 |
     |----------------------------|
  1. |  1    1955    1953    1946 |
  2. |  2    1953    1946    1949 |
  3. |  3    1949    1953    1946 |
  4. |  4    1949    1957    1956 |
  5. |  5    1946    1951    1950 |
     |----------------------------|
  6. |  6    1953    1946    1951 |
  7. |  7    1957    1947    1946 |
  8. |  8    1949    1957    1947 |
  9. |  9    1947    1956    1949 |
 10. | 10    1953    1949    1957 |
     +----------------------------+
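To confirm that the stored draws meet the stated conditions, a few assertions on the results dataset may help. This is only a sketch: it assumes the results file written by the loop above, with the variables id, year1, year2 and year3; the name draw is hypothetical.

```stata
use results, clear

// Each replication should contain three different years.
assert year1 != year2 & year1 != year3 & year2 != year3

// Every drawn year should lie in the window and avoid the excluded years.
foreach v of varlist year1-year3 {
    assert inrange(`v', 1946, 1957) & !inlist(`v', 1948, 1952, 1954)
}

// Optional: one row per draw, to see how often each year was picked.
reshape long year, i(id) j(draw)
tabulate year
```

Each assert stops with an error if any observation violates its condition, so the script only reaches the tabulation when all 500 replications are valid.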
Upvotes: 1