Reputation: 27
I'm generating a column of random numbers in Stata and trying out different seeds to see which one gives the best results. I created 250 random seeds and pasted them into the foreach loop shown below. The ellipsis stands for roughly 240 seeds.
save "`datadir'ProviderCounty", replace
foreach x in 89583 31214 65326 61107 54662 91414 86171 14809 19625 . . . 74397 85273 {
use "`datadir'ProviderCounty", replace
display `x'
set seed `x'
generate rannum = uniform()
. . .
}
I'd like to replace that long line of 250 numbers by reading them from Excel into a matrix like this and then iterating through the matrix one by one.
* Import seeds randomly generated in Excel
clear
import excel "`datadirIN'Random Number Seeds.xlsx", sheet("Sheet1") cellrange(A2:A252) firstrow
mkmat Seeds, matrix(matSeeds)
scalar mlen = rowsof(matSeeds)
clear
This would go between the line starting with "save . . ." and the foreach line. What I don't know how to do is iterate through the matrix. I need one or more lines that replace the foreach line, iterate through the matrix, and place each seed number in the macro "x".
Upvotes: 1
Views: 1472
Reputation: 37208
I hope I am misunderstanding, but the underlying idea appears fallacious. The sole merit of using a specified seed is to ensure reproducibility of detailed results, in the sense that other people using the same program and the same data are at least assured of the same results (and hence have means of checking exactly what you did). Otherwise, if results depend sensitively on a particular seed, then either the sample size is too small, or the problem is too fragile for any results to be credible. How are you going to report this? If you suppress the fact that you had to search for suitable results, that would widely be regarded as unacceptable. If you publicise the fact, you publicise results that are stamped as somewhere between dubious and useless. I would advise discussing your idea with supervisors, mentors or colleagues as appropriate. If they are the ones suggesting this, you will still need to explain why it is a good idea wherever you present the results.
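To make the reproducibility point concrete, here is a minimal sketch. The seed value 12345 and the variable names a and b are arbitrary choices for illustration, and runiform() is used as the current name of the uniform generator.

```stata
* Sketch: the same seed reproduces the same draws (12345 is arbitrary)
clear
set obs 5
set seed 12345
generate a = runiform()
set seed 12345              // reset to the same seed before drawing again
generate b = runiform()
assert a == b               // identical draws, observation by observation
```

That is all a seed buys you: the same sequence again, not a better one.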
All that said, Stata matrices have rows and columns, so given a column vector, its elements are generically matname[i, 1]. Subscripting is defined in any documentation on Stata matrices, e.g. http://www.stata.com/help.cgi?matrix
So the loop you seem to be implying might be
mkmat Seeds, matrix(matSeeds)
forval i = 1/`= rowsof(matSeeds)' {
...
set seed `= matSeeds[`i', 1]'
...
}
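Filled out so that it runs on its own, the loop might look like the sketch below. The three seeds typed in with input are a hypothetical stand-in for the column imported from Excel, and set obs 10 is a placeholder for the use command in the question; runiform() is the current name for uniform().

```stata
* Hypothetical stand-in for the Seeds variable imported from Excel
clear
input long Seeds
89583
31214
65326
end
mkmat Seeds, matrix(matSeeds)

* -clear- drops the data but leaves matrices in memory
forvalues i = 1/`= rowsof(matSeeds)' {
    clear
    set obs 10                        // placeholder for the -use- command
    display matSeeds[`i', 1]          // show the seed for this pass
    set seed `= matSeeds[`i', 1]'     // evaluated on the fly before -set seed- runs
    generate rannum = runiform()
}
```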
http://www.stata.com/help.cgi?macro documents evaluations on the fly of (in this case) scalars and matrix elements.
EDIT: The syntax used here is documented at help macro or at http://www.stata.com/manuals14/pmacro.pdf. Here is an example:
. mat foo = J(1, 1, 42)
. set seed `=foo[1,1]'
. display c(seed)
X51535c3ec43f462544a474abacbdd93d386b
. mat foo = J(1, 1, 666)
. set seed `=foo[1,1]'
. display c(seed)
X97b5c5aec43f462544a474abacbdd93d2d9c
The underlying problem here is that set seed will not itself evaluate expressions fed to it. There are various work-arounds, including defining a local macro and then typing a macro reference. Stata evaluates the macro before set sees its arguments. The syntax shown here cuts out the macro by an evaluation on the fly.
The case used here is the one where the expansion_optr is an equals sign = followed by exp, namely an expression to be evaluated. In this case, the expression is just a matrix element.
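For comparison, the local macro work-around mentioned above might look like the following sketch, which also puts each seed in the macro x as the question asks. The two-row matrix is a hypothetical example, not the asker's data.

```stata
* Hypothetical 2 x 1 matrix of seeds, for illustration only
matrix matSeeds = (42 \ 666)

forvalues i = 1/`= rowsof(matSeeds)' {
    local x = matSeeds[`i', 1]    // evaluate the matrix element into local macro x
    set seed `x'                  // the macro is substituted before -set seed- runs
    display "seed set to `x'"
}
```

Both forms do the same job. The local macro is handy when the seed is needed again later in the loop body, for example in display.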
Upvotes: 6