Reputation: 161
I've been searching online for how to plot a histogram of values stored in a matrix, but I am having some trouble doing so. I have created a forval loop that stores the p-values for 1000 trials of a test, and I now want to plot these p-values on a histogram.
/* Loop generating 1000 trials and storing p-values */
mata: pvalue1000 = J(1000,1,.)
forvalues i = 1/1000 {
clear
quiet set obs 1000
gen n = _n
quiet gen A = runiform()
quiet ttest A = 0.20
/*store the mean, in a local variable*/
local pvalue = r(p)
gen pval = r(p)
/*transfer the p-value from the "local" to the matrix */
mata: pvalue1000[`i',1] = `pvalue'
}
mata: pvalue1000
hist pvalue1000
In this case, hist pvalue1000 reports that pvalue1000 is not found, and when I try hist pval, the histogram displays only one p-value (I am assuming this is because it is outside the loop).
Note also that the matrix stores only p-values, all in a single column, so it has 1000 rows and 1 column.
So how can I call hist on a variable so that it plots all of the p-values on one histogram?
Upvotes: 0
Views: 837
Reputation: 2665
Stata's main dataset, the matrices that you access using the matrix command, and Mata matrices all live separately and need separate functions to deal with, but you can transfer data between all three.
In your case, you want to load a Mata matrix into the Stata dataset, which you can do as follows:
clear
getmata pvalue1000, double
Please note that your p-values are very small, so you need to use the double option. Otherwise you'll get zeros with single precision.
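As a minimal end-to-end sketch of the transfer described above: the uniform draws below are only a stand-in for the real p-values, the seed is arbitrary, and the explicit set obs is a cautious assumption so that the dataset has as many observations as the Mata vector has rows.

```stata
clear all
set seed 12345                      // arbitrary seed, for reproducibility only

// Illustrative stand-in for the 1000 x 1 Mata vector built in the loop
mata: pvalue1000 = runiform(1000, 1)

// Give the dataset as many observations as the vector has rows
set obs 1000

// Copy the Mata vector into a new double variable with the same name
getmata pvalue1000, double

// histogram now has a variable to work with
histogram pvalue1000
```

After getmata, pvalue1000 exists both as a Mata vector and as a dataset variable; histogram only ever sees the variable.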
Upvotes: 0
Reputation: 37208
histogram expects a variable name, and you are first feeding it a matrix name, so no go there, as matrices and variables are utterly different in Stata.
Conversely, when you then feed it a variable name, your variable pval contains only the single and last P-value put in it, as all previous incarnations of pval were cleared out of the way by your own code. (Putting the histogram command inside the loop would have no useful effect here, as at best there is only one P-value inside the variable at a time.)
Matrices can be very useful, but they are at best indirect for this purpose.
Presumably your problem is not your real problem. If you have samples of size 1000 from a uniform on (0, 1), then sample means will all be close to 0.5 and P-values of a test that the mean is 0.2 will all be practically indistinguishable from 0 and no histogram is interesting or useful. But this code seems to capture your intent:
clear
set obs 1000
gen A = .
gen pval = .
quietly forval i = 1/1000 {
replace A = runiform()
ttest A = 0.20
replace pval = r(p) in `i'
}
hist pval
What's not in this code:
Putting results in locals and/or matrices and/or taking them out again is not needed for any purpose. We put them directly into a variable one by one, because that is the result needed.
The observation numbers _n are not used for anything, so they seem dispensable too, although naturally they may be needed for your real problem.
Your comment "store the mean" is not matched by any code that you try.
Note also that talking about locals as variables is natural for anyone familiar with other programming languages, but in no sense is it Stata terminology. Locals are local macros, not variables.
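To illustrate the earlier point that testing against 0.2 yields P-values indistinguishable from 0, here is a hedged variant of the same loop that tests against 0.5, the true mean of a uniform on (0, 1). The hypothesised value, the seed and the bin settings are assumptions chosen for illustration, not part of the original question. With a true null, the P-values should spread fairly evenly over (0, 1), so the histogram has something to show.

```stata
clear
set seed 2024                       // arbitrary seed, for reproducibility only
set obs 1000
gen A = .
gen pval = .

quietly forval i = 1/1000 {
    replace A = runiform()          // fresh sample of size 1000
    ttest A == 0.5                  // null is true here (assumed for illustration)
    replace pval = r(p) in `i'      // two-sided P-value goes into observation i
}

summarize pval
histogram pval, width(0.05) start(0)
```

Changing 0.5 back to 0.20 in the ttest line reproduces the original setup, in which every P-value is practically 0.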
Upvotes: 1