Reputation: 43
I started to learn %macro
in SAS and now I'm trying to implement simple bootstrap with histogram as an output.
/*Create K data sets(vectors)*/
%macro datasets(K);
%do i=1 %to &K;
data indata&i;
%do j = 1 %to 50;
x=(rand('normal',2,9));
output;
%end;
run;
%end;
%mend datasets;
%datasets(3);
/*Bootstrap and hist*/
%macro boot (data,res);
%do i=1 %to &res;
%let x = (sample(&data,50));
%let m = (mean(&x));
%end;
proc iml;
read &m into A;
create DataM from A;
append from A;
close Data1;
quit;
proc univariate data=Data1;
histogram m;
run;
%mend boot;
%boot(Indata1,100);
It doesn't work and I can't understand why. Can you point me the mistake?
Upvotes: 0
Views: 1150
Reputation: 51601
Perhaps it will help if we outline some of the ways that the posted macro code does NOT work. If nothing else then as examples of things to avoid.
If the first macro , %datasets()
, you are using a macro %DO
loop where you should use a normal data step DO
loop. Also make sure to define your local macro variables as local. This will prevent the macro from modifying the value of an existing macro variable with the same name.
/*Create K data sets(vectors)*/
%macro datasets(K);
%local i ;
%do i=1 %to &K;
data indata&i;
do j = 1 to 50;
x=(rand('normal',2,9));
output;
end;
drop j;
run;
%end;
%mend datasets;
In the second macro you have a %DO
loop that does nothing.
%do i=1 %to &res;
%let x = (sample(&data,50));
%let m = (mean(&x));
%end;
You repeat the exact same %LET
statements multiple times. The result does not change since the loop variable i
is not referenced at all. If you called the macro with data=indata1
then the result of the two statements will be that X=(sample(indata1,50))
and that M=(mean((sample(indata1,50))))
. I think that perhaps you intended that the strings sample
and mean
might take some action, but since they have no macro triggers (&
or %
) they are just streams of characters to the macro processor.
I am not an expert on IML, but those statements also do not look like they are doing much.
Upvotes: 0
Reputation: 21274
As mentioned use Proc SurveySelect and Proc Means. You can select all 100 samples in one Proc SurveySelect and then apply Proc Means with a BY statement to calculate the means in one step. Macro's don't add anything to the solution here.
I'm posting both solutions - the macro solution does take longer as well.
*Without macro;
proc surveyselect data=indata1 out=rsample method=srs n=50 reps=100;
run;
proc means data=rsample noprint;
by replicate;
var x;
output out=Data1 mean(x)=m;
run;
proc univariate data=Data1;
histogram m;
run;
*Macro solution;
%macro boot(data, res);
%do i=1 %to &res;
%*Currently pulls the same sample every time but you can fix that part;
proc surveyselect data=&data out=x method=srs n=50 reps=1 seed=343434;
run;
proc means data=x noprint;
var x;
output out=m mean(x)=m;
run;
proc append base=DataM data=m;
run;
%end;
%mend;
%boot(Indata1,10);
Upvotes: 0
Reputation: 9109
Use PROC SURVEYSELECT to generate bootstrap samples then do bootstrap analysis by Replication (a variable created by SURVEYSELECT). Your macro idea will be far too slow.
Upvotes: 1