Adrian

Reputation: 9793

Subsetting the data and fitting the same model to the subset

I have a data set of 1300 rows, and I would like to divide it up into 10 subsets, each with 130 rows. I want to write a loop that fits a proc nlmixed model to each of the 10 subsets, i.e. I would have 10 models, 1 for each subset.

In R, a simple algorithm should go something like this:

for(i in 1:10){
  data_i <- data[(130 * (i - 1) + 1):(130 * i), ]
  #fit model to data_i...
  #obtain summary of model fit...
}

In SAS, is there a way to combine subsetting the data (perhaps in a data step) and then calling proc nlmixed within that data step?...

I know I can limit which observations are read with the data set options (obs = ...), which gives the last observation to read, and (firstobs = ...), which gives the first, on the data set named in the proc nlmixed statement. This could work as long as I can iteratively change the arguments, with firstobs looping over 1, 131, 261, and so on.

Upvotes: 0

Views: 523

Answers (3)

Shenglin Chen

Reputation: 4554

Try using a macro to combine subsetting the data and running proc nlmixed in the same loop. Something like this:

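%macro split_data(data, group, range);
    %local i firstobs lastobs;
    %do i = 1 %to &group;
        /* First and last row numbers of the i-th block of &range rows */
        %let firstobs = %eval(&range * (&i - 1) + 1);
        %let lastobs  = %eval(&range * &i);

        /* Copy only that block into its own data set */
        data data_&i;
            set &data(firstobs=&firstobs obs=&lastobs);
        run;

        proc nlmixed data=data_&i;
        .....;
        run;
    %end;
%mend;
/* Example call: 3 subsets of 5 rows each from sashelp.class */
%split_data(sashelp.class, 3, 5);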

Upvotes: 0

Reeza

Reputation: 21274

Use BY-group processing instead. Create a grouping variable (if the split is random, that's easy enough), then add a BY statement to your NLMIXED proc and it will fit the model once for each level of the BY variable. I'm not sure how you want to split your data, so I'm assuming it's random. You could also look at SURVEYSELECT to select 10 random samples.

data heart;
set sashelp.heart;
call streaminit(25);
rand = rand('normal', 100, 20);
run;

proc rank data=heart groups=10 out=heart_rank;
var rand;
ranks groups;
run;

proc sort data=heart_rank;
by groups;
run;

proc nlmixed data=heart_rank;
by groups;
...

This is the expanded version. Depending on how your groups are created, the first three steps could be collapsed into a single step. If you were just dividing the data into equal groups in their existing order, for example, this would be trivial and closer to the R solution.
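
For the in-order case, a minimal sketch might look like the following. The data set name have, the variable name group, and the block size of 130 are assumptions taken from the question rather than anything tested here, and the model statements are left as a placeholder that has to be filled in before the step will run.

/* Assumed input: a data set named have with 1300 rows, already in the desired order */
data have_grouped;
    set have;
    /* Rows 1-130 get group 1, rows 131-260 get group 2, and so on */
    group = ceil(_n_ / 130);
run;

/* The data are already in group order, so no PROC SORT is needed before BY */
proc nlmixed data=have_grouped;
    by group;
    /* model statements go here */
run;

Because the group number is derived from _N_, the rows stay in their original order and each BY group holds one consecutive block of 130 rows.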

Upvotes: 1

momo1644

Reputation: 1804

You will have to either:

  • split your data into ten tables,
  • or wrap your proc nlmixed call in a macro and pass firstobs as a parameter.

Example code for the split:

data ds1 ds2 ds3 ds4 ds5 ds6 ds7 ds8 ds9 ds10;
set have;
n=130; 
/* Nested if else with increments of 130 records for each table  */
if _N_ <= n*1 then output ds1;
    else if _N_ <= n*2 then output ds2;
        else if _N_ <= n*3 then output ds3;
            else if _N_ <= n*4 then output ds4;
                else if _N_ <= n*5 then output ds5;
                    else if _N_ <= n*6 then output ds6;
                        else if _N_ <= n*7 then output ds7;
                            else if _N_ <= n*8 then output ds8;
                                else if _N_ <= n*9 then output ds9;
                                    else if _N_ <= n*10 then output ds10;
run;
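
The second option listed in this answer, passing firstobs to a macro, could be sketched roughly as follows. The macro names fit_block and fit_all are hypothetical, the data set name have and the block size of 130 are assumptions carried over from the split example, and the model statements are a placeholder to be filled in.

/* Hypothetical macro: fits the model to n rows starting at row firstobs */
%macro fit_block(firstobs, n=130);
    %local lastobs;
    /* OBS= is the number of the last row to read, not a row count */
    %let lastobs = %eval(&firstobs + &n - 1);

    proc nlmixed data=have(firstobs=&firstobs obs=&lastobs);
        /* model statements go here */
    run;
%mend fit_block;

/* Hypothetical driver: calls fit_block with firstobs = 1, 131, 261, ... */
%macro fit_all(blocks=10, n=130);
    %local i;
    %do i = 1 %to &blocks;
        %fit_block(%eval(&n * (&i - 1) + 1), n=&n)
    %end;
%mend fit_all;

%fit_all()

This variant avoids creating ten intermediate tables, because the FIRSTOBS= and OBS= data set options restrict the rows that the procedure reads directly.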

Upvotes: 0
