Eyal Marom
Eyal Marom

Reputation: 301

how can I split a big data set to small tables in sas

I have a large data set of branches and accounts. I would like to split the data set into to smaller tables by the variable BRANCH. Is there a way to do so, even by PROC TABULATE or PROC REPORT ?

My code:

PROC SQL ;
    CREATE TABLE Branch_trans as
    SELECT  Branch,
            account_id
    FROM work.BRANCH
;
QUIT ;

Upvotes: 1

Views: 1544

Answers (3)

imnotarobot
imnotarobot

Reputation: 141

maybe it help for you but more then your question. This soultion split data by branch so you can modify report by branch if do you want:

    /* this is an example table, the branch 3 has 2 row*/
    data fulldata;
    branchid=1; a="aaa";output;
    branchid=2; a="bbb";output;
    branchid=3; a="ccc";output;
    branchid=3; a="ddd";output;
    run;

    %macro x;
    /*sort for the distinct branch number*/
    proc sort data=fulldata out=temptable nodupkey;
    by branchid;
    run;

    %let branchcount=0;

    /*save the banch number, branchid and branch count into macro variables */
    data _null_;
    set temptable end=x;
    call symput("branch" || strip(_n_),strip(branchid));
    if x then call symput('branchcount',_n_);
    run;

    /* cycle in the branch count and split the table by brancid */
    %let i = 1;
    %do %while (&i<=&branchcount);
        data branch&i;
        set fulldata;
        where branchid=&&branch&i;
        run;
        proc report data=branch&i; /* you can modify if you want */
        quit;
        %let i=%eval(&i.+1);
    %end;
    %mend;
    %x;

    /* it make 3 table and the third has 2 rows */
    /*important : the branchid is numeric you need use like : 
     where branchid="&&branch&i";*/

Upvotes: 0

Stu Sztukowski
Stu Sztukowski

Reputation: 12909

If you want to create separate datasets by branch, you can use a macro to do so. The below macro will get the distinct number of branches and subset the data into individual files suffixed 1, 2, 3, etc.

You will need to know the distinct number of branches. If your dataset is large, this will take some time to complete. You can run these all in parallel to make it run faster, but the code will increase in complexity.

%macro splitData(group=, data=, out=);

    proc sql noprint;
        select distinct &group.
        into :groupvalues separated by '|'
        from &data.
        ;
    quit;

    %do i = 1 %to %sysfunc(countw(&groupvalues., |));
        %let groupvalue = %scan(&groupvalues., &i., |);

        data &out._&i.;
            set &data.;
            where &group. = "&groupvalue.";
        run;
    %end;

%mend;
%splitData(data=sashelp.cars, group=origin, out=want);

Upvotes: 1

Eyal Marom
Eyal Marom

Reputation: 301

Using PROC PRINT with BY statement as such:

PROC PRINT DATA=have ;
BY Branch ;
RUN ;

Upvotes: 0

Related Questions