Vincent
Vincent

Reputation: 955

Exporting a dataset for each value of a variable

I have a medical dataset with patients coming from several hospitals. What I'd like to get is an Excel file with each sheet being dedicating to a particular hospital. I have a variable stating from which one each patient comes so all I have to do is subsetting and export for each value of this variable.

I wrote this macro to do so for one particular value :

%macro exp_subset(table,var,value,out);

    DATA &table._&var._&value;
        SET &table;
        where &var = &value;
    run;
    proc export
        data=&table._&var._&value
        dbms=EXCEL
        LABEL
        OUTFILE="<path>\&out..xls"
        REPLACE;
        SHEET="&var._&value";
    run;
%mend;

And it works fine. However, I'm having trouble appliying it on all values of the variable. After searching how to do it, I tried with call execute() :

%macro exp_subsets_all(table,var,out);

    proc sql noprint;
        create table dist_var as    /*this comes from another post of SO :P */
        select distinct &var
        from &table
        ;
    quit;

    data _null_;
    set dist_var
        (rename=(&var=values));
        call execute ("%exp_subset(&table,&var,"||trim(left(values))||",&out)");
    run;
%mend;

But it doesn't work and I don't quite understand the error.

enter image description here

Again, there is no such error when using %exp_subset()alone... I'm very new with call execute() and I'm not sure what I can or cannot do. It seems to me I should be able to use it this way but am I wrong ?

Here is a table if you want to test (you'll have to specify a path in the first macro before):

DATA test;
input x y z;
cards;
1 2 3
4 2 6
7 8 9
7 11 8
;
run;

%exp_subset(test,x,7,table1);
%exp_subsets_all(test,y,table2);

Upvotes: 1

Views: 143

Answers (2)

Joe
Joe

Reputation: 63424

I have a few suggestions for this.

First, whenever constructing a macro call, use single quotes. This avoids the early resolution problem.

call execute('%exp_subset('...)

You can use double quotes after the very first bit, but the part that includes the % should include single quotes. You can also add %NRSTR if you prefer, but single quotes are simpler.


Second, the %exp_subsets_all macro could be much simpler.

%macro exp_subsets_all(table,var,out);
    proc sql noprint;
       select distinct cats('%exp_subset(',"&table,&var,",&var.,",&out)")
        into :calllist separated by ' '
        from &table
        ;
    quit;

    &calllist.
%mend;

It's probably not different performance wise, but it's, again, simpler. Only concern is that you need to be careful if you call this macro from a call execute - it must be included in %nrstr if so.


Third, the first macro should be one step:

%macro exp_subset(table,var,value,out);

    proc export
        data=&table.
        dbms=EXCEL
        LABEL
        OUTFILE="<path>\&out..xls"
        REPLACE;
        SHEET="&var._&value";
      where &var = &value;
    run;
%mend;

Upvotes: 3

mjsqu
mjsqu

Reputation: 5452

I'm sure there are easier ways to accomplish what you want to achieve, but your code is failing because you've made some syntax errors in the data _NULL_; datastep:

OLD

data _null_;
set dist_var;
    rename=(&var="values");
    call execute ("%exp_subset(&table,&var,"||values||",&out)");
run;

NEW

data _null_;
set dist_var
    (rename=(&var=values));
    mac=%NRSTR("%exp_subset(&table,&var,"||trim(left(values))||",&out)");
    call execute(mac);
run;

EDIT: Added NRSTR to prevent %exp_subset resolving early.

Changes

  • Rename changed to a dataset option, previously it was attempting to assign a new variable called 'rename' with the value (y="values"), which is a Boolean expression, evaluating to zero. You can see this by replacing _NULL_ with a dataset name and taking a look at the resulting dataset.

  • Noticed in the log that there were syntax errors relating to the formation of the dataset name when executing exp_subset with the call execute function, so added trim(left(values)) to the call execute, which removes whitespace around the values of values and allows the dataset name to be formed correctly.

I ended up with datasets called:

  • TEST_X_7
  • TEST_Y_11
  • TEST_Y_2
  • TEST_Y_8

I also used OPTIONS MPRINT; which is a useful statement to run for debugging macros, as it'll show you the code executed in the log:

MPRINT(EXP_SUBSETS_ALL):   DATA test_y_8;
MPRINT(EXP_SUBSETS_ALL):   SET test;
MPRINT(EXP_SUBSETS_ALL):   where y = 8;
MPRINT(EXP_SUBSETS_ALL):   run;
MPRINT(EXP_SUBSETS_ALL):   DATA test_y_11;
MPRINT(EXP_SUBSETS_ALL):   SET test;
MPRINT(EXP_SUBSETS_ALL):   where y = 11;
MPRINT(EXP_SUBSETS_ALL):   run;

Remember to turn it off if using on macros that are likely to generate a lot of statements as you will fill your log:

OPTIONS NOMPRINT;

Upvotes: 2

Related Questions