Jonsi Billups
Jonsi Billups

Reputation: 153

sas macro loops using index numbers

I am very keen to learn whether I can handle or not such situations in SAS Base without using SAS IML;

Let's say I have the vector have

            a    b     c     d    e        f
           1001 JPN 10,000  50%  JPN    2,000
           1001 EUR 12,648  100% EUR    3,000
           1001 USD 15,997  50%  USD    5,000
           1001 JPN 20,233  20%  JPN    8,000
           1001 EUR 25,591  20%  EUR    9,000
           1001 USD 32,368  50%  USD    4,000
           1002 JPN 28,393  50%  JPN    6,000
           1002 EUR 24,906  100% EUR    4,000
           1002 USD 21,847  50%  USD    8,000
           1002 TRY 19,164  20%  JPN    6,000
           1002 EUR 16,811  50%  EUR    15,000
           1002 USD 14,746  100% USD    52,000
           1003 USD 10,000  50%  XVN    8,000



 %macro;
 % let i = 1;
 data want;
 set have;
 %do %while a[&i]=a[eval(&i+1)] ;
 b = &i;
 &i=eval(&i+1);
 %end
 %mend

What I would like to do is for a with b=e to take the difference of max(c) and max(f) and multiply this difference with d and then for each distinct a to sum these outcomes. This will be iterative. The table I compose here just a small representation of the case.

Thanks

Upvotes: 0

Views: 510

Answers (3)

Stu Sztukowski
Stu Sztukowski

Reputation: 12849

Thank you for posting! I am assuming you want the max value when b = e, but I will give you two possible solutions just in case. Note that you only need to use proc sql once during any SQL calculation groups, but for clarity's sake we'll just do it twice.

Step 1: Get max values of c and f for when b = e

proc sql noprint;
    create table maxes as
        select b, e, 
               max(c) as max_c, max(f) as max_f
        from have
        where upcase(b) = upcase(e)
        group by b, e
    ;
quit;

Step 2: Get the max values of when b = e into the table, sum up all the cases by a where b = e

proc sql noprint;
    create table want as
        select a, sum(result) as result

        /* Get max values into the table. 
           Only interested in cases where a = b */
        from(select st1.a, st1.b, st1.e, 
                    max_c, max_f, 
                    (max_c - max_f)*st1.d as result
             from have as st1
             INNER JOIN
                  maxes as st2
             ON upcase(st1.b) = upcase(st2.b)
                AND upcase(st1.e) = upcase(st2.e)
             where upcase(st1.b) = upcase(st1.e) )
        group by a
    ;
quit;

Now, if you want to do this for the table maximum of c and f, you can do it with macro variables, the table sorted by a, by-group processing, and a Sum Statement:

Step 1: Read maximum values of c and f into macro variables

proc sql noprint;
    select max(c), max(f)
    into :max_c, :max_f
    from have;
quit;

Step 2: Sum the result for each a

data want;
    set have;
    by a;
    where upcase(b) = upcase(e);

    result+( (&max_c - &max_f)*d );

    if(last.a) then do;
        output;
        call missing(result); *Reset the sum for the next a group;
    end;

    keep a result;
run;

I hope this helps! This is just one way you could go about solving the problem, but there are many other great ways. It all depends upon your goals, environment, and programming style.

Upvotes: 1

Tom
Tom

Reputation: 51566

It sounds like what you mean by this phrase "for a with b=e" is that you want to process BY A the records WHERE B=E.
So in SAS if you want to filter the records to use you would use a WHERE statement or option or an IF statement. To process groups of records with same value of a set of variables you would use a BY statement (or GROUP BY in PROC SQL code). So here is your example dataset.

data have ;
  length a 8 b $3 c d 8 e $3 f 8;
  informat c f comma32. d percent. ;
  input a b c d e f ;
cards;
1001 JPN 10,000  50%  JPN    2,000
1001 EUR 12,648  100% EUR    3,000
1001 USD 15,997  50%  USD    5,000
1001 JPN 20,233  20%  JPN    8,000
1001 EUR 25,591  20%  EUR    9,000
1001 USD 32,368  50%  USD    4,000
1002 JPN 28,393  50%  JPN    6,000
1002 EUR 24,906  100% EUR    4,000
1002 USD 21,847  50%  USD    8,000
1002 TRY 19,164  20%  JPN    6,000
1002 EUR 16,811  50%  EUR    15,000
1002 USD 14,746  100% USD    52,000
1003 USD 10,000  50%  XVN    8,000
run;

First you can find the max of the C and F variables using PROC SUMMARY.

proc summary data=have nway ;
  by a ;
  where b=e ;
  var c f ;
  output out=summary max= / autoname ;
run;

Then you can merge back with the detail records to apply calculate the difference and multiply by the percentage.

data middle ;
  merge have summary ;
  by a ;
  if b=e ;
  diff = c_max - f_max ;
  product = diff * d ;
run;

Then you can use PROC SUMMARY again to find the sum within each group.

proc summary data=middle nway ;
  by a ;
  var product ;
  output out=want sum=new_var ;
run;

Totals

Obs      a     _TYPE_    _FREQ_     new_var
1     1001       0         6       67767.2
2     1002       0         5      -82624.5

Upvotes: 1

Tom
Tom

Reputation: 51566

So a basic data step or a proc call is already a looping over the input data. If you want to use a dataset as the source for parameter values for a macro call then a simple driver program is what you need.

Let's assume you have already defined a macro that performs some complex series of data steps and proc steps and takes one or more input parameters. Then you can create a dataset of parameter values and in a simple data step use CALL EXECUTE to generate calls to the macro passing the parameter values. So if the macro %MYMACRO() is defined with 5 positional parameters and you have the data set HAVE with five variables PARM1 to PARM5 and 10 observations then this data step will generate 10 macro calls.

 data _null_;
    set have;
    call execute(cats('%nrstr(%mymacro)(',catx(',',of parm1-parm5),')'));
 run;

Upvotes: 0

Related Questions