Reputation: 153
I am very keen to learn whether I can handle or not such situations in SAS Base without using SAS IML;
Let's say I have the vector have
a b c d e f
1001 JPN 10,000 50% JPN 2,000
1001 EUR 12,648 100% EUR 3,000
1001 USD 15,997 50% USD 5,000
1001 JPN 20,233 20% JPN 8,000
1001 EUR 25,591 20% EUR 9,000
1001 USD 32,368 50% USD 4,000
1002 JPN 28,393 50% JPN 6,000
1002 EUR 24,906 100% EUR 4,000
1002 USD 21,847 50% USD 8,000
1002 TRY 19,164 20% JPN 6,000
1002 EUR 16,811 50% EUR 15,000
1002 USD 14,746 100% USD 52,000
1003 USD 10,000 50% XVN 8,000
%macro;
% let i = 1;
data want;
set have;
%do %while a[&i]=a[eval(&i+1)] ;
b = &i;
&i=eval(&i+1);
%end
%mend
What I would like to do is for a with b=e to take the difference of max(c) and max(f) and multiply this difference with d and then for each distinct a to sum these outcomes. This will be iterative. The table I compose here just a small representation of the case.
Thanks
Upvotes: 0
Views: 510
Reputation: 12849
Thank you for posting! I am assuming you want the max value when b = e, but I will give you two possible solutions just in case. Note that you only need to use proc sql
once during any SQL calculation groups, but for clarity's sake we'll just do it twice.
Step 1: Get max values of c and f for when b = e
proc sql noprint;
create table maxes as
select b, e,
max(c) as max_c, max(f) as max_f
from have
where upcase(b) = upcase(e)
group by b, e
;
quit;
Step 2: Get the max values of when b = e into the table, sum up all the cases by a where b = e
proc sql noprint;
create table want as
select a, sum(result) as result
/* Get max values into the table.
Only interested in cases where a = b */
from(select st1.a, st1.b, st1.e,
max_c, max_f,
(max_c - max_f)*st1.d as result
from have as st1
INNER JOIN
maxes as st2
ON upcase(st1.b) = upcase(st2.b)
AND upcase(st1.e) = upcase(st2.e)
where upcase(st1.b) = upcase(st1.e) )
group by a
;
quit;
Now, if you want to do this for the table maximum of c and f, you can do it with macro variables, the table sorted by a, by-group processing, and a Sum Statement:
Step 1: Read maximum values of c and f into macro variables
proc sql noprint;
select max(c), max(f)
into :max_c, :max_f
from have;
quit;
Step 2: Sum the result for each a
data want;
set have;
by a;
where upcase(b) = upcase(e);
result+( (&max_c - &max_f)*d );
if(last.a) then do;
output;
call missing(result); *Reset the sum for the next a group;
end;
keep a result;
run;
I hope this helps! This is just one way you could go about solving the problem, but there are many other great ways. It all depends upon your goals, environment, and programming style.
Upvotes: 1
Reputation: 51566
It sounds like what you mean by this phrase "for a with b=e" is that you want to process BY A the records WHERE B=E.
So in SAS if you want to filter the records to use you would use a WHERE statement or option or an IF statement. To process groups of records with same value of a set of variables you would use a BY statement (or GROUP BY in PROC SQL code). So here is your example dataset.
data have ;
length a 8 b $3 c d 8 e $3 f 8;
informat c f comma32. d percent. ;
input a b c d e f ;
cards;
1001 JPN 10,000 50% JPN 2,000
1001 EUR 12,648 100% EUR 3,000
1001 USD 15,997 50% USD 5,000
1001 JPN 20,233 20% JPN 8,000
1001 EUR 25,591 20% EUR 9,000
1001 USD 32,368 50% USD 4,000
1002 JPN 28,393 50% JPN 6,000
1002 EUR 24,906 100% EUR 4,000
1002 USD 21,847 50% USD 8,000
1002 TRY 19,164 20% JPN 6,000
1002 EUR 16,811 50% EUR 15,000
1002 USD 14,746 100% USD 52,000
1003 USD 10,000 50% XVN 8,000
run;
First you can find the max of the C and F variables using PROC SUMMARY.
proc summary data=have nway ;
by a ;
where b=e ;
var c f ;
output out=summary max= / autoname ;
run;
Then you can merge back with the detail records to apply calculate the difference and multiply by the percentage.
data middle ;
merge have summary ;
by a ;
if b=e ;
diff = c_max - f_max ;
product = diff * d ;
run;
Then you can use PROC SUMMARY again to find the sum within each group.
proc summary data=middle nway ;
by a ;
var product ;
output out=want sum=new_var ;
run;
Totals
Obs a _TYPE_ _FREQ_ new_var
1 1001 0 6 67767.2
2 1002 0 5 -82624.5
Upvotes: 1
Reputation: 51566
So a basic data step or a proc call is already a looping over the input data. If you want to use a dataset as the source for parameter values for a macro call then a simple driver program is what you need.
Let's assume you have already defined a macro that performs some complex series of data steps and proc steps and takes one or more input parameters. Then you can create a dataset of parameter values and in a simple data step use CALL EXECUTE to generate calls to the macro passing the parameter values. So if the macro %MYMACRO() is defined with 5 positional parameters and you have the data set HAVE with five variables PARM1 to PARM5 and 10 observations then this data step will generate 10 macro calls.
data _null_;
set have;
call execute(cats('%nrstr(%mymacro)(',catx(',',of parm1-parm5),')'));
run;
Upvotes: 0