Dileep
Dileep

Reputation: 11

How do I loop through table records in SAS?

I am new to SAS and trying to solve the following scenario:

Scenario: A loan (123) has more than 2 transactions on a given transaction date. And, each record has a columns: ID, Trans_Amt, LoanNo, etc.

I was asked to create a column to show if two different records with same loan# and ID adds up to 0 for Trans_Amt then those two records should be shown as 'Y'

data have;
    input ID$ Loan_No Trans_Am;
    datalines;
A1 123 -10
B1 123 20
A1 123 10
A1 123 20
;
run;

Output should be:

data want;
    input ID$ Loan_No Trans_Am Exception$;
    datalines;
A1 123 -10 Y
B1 123 20 N
A1 123 10 Y
A1 123 20 N
;
run;

Thanks in advance for your help

My thought process is to create loops for ID, LoanNo and iterate through them but unsure as to how to do it in SAS.

Upvotes: 1

Views: 1384

Answers (4)

Jeremy Miller
Jeremy Miller

Reputation: 146

Assuming that you actually want to identify cases where the total transactions for a specific ID/loan_no add to zero:

** Get sum of daily transactions by ID/loan_no **;
proc summary data=have nway;
  class id loan_no;
  var trans_amt;
  output out=temp_sum sum=daily_total;
run;

proc sort data=have;
  by id loan_no;

** Merge full list with daily totals, and assign as 0=Y, other=N **;
data want (drop=daily_total);
  merge have (in=in1)
        temp_sum (in=in2 keep=id loan_no daily_total)
        ;
  by id loan_no;

  ** Just a check, can remove this **;
  if ^(in1 & in2) then abort;

  if daily_total = 0 then exception = 'Y';
  else exception = 'N';
run;

This is a fairly simple method that hopefully is easy to understand.

Upvotes: 0

Dirk Horsten
Dirk Horsten

Reputation: 3845

Using the same data as Richard, but without temporary variables:

data have;
  input ID $ Loan_No Trans_Amt;
datalines;
A1 123 -10
B1 123 20
A1 123 10
A1 123 20
Z  1   -10
Z  1   2
Z  1   2
Z  1   2
Z  1   2
P  7   -40
Z  1   1
Z  1   1
P  7   60
A1 123 -100
A1 123 -20
A1 123 50
A1 123 50
;

I define an in memory hash table to store running totals per ID.

data want;
    set have end=last;
    if _N_ eq 1 then do;
        dcl hash hash_amt();
        hash_amt.defineKey('ID');
        hash_amt.defineData('running_amt');
        hash_amt.defineDone();
    end;

For each observation (=row), look if the ID is already known. (The return code of find() is zero if the key, i.e. the ID, is found.)

    rcFind = hash_amt.find();

If not found: add it; if found: update it.

    if rcFind then do;
        running_amt = trans_amt;
        rcAdd = hash_amt.add();
    end;
    else do;
        running_amt = running_amt + trans_amt;
        rcReplace = hash_amt.replace();
    end;

Take your conclusions (Every non zero number is interpreted as True and zero as False in SAS)

    Exception = ifc(running_amt, 'N', 'Y');

Drop temporary variables (colon is wildcard)

    drop rc: running_amt;
run;

REMARK: This solution is by far the fastest, as it does not sort any data; It even does not write any intermediate data to disk.

Upvotes: 0

Tom
Tom

Reputation: 51566

Split the data into POSITIVE and NEGATIVE amounts. Order then and number them so you can then merge them.

data have;
  input ID $ Loan_No Trans_Amt;
  sign=sign(trans_amt);
  abs=abs(trans_amt);
datalines;
A1 123 -10
B1 123 20
A1 123 10
A1 123 20
Z  1   -10
Z  1   2
Z  1   2
Z  1   2
Z  1   2
P  7   -40
Z  1   1
Z  1   1
P  7   60
A1 123 -100
A1 123 -20
A1 123 50
A1 123 50
;

proc sort;
  by loan_no sign abs;
run;

data positive(rename=(id=p_id)) negative(rename=(id=n_id));
  set have;
  by loan_no sign abs;
  subrow+1;
  if first.abs then subrow=1;
  if sign=1 then output positive; 
  else output negative;
run;

data want;
  retain Loan_No match ID abs ;
  merge positive(in=in1) negative(in=in2);
  by loan_no abs subrow;
  if in1 and in2 then do;
     matchno+1;
     match=matchno;
     id=p_id;
     output;
     id=n_id;
     abs=-abs;
     output;
  end;
  else do ;
    match=0;
    if in1 then id=p_id;
    else id=n_id;
    output;
  end;
  keep id loan_no abs match;
  rename abs=Trans_Amt;
run;

proc sort;
  by loan_no match id ;
run;
proc print;
run;

Results:

                                 Trans_
Obs    Loan_No    match    ID      Amt

  1        1        0      Z         1
  2        1        0      Z         1
  3        1        0      Z         2
  4        1        0      Z         2
  5        1        0      Z         2
  6        1        0      Z         2
  7        1        0      Z        10
  8        7        0      P        40
  9        7        0      P        60
 10      123        0      A1       20
 11      123        0      A1       50
 12      123        0      A1       50
 13      123        0      A1      100
 14      123        1      A1       10
 15      123        1      A1      -10
 16      123        2      A1      -20
 17      123        2      B1       20

Upvotes: 0

Richard
Richard

Reputation: 27498

A SET statement is used to loop through a data set.

What if you have transaction amounts -10, 6, 4? Would three records by flagged? Do they have to be 'in-order' to sum to zero? -10, 20, 10 has one pair of record that sum to zero but they are not in order.

You can add an ordering variable, then sort and by loan id and do your sum rules.

Example:

Tell boss very bad idea to have same loan # for different id. rownum is temporarily added to data set to maintain original order so rule can be performed. Things can become way complicated if someone overpays or takes out more against loan_num or does smaller partial payments.

Double DOW loops coded to compute metric for group of contiguous rows within larger group.

data have;
    input ID$ Loan_No Trans_Amt;
    datalines;
A1 123 -10
B1 123 20
A1 123 10
A1 123 20
Z  1   -10
Z  1   2
Z  1   2
Z  1   2
Z  1   2
P  7   -40
Z  1   1
Z  1   1
P  7   60
A1 123 -100
A1 123 -20
A1 123 50
A1 123 50
;

data have2;
  set have;
  rownum + 1;
run;

proc sort data=have2;
  by id loan_no rownum;
run;

data want;
  do _n_ = 1 by 1 until (last.id or total=0);
    set have2;
    by id loan_no;
    total = sum(total, trans_amt);
  end;
  do _n_ = 1 to _n_;
    set have2;
    if total = 0 then flag = 'Y'; else flag = 'N';
    output;
  end;
run;

proc sort data=want out=want(drop=rownum total);
  by rownum;
run;

Upvotes: 0

Related Questions