user2518751
user2518751

Reputation: 735

Add rows to a current key using previous rows

I know title is kind of hard to understand, I'm sorry for that. Couldn't find a better way to explain my problem.

I have something like this:

 DATA HAVE;
 INPUT POLID PRODID TOSNUM PREMID X_INSURER_SERIAL_NO;
 DATALINES;
13102100206 310 0 20 .
13102100207 310 0 20 .
13102100207 310 0 21 .
13102100207 310 1 1 .
13102100207 310 1 2 .
13102100207 310 1 3 .
13102100207 310 1 20 .
13102100207 310 1 21 .
13102100207 310 2 1 .
13102100207 310 2 2 .
13102100207 310 2 20 .
13102100207 310 2 21 .
13102100207 310 3 1 .
13102100207 310 3 2 .
13102100207 310 3 9 .
13102100207 310 3 20 .
13102100207 310 3 21 .
13102100209 310 500 20 5
13102100210 310 100 20 1
13102100210 310 101 21 1
13102100210 310 100 1 1
13102100210 310 101 2 1
13102100210 310 101 3 1
13102100211 310 100 20 1
13102100211 310 101 21 1
13102100211 310 200 1 2
13102100211 310 201 2 2
13102100211 310 201 3 2
;RUN;

What I need to do is,

add premid if missing in current row but existed in previous row for every polid+prodid.

For example, polid = 13102100207, prodid = 310.

when tosnum = 1, I have 5 different premid (1,2,3,20,21). when tosnum = 2, I have 4 different premid (1,2,20,21). premid = 3 is missing.

I need to add this premid, and add 1 in fictive_ind column whenever I add a premid.

polid = 13102100206 , for example, has only one row , and therfor, sholud be exluded from the calculation, as it has no previous tosnum.

when X_INSURER_SERIAL_NO is not missing, it should replace tosnum.

so, for example, polid = 13102100210 sholud be exluded as well from the calculation, as it doesn't have any other previous value (x_insurer_serial_no is always equal to 1).

This is What I did:

First of all, replaced tosnum when x_insurer_serial_no is not missing:

PROC SQL;
CREATE TABLE AAA
AS
SELECT DISTINCT POLID, PRODID , 
CASE WHEN X_INSURER_SERIAL_NO = . THEN TOSNUM ELSE X_INSURER_SERIAL_NO END as  X_INSURER_SERIAL_NO 
FROM HAVE 
;QUIT;

Then,

I use the below code, to find out if polid sholud be included or exluded from the calculation:

PROC SQL;
CREATE TABLE BBB AS
SELECT POLID, PRODID, COUNT(*) AS COUNT
FROM AAA
GROUP BY 
POLID, PRODID
;QUIT;

I only take for the calculation rows where count>1.

PROC SQL;
CREATE TABLE ABC AS
SELECT * FROM HAVE 
WHERE 
CATS(POLID,PRODID) IN 
SELECT CATS(POLID,PRODID) FROM BBB
WHERE COUNT>1;
QUIT;

Now i find min and max values for looping:

proc sql;
select min(TOSNUM), max(TOSNUM) ,  max(TOSNUM) -  min(TOSNUM) into: min_TOSNUM, :max_TOSNUM, :diff
from abc;
quit;

and then loop:

%macro fictive_premid (min_TOSNUM = , diff = );

proc sql; create table fictive_premid as select POLID, PRODID, TOSNUM, PREMID, 0 as FICTIVE_IND from abc where TOSNUM = &min_TOSNUM; quit;

%do i =  &min_TOSNUM %to %eval(&diff-1) ;
    proc sql;
    create table fictive_premid_t as
    select * from 
    (
    select polid, prodid, %eval(&i+1) as TOSNUM, premid, 1 as FICTIVE_IND
    from fictive_premid
    where TOSNUM = &i
    except
    select polid, prodid, %eval(&i+1) as TOSNUM , premid,  1 as FICTIVE_IND
    from abc
    where TOSNUM = %eval(&i+1)
    )
    union 
    select polid, prodid,%eval(&i+1) as TOSNUM, premid, 0 as FICTIVE_IND
    from abc
    where TOSNUM = %eval(&i+1);
    quit;

    proc append base=fictive_premid data=fictive_premid_t; run;

%end;


by polid prodid TOSNUM  premid
%mend;


%fictive_premid (min_TOSNUM = &min_TOSNUM, diff = &diff );

Now,

my problem begings, for example, when value of &i = 4 and polid = 13102100207....

as last tosnum equals to 3, I get 6 new rows with tosnum = 4, and fictive_ind = 1 in all of them. That occurrs, of course until &i reaches the value of &diff.

What I am asking you, please,

Is a way to stop itterating when reaches the last tosnum or x_insurer_serial_no.

Your help is very much appreciated.

Hopefuly I explaind my needs good enough.

Upvotes: 0

Views: 117

Answers (1)

DomPazz
DomPazz

Reputation: 12465

If I understand your question, I think this does what you want. This relies on the TOSNUM values to be incremental (i.e. 1, 2, 3 or 4, 5, 6) with no skips.

/*Overwrite TOSNUM with X_INSURER_SERIAL_NO
  if it exists*/
data have;
set have;
tosnum = coalesce(X_INSURER_SERIAL_NO,TOSNUM);
run;

proc sort data=have;
by POLID PRODID TOSNUM PREMID;
run;

proc sql noprint;
/*Find polid values to delete*/
create view dist as
select distinct polid,
       prodid,
       tosnum
    from have;

create table toDelete as
    select polid,
           count(*) as count
        from dist
        group by polid
        having count = 1;

/*Delete single values*/
delete from have
    where polid in
        ( select polid from toDelete );

create table lastTosnum as
select POLID, PRODID, max(TOSNUM)+1 as Tosnum
    from have
    group by POLID, PRODID;
quit;

/*Increment TOSNUM by 1*/
data nextVals;
set have(drop=X_INSURER_SERIAL_NO );
TOSNUM = tosnum + 1;
run;

/*Delete the last TOSNUM from the set above*/
proc sql noprint;
delete from nextVals as nv
    where nv.tosnum in
        (select tosnum
            from lastTosnum as lts
            where lts.polid = nv.polid
              and lts.prodid = nv.prodid
        );
quit;

/*Merge to get the final set.*/
data want;
merge have (in=h) nextVals (in=next);
by POLID PRODID TOSNUM PREMID;

if next and ^h then
    fictive_ind = 1;
else 
    fictive_ind = 0;
/*decrement tosnum*/
tosnum = tosnum-1;

run;

Upvotes: 1

Related Questions