Reputation: 735
I know title is kind of hard to understand, I'm sorry for that. Couldn't find a better way to explain my problem.
I have something like this:
DATA HAVE;
INPUT POLID PRODID TOSNUM PREMID X_INSURER_SERIAL_NO;
DATALINES;
13102100206 310 0 20 .
13102100207 310 0 20 .
13102100207 310 0 21 .
13102100207 310 1 1 .
13102100207 310 1 2 .
13102100207 310 1 3 .
13102100207 310 1 20 .
13102100207 310 1 21 .
13102100207 310 2 1 .
13102100207 310 2 2 .
13102100207 310 2 20 .
13102100207 310 2 21 .
13102100207 310 3 1 .
13102100207 310 3 2 .
13102100207 310 3 9 .
13102100207 310 3 20 .
13102100207 310 3 21 .
13102100209 310 500 20 5
13102100210 310 100 20 1
13102100210 310 101 21 1
13102100210 310 100 1 1
13102100210 310 101 2 1
13102100210 310 101 3 1
13102100211 310 100 20 1
13102100211 310 101 21 1
13102100211 310 200 1 2
13102100211 310 201 2 2
13102100211 310 201 3 2
;RUN;
What I need to do is,
add premid if missing in current row but existed in previous row for every polid+prodid.
For example, polid = 13102100207, prodid = 310.
when tosnum = 1, I have 5 different premid (1,2,3,20,21). when tosnum = 2, I have 4 different premid (1,2,20,21). premid = 3 is missing.
I need to add this premid, and add 1 in fictive_ind column whenever I add a premid.
polid = 13102100206 , for example, has only one row , and therfor, sholud be exluded from the calculation, as it has no previous tosnum.
when X_INSURER_SERIAL_NO is not missing, it should replace tosnum.
so, for example, polid = 13102100210 sholud be exluded as well from the calculation, as it doesn't have any other previous value (x_insurer_serial_no is always equal to 1).
This is What I did:
First of all, replaced tosnum when x_insurer_serial_no is not missing:
PROC SQL;
CREATE TABLE AAA
AS
SELECT DISTINCT POLID, PRODID ,
CASE WHEN X_INSURER_SERIAL_NO = . THEN TOSNUM ELSE X_INSURER_SERIAL_NO END as X_INSURER_SERIAL_NO
FROM HAVE
;QUIT;
Then,
I use the below code, to find out if polid sholud be included or exluded from the calculation:
PROC SQL;
CREATE TABLE BBB AS
SELECT POLID, PRODID, COUNT(*) AS COUNT
FROM AAA
GROUP BY
POLID, PRODID
;QUIT;
I only take for the calculation rows where count>1.
PROC SQL;
CREATE TABLE ABC AS
SELECT * FROM HAVE
WHERE
CATS(POLID,PRODID) IN
SELECT CATS(POLID,PRODID) FROM BBB
WHERE COUNT>1;
QUIT;
Now i find min and max values for looping:
proc sql;
select min(TOSNUM), max(TOSNUM) , max(TOSNUM) - min(TOSNUM) into: min_TOSNUM, :max_TOSNUM, :diff
from abc;
quit;
and then loop:
%macro fictive_premid (min_TOSNUM = , diff = );
proc sql; create table fictive_premid as select POLID, PRODID, TOSNUM, PREMID, 0 as FICTIVE_IND from abc where TOSNUM = &min_TOSNUM; quit;
%do i = &min_TOSNUM %to %eval(&diff-1) ;
proc sql;
create table fictive_premid_t as
select * from
(
select polid, prodid, %eval(&i+1) as TOSNUM, premid, 1 as FICTIVE_IND
from fictive_premid
where TOSNUM = &i
except
select polid, prodid, %eval(&i+1) as TOSNUM , premid, 1 as FICTIVE_IND
from abc
where TOSNUM = %eval(&i+1)
)
union
select polid, prodid,%eval(&i+1) as TOSNUM, premid, 0 as FICTIVE_IND
from abc
where TOSNUM = %eval(&i+1);
quit;
proc append base=fictive_premid data=fictive_premid_t; run;
%end;
by polid prodid TOSNUM premid
%mend;
%fictive_premid (min_TOSNUM = &min_TOSNUM, diff = &diff );
Now,
my problem begings, for example, when value of &i = 4 and polid = 13102100207....
as last tosnum equals to 3, I get 6 new rows with tosnum = 4, and fictive_ind = 1 in all of them. That occurrs, of course until &i reaches the value of &diff.
What I am asking you, please,
Is a way to stop itterating when reaches the last tosnum or x_insurer_serial_no.
Your help is very much appreciated.
Hopefuly I explaind my needs good enough.
Upvotes: 0
Views: 117
Reputation: 12465
If I understand your question, I think this does what you want. This relies on the TOSNUM values to be incremental (i.e. 1, 2, 3 or 4, 5, 6) with no skips.
/*Overwrite TOSNUM with X_INSURER_SERIAL_NO
if it exists*/
data have;
set have;
tosnum = coalesce(X_INSURER_SERIAL_NO,TOSNUM);
run;
proc sort data=have;
by POLID PRODID TOSNUM PREMID;
run;
proc sql noprint;
/*Find polid values to delete*/
create view dist as
select distinct polid,
prodid,
tosnum
from have;
create table toDelete as
select polid,
count(*) as count
from dist
group by polid
having count = 1;
/*Delete single values*/
delete from have
where polid in
( select polid from toDelete );
create table lastTosnum as
select POLID, PRODID, max(TOSNUM)+1 as Tosnum
from have
group by POLID, PRODID;
quit;
/*Increment TOSNUM by 1*/
data nextVals;
set have(drop=X_INSURER_SERIAL_NO );
TOSNUM = tosnum + 1;
run;
/*Delete the last TOSNUM from the set above*/
proc sql noprint;
delete from nextVals as nv
where nv.tosnum in
(select tosnum
from lastTosnum as lts
where lts.polid = nv.polid
and lts.prodid = nv.prodid
);
quit;
/*Merge to get the final set.*/
data want;
merge have (in=h) nextVals (in=next);
by POLID PRODID TOSNUM PREMID;
if next and ^h then
fictive_ind = 1;
else
fictive_ind = 0;
/*decrement tosnum*/
tosnum = tosnum-1;
run;
Upvotes: 1