Reputation: 321
I am trying to join two tables, weight 1 and weight2. I want to join these two tables on ID1=ID2 first. if there is a missing ID, I will join on DOB1=DOB2. I want the program to compare each row in the table. but sas saw the first ID is missing in weight2 table, then it completely switch to the else statement, and use DOB1=DOB2 to join.
My Ideal result is to have four rows joined. but now sas only gives me three rows.
data weight1;
input ID1 $ Week1 DOB1;
datalines;
1111 195 101
2222 220 102
3333 173 103
4444 135 104
;
proc print data=weight1;
run;
data weight2;
infile datalines missover;
input ID2 $ Week2 DOB2;
datalines;
195 101
2222 220 102
3333 173 103
4444 135 104
;
proc print data=weight2;
run;
options mlogic symbolgen mprint;
%macro test ;
proc sql;
create table final as
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1 a
%if %sysfunc(exist(b.ID2)) %then
inner join weight2 b
on a.ID1 = b.ID2;
%else
/*%if %sysfunc(not exist(b.IDnumber))*/
inner join weight2 b
on a.DOB1 = b.DOB2
;
;
quit;
%mend test;
%test
Sas log:
LOGIC(TEST): Beginning execution.
MPRINT(TEST): proc sql;
MLOGIC(TEST): %IF condition %sysfunc(exist(b.ID2)) is FALSE
MPRINT(TEST): create table final as select a.ID1, a.DOB1, b.ID2,b.DOB2 from weight1 a inner join weight2 b on a.DOB1 = b.DOB2 ;
NOTE: Table WORK.FINAL created, with 3 rows and 4 columns.
this is my ideal result
ID1 DOB1 ID2 DOB2
1111 101 101
2222 102 2222 102
3333 103 3333 103
4444 104 4444 104
Upvotes: 1
Views: 716
Reputation: 27516
A case
expression can be used as join criteria.
Do not use macro. Your use of %sysfunc(exist
is just wrong, the exist
function checks for the existence of a data set, and %sysfunc
is performed pre source code compilation time, not at SQL execution time.
Use a case expression to evaluate the pre-conditions for your equality check.
Example:
data weight1;
input ID1 $ Week1 DOB1;
datalines;
1111 195 101
2222 220 102
3333 173 103
4444 135 104
;
data weight2;
infile datalines missover;
input ID2 $ Week2 DOB2;
datalines;
. 195 101
2222 220 102
3333 173 103
4444 135 104
;
proc sql;
create table want as
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1 a
join weight2 b
on case
when (a.id1 is not missing and b.id2 is not missing) then a.id1=b.id2
when (a.dob1 is not missing and b.dob2 is not missing) then a.dob1=b.dob2
else 0
end
;
NOTE: The case of real data where you have more than one row in each table with missing id and the same date of birth will cause a multiplicative row count effect in the result table. Such as this sample data:
1111 195 101
2222 220 102
3333 173 103
4444 135 104
. 145 105 ***
. 175 105 ***
and
. 195 101
2222 220 102
3333 173 103
4444 135 104
. 155 105 ***
. 166 105 ***
4 = 2 x 2 result rows from the starred data.
Upvotes: 1
Reputation: 217
If you trusted the join on DOBs as much as the one on IDs, you could simply do on a.ID1 = b.ID2 or a.DOB1 = b.DOB2
. But I assume the DOB-match could give some wrong results when IDs aren't missing.
You could first select all rows where you have both sets of IDs, and then select all rows where DOBs match and either ID is missing. Use a union
to get them into the same table. I can't test my code right now, but something like this:
proc sql;
create table final as
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1(where=(not missing(ID1)) a
inner join weight2(where=(not missing(ID2)) b
on a.ID1 = b.ID2
union
select a.ID1, a.DOB1, b.ID2,b.DOB2
from weight1 a
inner join weight2 b
on a.DOB1 = b.DOB2 and (missing(ID1) or missing(ID2))
;
Upvotes: 1