Reputation: 433
I am confused by the code below, however it does run through and returns output with acct_nb and CashAmount. If b.acct_nb is null, how can we do the left join a.acct_nb=b.acct_nb? Why instead of all null account stay, the output data NewData has accounts with acct_nb. It is not my code and I cannot understand what it is doing.
proc sql;
create table NewData as select
a.acct_nb, a.CashAmount
from dataA as a left join dataB as b
on a.acct_nb=b.acct_nb
where b.acct_nb is null order by acct_nb;
quit;
Upvotes: 1
Views: 10980
Reputation: 12465
The key is that this is a LEFT JOIN. That means all rows from table A, and any rows that match from B.
After that, it applies the WHERE condition b.acct_nb is null
.
So there are 2 cases where b.acct_nb is null
.
Here is a simple example:
data dataA;
input acct_nb cashamount;
datalines;
1 1
2 3
. 4
5 .
;
data dataB;
input acct_nb;
datalines;
1
.
5
;
proc sql;
select
a.acct_nb, a.CashAmount
from dataA as a left join dataB as b
on a.acct_nb=b.acct_nb
where b.acct_nb is null
order by acct_nb;
quit;
This returns
acct_nb cashamount
ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
. 4
2 3
The first record is there because a.acct_nb is missing and matches with a record in b.
The second record is there because acct_nb=2 does not exist in dataB.
If acct_nb is never null, then this is a way to find records in dataA that are NOT in dataB.
Upvotes: 3