user1481397
user1481397

Reputation: 433

SAS, where condition is null

I am confused by the code below, however it does run through and returns output with acct_nb and CashAmount. If b.acct_nb is null, how can we do the left join a.acct_nb=b.acct_nb? Why instead of all null account stay, the output data NewData has accounts with acct_nb. It is not my code and I cannot understand what it is doing.

 proc sql;
    create table NewData as select
    a.acct_nb, a.CashAmount
    from dataA as a left join dataB as b 
    on a.acct_nb=b.acct_nb 
    where b.acct_nb is null order by acct_nb;
    quit;

Upvotes: 1

Views: 10980

Answers (1)

DomPazz
DomPazz

Reputation: 12465

The key is that this is a LEFT JOIN. That means all rows from table A, and any rows that match from B.

http://support.sas.com/documentation/cdl/en/sqlproc/69049/HTML/default/viewer.htm#p0o4a5ac71mcchn1kc1zhxdnm139.htm

After that, it applies the WHERE condition b.acct_nb is null.

So there are 2 cases where b.acct_nb is null.

  1. If dataB has a record with a null value in acct_nb.
  2. If the acct_nb in dataA does not have a corresponding record in dataB.

Here is a simple example:

data dataA;
input acct_nb cashamount;
datalines;
1 1
2 3
. 4
5 .
;

data dataB;
input acct_nb;
datalines;
1
.
5
;

proc sql;
select
    a.acct_nb, a.CashAmount
  from dataA as a left join dataB as b 
    on a.acct_nb=b.acct_nb 
  where b.acct_nb is null 
  order by acct_nb;
quit;

This returns

                              acct_nb  cashamount
                             ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
                                    .           4
                                    2           3

The first record is there because a.acct_nb is missing and matches with a record in b.

The second record is there because acct_nb=2 does not exist in dataB.

If acct_nb is never null, then this is a way to find records in dataA that are NOT in dataB.

Upvotes: 3

Related Questions