jcolebrand
jcolebrand

Reputation: 16025

SQL Join Ignore multiple matches (fuzzy results ok)

I don't even know what the name of my problem is called, so I'm just gonna put some sample data. I don't mind fuzzy results on this (this is the best way I can think to express it. I don't mind if I overlook some data, this is for approximated evaluation, not for detailed accounting, if that makes sense). But I do need every record in TABLE 1, and I would like to avoid the nulls case indicated below.

IS THIS POSSIBLE?

TABLE 1
acctnum sub fname  lname  phone
 12345   1   john   doe   xxx-xxx-xxxx
 12346   0   jane   doe   xxx-xxx-xxxx
 12347   0   rob    roy   xxx-xxx-xxxx
 12348   0   paul  smith  xxx-xxx-xxxx

TABLE 2
acctnum sub division
 12345   1   EAST
 12345   2   WEST
 12345   3   NORTH
 12346   1   TOP
 12346   2   BOTTOM
 12347   2   BALLOON
 12348   1   NORTH

So if we do a "regular outer" join, we'd get some results like this, since the sub 0's don't match the second table:

TABLE AFTER JOIN
acctnum sub fname  lname  phone         division
 12345   1   john   doe   xxx-xxx-xxxx   EAST
 12346   0   jane   doe   xxx-xxx-xxxx   null
 12347   0   rob    roy   xxx-xxx-xxxx   null
 12348   0   paul  smith  xxx-xxx-xxxx   null

But I would rather get

TABLE AFTER JOIN
acctnum sub fname  lname  phone         division
 12345   1   john   doe   xxx-xxx-xxxx   EAST
 12346   0   jane   doe   xxx-xxx-xxxx   TOP
 12347   0   rob    roy   xxx-xxx-xxxx   BALLOON
 12348   0   paul  smith  xxx-xxx-xxxx   NORTH

And I'm trying to avoid:

TABLE AFTER JOIN
acctnum sub fname  lname  phone         division
 12345   1   john   doe   xxx-xxx-xxxx   EAST
 12345   1   john   doe   xxx-xxx-xxxx   WEST
 12345   1   john   doe   xxx-xxx-xxxx   NORTH
 12346   0   jane   doe   xxx-xxx-xxxx   TOP
 12346   0   jane   doe   xxx-xxx-xxxx   BOTTOM
 12347   0   rob    roy   xxx-xxx-xxxx   BALOON
 12348   0   paul  smith  xxx-xxx-xxxx   NORTH

So I decided to go with using a union and two if conditions. I'll accept a null for conditions where the sub account is defined in table 1 but not in table 2, and for everything else, I'll just match against the min.

Upvotes: 4

Views: 2619

Answers (4)

Brock Adams
Brock Adams

Reputation: 93473

This will give your desired result, exactly (for the shown data):

Updated to not assume there is always a sub==1 value:

SELECT
    T1.acctnum,
    T1.sub,
    T1.fname,
    T1.lname,
    T1.phone,
    T2.division
FROM
    TABLE_1 T1
LEFT JOIN
    TABLE_2 T2  ON T1.acctnum = T2.acctnum
AND
    T2.sub  = (SELECT MIN(T3.sub) FROM TABLE_2 T3  WHERE T1.acctnum = T3.acctnum)
ORDER BY
    T1.lname,
    T1.fname,
    T1.acctnum

Upvotes: 1

Waleed A.K.
Waleed A.K.

Reputation: 1656

try to use

SELECT     MIN(Table_1.acctnum) as acctnum , MIN(Table_1.sub) as sub,MIN( Table_1.fname) as fname, MIN(Table_1.lname) as name, MIN(Table_1.phone) as phone, MIN(Table_2.division) as division 
FROM Table_1 INNER JOIN   Table_2 ON Table_1.acctnum = Table_2.acctnum AND Table_1.sub = Table_2.sub
where Table_1.sub>0
group by Table_1.acctnum 
union 
SELECT     MIN(Table_1.acctnum) as acctnum , MIN(Table_1.sub) as sub,MIN( Table_1.fname) as fname, MIN(Table_1.lname) as name, MIN(Table_1.phone) as phone, MIN(Table_2.division) as division 
FROM Table_1 INNER JOIN   Table_2 ON Table_1.acctnum = Table_2.acctnum 
where Table_1.sub=0
group by Table_1.acctnum

this is the result

12345   1   john        doe         xxxxxxxxxx  EAST      
12346   0   jane        doe         xxxxxxxxxx  BOTTOM    
12347   0   rob         roy         xxxxxxxxxx  BALLOON   
12348   0   paul        smith       xxxxxxxxxx  NORTH  

if you change min to max TOP will be insted of BOTTOM on the second row

Upvotes: 2

Adam Robinson
Adam Robinson

Reputation: 185643

If I'm understanding correctly, it looks like you're trying to join on the sub column if it matches. If there's no match on sub, then you want it to select the "first" row for that acctnum. Is this correct?

If so, you'll need to left join on the full match, then perform another left join on a select statement that determines the division that corresponds to the lowest sub value for that acctnum. The row_number() function can help you with this, like this:

select
    t1.acctnum, 
    t1.sub, 
    t1.fname, 
    t1.lname, 
    t1.phone, 
    isnull(t2_match.division, t2_first.division) as division

from table1 t1

left join table2 t2_match on t2_match.acctnum = t1.acctnum and t2_match.sub = t1.sub
left join 
(
    select 
        acctnum, 
        sub, 
        division,
        row_number() over (partition by acctnum order by sub) as rownum

    from table2
) t2_first on t2_first.acctnum = t1.acctnum

EDIT

If you don't care at all about which record you get back from table 2 when a matching sub doesn't exist, you could combine two different queries (one that matches the sub and one that just takes the min or max division) with a union.

select
    t1.acctnum, 
    t1.sub, 
    t1.fname, 
    t1.lname, 
    t1.phone, 
    t2.division

from table1 t1

join table2 t2 on t2.acctnum = t1.acctnum and t2.sub = t1.sub

union

select
    t1.acctnum, 
    t1.sub, 
    t1.fname, 
    t1.lname, 
    t1.phone, 
    min(t2.division)

from table1 t1

join table2 t2 on t2.acctnum = t1.acctnum
left join table2 t2_match on t2_match.acctnum = t1.acctnum and t2_match.sub = t1.sub

where t2_match.acctnum is null

Personally, I don't find the union syntax any more compelling and you now have to maintain the query in two places. For this reason, I'd favor the row_number() approach.

Upvotes: 3

a1ex07
a1ex07

Reputation: 37364

It may also work for you:

SELECT  t1.acctnum, t1.sub, t1.fname, t1.lname, t1.phone, 
ISNULL(MAX(t2.division),MAX(t3.division)) as division
FROM table_1 t1
LEFT JOIN table_2 t2 ON (t2.acctnum = t1.acctnum AND t1.sub = t2.sub)
LEFT JOIN table_2 t3 ON (t3.acctnum = t1.acctnum)
GROUP BY  t1.acctnum, t1.sub, t1.fname, t1.lname, t1.phone

Upvotes: 1

Related Questions