Reputation: 941

Joining on non unique columns

Table 1

Table 2

table1.col1 is a FK to table2.col2

You want to duplicate values where col2 = A, and have col2 = AA :

1 A  1 1
2 A  1 2
5 A  1 1
6 B  2 1
7 AA 1 1   <- New
8 AA 1 2   <- New
9 AA 1 1   <- New

How do you join Table 2 to the new resultset such that values that existed for A also exist for AA?

Result wanted:

1 A  1 1 | 1 1 12
2 A  1 2 | 2 2 45
5 A  1 1 | 3 5 22
6 B  2 1 | 4 6 21
7 AA 1 1 | 1 1 12
8 AA 1 2 | 2 2 45
9 AA 1 1 | 3 5 22

Upvotes: 0

Answers (3)

gadaju

Reputation: 416

consider each value A/B/AA in isolation and use window functions to find lag lead on col3 and col4

treat each "prev_col3, col3, next_col3, prev_col4, col4, next_col4" as a unique "context" identifier and join on that. this is how we can avoid confusing row 7 with row 9 in the data; they have distinct prev/next lag/lead values for col3 and col4.

We need to control for null cases (I made null into -1) for the joins to work.

You can copy/paste this into SQL server to see it work:

CREATE TABLE #TABLE1 (col1 INT, col2 varchar(5), col3 INT, col4 INT)
CREATE TABLE #TABLE2 (col1 INT, col2 INT, col3 INT)

INSERT INTO #TABLE1
select 1 col1,'A' col2, 1 col3, 1 col4 union
select 2 col1,'A' col2, 1 col3, 2 col4 union
select 5 col1,'A' col2, 1 col3, 1 col4 union
select 6 col1,'B' col2, 2 col3, 1 col4 union
select 7 col1,'AA' col2, 1 col3, 1 col4 union
select 8 col1,'AA' col2, 1 col3, 2 col4 union
select 9 col1,'AA' col2, 1 col3, 1 col4 

INSERT INTO #TABLE2
select 1 col1, 1 col2, 12 col3 union
select 2 col1,2 col2, 45 col3 union
select 3 col1,5 col2, 22 col3 union
select 4 col1,6 col2, 21 col3


select 
    Bu.col1, bu.col2, bu.col3, bu.col4, t2.col1, t2.col2, t2.col3
from 
   (
    select 
       col1, col2,
       lag(col3) over (order by col1 asc) prev_col3,
       col3,
       lead(col3) over (order by col1 asc) next_col3,
       lag(col4) over (order by col1 asc) prev_col4,
       col4,
       lead(col4) over (order by col1 asc) next_col4
    from 
       #TABLE1 t1 where col2 in ('A')
    ) A   
    join 
   ( /*bu big union*/
    select 
       col1, col2,
       lag(col3) over (order by col1 asc) prev_col3,
       col3,
       lead(col3) over (order by col1 asc) next_col3,
       lag(col4) over (order by col1 asc) prev_col4,
       col4,
       lead(col4) over (order by col1 asc) next_col4
    from 
       #TABLE1 t1 where col2 in ('A')
    UNION
    select 
       col1, col2,
       lag(col3) over (order by col1 asc) prev_col3,
       col3,
       lead(col3) over (order by col1 asc) next_col3,
       lag(col4) over (order by col1 asc) prev_col4,
       col4,
       lead(col4) over (order by col1 asc) next_col4
    from 
       #TABLE1 t1 where col2 in ('AA')
    ) bu
    on 
   ( 
    a.col3 = bu.col3 and isnull(a.prev_col3,-1) = isnull(bu.prev_col3,-1) and 
    isnull(a.next_col3,-1) = isnull(bu.next_col3,-1) and 
    a.col4 = bu.col4 and isnull(a.prev_col4,-1) = isnull(bu.prev_col4,-1) and 
    isnull(a.next_col4,-1) = isnull(bu.next_col4 ,-1)
    )
    join
    #TABLE2 t2
    on 
    a.col1 = t2.col2
    UNION
    select 
       t1.col1, t1.col2, t1.col3, t1.col4,
       t2.col1, t2.col2, t2.col3
    from 
       #TABLE1 t1 
       join #TABLE2 t2 on t1.col1 = t2.col2
    where t1.col2 = 'B'
    order by 1 asc

drop table #TABLE1
drop table #TABLE2

Upvotes: 0

JBrooks

Reputation: 10013

    declare @maxCol1 int

    select @maxCol1 = max(col1)
    from table1


    select a.col1,
    a.col2,
    a.col3,
    a.col4,
    b.col1,
    b.col2,
    b.col3
    from table1 a
    join table2 b
    on a.col1 = b.col2

    union all

    select a.col1 + @maxCol1 as col1,
    a.col2 + a.col2 as col2,
    a.col3,
    a.col4,
    b.col1,
    b.col2,
    b.col3
    from table1 a
    join table2 b
    on a.col1 = b.col2
    where a.col2 = 'A'

Upvotes: 0

xQbert

Reputation: 35343

Untested:

The logic seems solid, but I don't have data, test environment to try this... You could use a union and (inline view or a common table expression) to accomplish this.

First we build table1 with both sets of desired data (inline view A below). This approach makes the join simple. This is accomplished using a union statement and hard coding AA value while limiting the set to only A's, then unioning in the base set.

We then join back to table2 as normal.

I used row_number() and over order by col 2 to identify the individual values to increment the max ID by. 1 for first row of a 2 for second row of a and 3 for 3rd row of a based on a seed of 6 which is the max value in table1 for.

I used parent_ID to always identify the related record to join to table2.

Inline view

Select *  --(though you should spell out desired columns)
from (Select ROW_NUMBER() OVER(ORDER BY Col2)+C.mID, 'AA', col3, col4, col1 as Parent_ID
      from table1
      CROSS JOIN (select max(col1) mID from table1) C  
      where table1.col2 = 'A'
  record 
      UNION ALL
      Select col1, Col2, col3, col4, col1 as Parent_ID
      from table1)  A

INNER JOIN table2
 on table2.col2 = A.parent_ID

CTE:

With cte as (Select  ROW_NUMBER() OVER(ORDER BY Col2)+C.mID col1, 'AA' col2, col3, col4, col1 as Parent_Id
    FROM table1
    CROSS JOIN (select max(col1) mID from table1) C
    WHERE table1.col2 = 'A'
    UNION ALL
    SELECT col1, Col2, col3, col4, col1 as Parent_Id
    from table1) 

 SELECT * --(though you should spell out desired columns)
 FROM cte
 INNER JOIN table2
   on table2.col2 = cte.Parent_Id

Upvotes: 1

Joining on non unique columns

Answers (3)

Related Questions