MetaStack
MetaStack

Reputation: 3696

SQL join on CTE with multiple

I have to combine two queries:

Query 1 - a simple inner join where query:

Select t1.c1, t1.c2, t1.c3, t2.c1

from s1.t2 t2
inner join s1.t1 t1 on t2.c6 = t1.c6

where t2.c5 >= '2014-01-01'
and t2.c5 >= '2014-01-01'
and t1.c4 = 'P' 

This returns a table with 4 columns of course. let me draw it out here:

+----+----+----+-------+
| C1 | C2 | C3 | T2.C1 |
+----+----+----+-------+
|    |    |    |       |

Query 2 - With, then remove duplicates with a "partition by" statement:

WITH all_col1 (ONE, TWO) AS
(
    SELECT col1, col2
    FROM Schema1.Table1
    UNION
    SELECT col1, col2
    FROM Schema2.Table2
    UNION
    SELECT col1, col2
    FROM Schema2.Table3
)
SELECT *
FROM (
    SELECT  ONE, TWO,
    ROW_NUMBER() OVER(PARTITION BY ONE ORDER BY ONE) duplicate_count
    FROM all_col1
) all_col1_w_duplicat_count
WHERE duplicate_count = 1 

This returns two columns, with all the duplicates removed, (meaning, if column 1 has duplicates the extra rows are removed explicitly):

+-----+-----+
| ONE | TWO |
+-----+-----+
|     |     |

So what I'd like to see is the first query and the second query combined like this where TWO is added to the table where ONE matches C1:

+------+----+----+-------+-----+
| C1   | C2 | C3 | T2.C1 | TWO |
|(ONE) |    |    |       |     |
+------+----+----+-------+-----+
|      |    |    |       |     |

So, I've tried INTERSECT, adding placeholder rows. that returned empty.

I tried making query 1 a CTE as well then joining the two - got a myriad of errors trying that.

Here's the best I have been able to come up with so far - joining query 1 in with query 2 in its select statement:

--ALL COL1 WITH beginning 

WITH all_col1 (ONE, TWO) AS
(
    SELECT col1, col2
    FROM Schema1.Table1
    UNION
    SELECT col1, col2
    FROM Schema2.Table2
    UNION
    SELECT col1, col2
    FROM Schema2.Table3
)


--ALL COL1 SELECT duplicate_count

SELECT  *
FROM (
    SELECT  ONE, TWO,
    ROW_NUMBER() OVER(PARTITION BY ONE ORDER BY ONE) duplicate_count
    FROM all_col1


    -- added join opperation

    join (


        --SELECT from query 1

        Select t1.c1, t1.c2, t1.c3, t2.c1

        from s1.t2 t2
        inner join s1.t1 t1 on t2.c6 = t5.c6

        where t2.c5 >= '2014-01-01'
        and t2.c5 >= '2014-01-01'
        and t1.c4 = 'P' 


    -- finish join opperation

    ) as query1
    on all_col1.ONE = query1.c1    --  <----MOST IMPORTANT PART


-- finsih ALL COL1

) all_col1_w_duplicat_count
WHERE duplicate_count = 1 

This, unfortunately, doesn't give me the results joined on the first column of each query as was trying to ask with on all_col1.ONE = t1.c1

Instead, it seems like it just gives me the WITH all_col1.

I know almost nothing about sql, What do I need to learn in order to join these two queries together correctly?

Upvotes: 1

Views: 5699

Answers (1)

Parfait
Parfait

Reputation: 107652

Consider joining by two derived tables (subqueries in FROM and JOIN clauses) as your duplicate count includes both tables and not just one:

...
SELECT q.c1, q.c2, q.c3, q.t2_c1, a.[TWO]
FROM
  (
   SELECT *
   FROM (
          SELECT  ONE, TWO,
          ROW_NUMBER() OVER(PARTITION BY ONE ORDER BY ONE) duplicate_count
          FROM all_col1
        ) sub
   WHERE sub.duplicate_count = 1 
  ) a

INNER JOIN 
  (
   SELECT t1.c1, t1.c2, t1.c3, t2.c1 as t2_c1
   FROM s1.t2 t2
   INNER JOIN s1.t1 t1 on t2.c6 = t5.c6
   WHERE t2.c5 >= '2014-01-01'
    AND t2.c5 >= '2014-01-01'
    AND t1.c4 = 'P' 
  ) q

ON a.ONE = q.c1

Upvotes: 1

Related Questions