Stephanus DJ
Stephanus DJ

Reputation: 109

How to filter or split a CTE so that 2 rows are not added with the same value in a specific column

So the title sounds convoluted because my problem kinda is.

I have a CTE that pulls in some values (LineId, OrderNumber, OrderLine, Type, BuildUsed)

Later on a have a Select that populates a view that does a join on the CTE with something like this

left join CTE C on C.LineId = (select top 1 lineId from CTE C2 where C2.orderNumber = orderNumber and C2.orderLine = orderLine order by LineId

An example of my data would look like

LineId = 10, Order : OIP001, Line = 1, Type = Active, BuildUsed = XE9

LineId = 80, Order : OIP001, Line = 1, Type = Inactive, BuildUsed = XB2

The CTE does a Select, Union, Select. The first select gets all the active entries and the 2nd select gets all the inactive entries.

Any given order could have both active or inactive or just 1 of them.

The issue I am having is that my runtime is bad. It runs in close to 20 seconds when it should be like 4 or 5. The issue is that the join I listed above has to search and order every time and its a huge time sink.

So i thought if there was a way to basically break the CTE into 2 steps.

  1. Insert all the active orders (These are the ones that I would want to pick if they are available)
  2. Insert all the inactive orders (If that ordernumber and orderline does not already exist in the first step)

That way I don't have to order and sort every single join but I can just do a normal join thats significantly faster.

If it helps at all the LineId is based on a rownumber() in the CTE that looks like

ROW_NUMBER() OVER(ORDER BY Type desc, DescriptionStatus asc) as LineId

So the LineId is already ordered correctly.

Is there any way to split the CTE so that my 2nd part of the select can check if the ordernumber and orderline alraedy exists in the first part?

To specify. I would like to find any Active entries for the ordernumber and orderline first and then if none are found, try the inactive entries.

WHAT I HAVE TRIED SO FAR :

I tried adding the query for the 2nd part into the first part as a where clause. So it would only add where it wouldn't exist in the first part. But the time of the query got so insane I just stopped running it and scrapped that idea.

Upvotes: 0

Views: 54

Answers (1)

Eric Brandt
Eric Brandt

Reputation: 8101

I believe you're just looking for a WHERE NOT EXISTS that uses a correlated sub-query to eliminate rows from your second result set that you've already retrieved in your first result set.

WHERE NOT EXISTS is generally pretty performant, but test the CTE by itself to be sure it meets your needs.

Something similar to this:

WITH cte
  AS
  (
    SELECT 
      act.LineID, 
      act.OrderNumber, 
      act.OrderLine, 
      act.Type, 
      act.BuildUsed 
    FROM 
      ActiveSource AS act
    UNION ALL
    SELECT
      inact.LineID
     ,inact.OrderNumber
     ,inact.OrderLine
     ,inact.Type
     ,inact.BuildUsed
    FROM
      InactiveSource AS inact
    WHERE
      NOT EXISTS
      (
        SELECT
          1
        FROM
          ActiveSource AS a
        WHERE
          a.OrderNumber = inact.OrderNumber
          AND a.OrderLine = inact.OrderLine
      )
  )
SELECT * FROM cte;

Upvotes: 1

Related Questions