user2112153
user2112153

Reputation: 131

Efficiency of joining subqueries in SQL Server

I have a customers and orders table in SQL Server 2008 R2. Both have indexes on the customer id (called id). I need to return details about all customers in the customers table and information from the orders table, such as details of the first order.

I currently left join my customers table on a subquery of the orders table, with the subquery returning the information I need about the orders. For example:

SELECT c.id
        ,c.country      
        ,First_orders.product
        ,First_orders.order_id
FROM customers c

LEFT JOIN   SELECT( id, 
                    product 
            FROM (SELECT    id
                            ,product
                            ,order_id
                            ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No 
                        FROM orders) orders
            WHERE Order_no = 1) First_Orders
ON c.id = First_orders.id

I'm quite new to SQL and want to understand if I'm doing this efficiently. I end up left joining quite a few subqueries like this onto the customers table in one select query and it can take tens of minutes to run.

So am I doing this efficiently or can it be improved? For example, I'm not sure if my index on id in the orders table is of any use and maybe I could speed up the query by creating a temporary table of what is in the subquery first and creating a unique index on id in the temporary table so SQL Server knows id is now a unique column and then joining my customers table to this temporary table? I typically have one or two million rows in the customers and orders tables.

Many thanks in advance!

Upvotes: 12

Views: 73102

Answers (3)

Utsav Anand
Utsav Anand

Reputation: 11

Try this...its easy to understand

;WITH cte
AS (
    SELECT id
        ,product
        ,order_id
        ,ROW_NUMBER() OVER (
            PARTITION BY id ORDER BY Order_Date ASC
            ) AS order_No
    FROM orders
    )
SELECT c.id
    ,c.country
    ,c1.Product
    ,c1.order_id
FROM customers c
INNER JOIN cte c1 ON c.id = c1.id
WHERE c1.order_No = 1

Upvotes: 1

sgeddes
sgeddes

Reputation: 62831

You can remove one of your subqueries to make it a little more efficient:

SELECT c.id
        ,c.country      
        ,First_orders.product
        ,First_orders.order_id
FROM customers c
   LEFT JOIN  (SELECT id
                    ,product
                    ,order_id
                    ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No 
               FROM orders) First_Orders
     ON c.id = First_orders.id AND First_Orders.order_No = 1

In your above query, you need to be careful where you place your parentheses as I don't think it will work. Also, you're returning product in your results, but not including in your nested subquery.

Upvotes: 9

Gordon Linoff
Gordon Linoff

Reputation: 1269493

For someone who is just learning SQL, your query looks pretty good.

The index on customers may or may not be used for the query -- you would need to look at the execution plan. An index on orders(id, order_date) could be used quite effectively for the row_number function.

One comment is on the naming of fields. The field orders.id should not be the customer id. That should be something like 'orders.Customer_Id`. Keeping the naming system consistent across tables will help you in the future.

Upvotes: 4

Related Questions