Reputation: 2241
I have a very simple query that is taking way to long to return the top 20 rows.
SELECT o.OrderId,
p.PaymentDate
FROM dbo.Orders o -- 6 million records
LEFT JOIN dbo.Payments p ON p.OrderId = o.OrderId -- 3.5 million records
WHERE o.IsDeleted = 0 -- There is an index on this column.
ORDER BY p.PaymentDate DESC -- There is an index on this column and on OrderId
OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY
The execution plan shows an index scan on both Orders and Payments table with 6 million rows read from Orders and 3 million rows read from the payments table.
It is taking around 4-5 seconds to return 20 rows.
How can I make this query return result faster?
Upvotes: 1
Views: 588
Reputation: 1269743
You are sorting by the payment date -- so it wuld seem that you need a match. Hence, an inner join
is sufficient.
In addition, OrderId
is in the payments table. So, you can try this version:
SELECT p.OrderId, p.PaymentDate
FROM dbo.Payments p
WHERE EXISTS (SELECT 1
FROM dbo.Orders o
WHERE p.OrderId = o.OrderId AND
o.IsDeleted = 0
)
ORDER BY p.PaymentDate DESC
OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY;
Then for this query, you want an indexes on:
Payments(PaymentDate DESC, OrderId)
Orders(OrderId, IsDeleted)
.Upvotes: 1
Reputation: 74605
You're sorting a huge dataset just to retrieve 20 rows
If this is a typical order system then IsDeleted=0 probably doesn't help much, because what percentage of orders would be deleted? 0.1%? Excluding them makes little difference to the number of orders you're sorting
I'd put a paymentdate criteria in there WHERE too, cut the record count down a bit so that a much smaller set is being sorted. The logic you're after at the moment is that you want 20 records, but you don't need to sort 6 million records to get your 20. Cutting it down to 20 would do. This is clearly a busy ordering system, and probably at least 20 orders a day are being submitted so you may find you get your needed 20 by selecting only records with a paymentdate since one day ago, but pick a good number of days to give your 20, and maybe even implement some logic where your front end app increases the number of days if you don't get your 20 first time round. You can cache this lookback value in the front end and decrease it again as the days go by, so it becomes an automatically derived figure
The other issue I think you have is that you've made it a LEFT JOIN, yet I can't see the sense in doing so- you're after the most recently paid for orders, right? And there is no payment record for unpaid orders? Whatever the reason why there are only 3.5 million payments for 6 million orders, you don't seem to be interested in the orders that have no payment record, so I can't see the sense in having it be a left join that includes and sorts 2.5 million records you definitely don't want. Make it an inner join
Upvotes: 2