Reputation: 15
I know in (select...) stuff should be avoided but in my situation I couldn't find another way to do it.
Unfortunately, as the database is huge, I would like to find a better way. So could you please share your experience to optimise the query below ?
Here is an idea of the db
ID OrderRef Product OrderDate ShipmentDate Client
1 111 T-Shirt 1/1/2018 4/1/2018 Georges
2 111 Pull-Over 1/1/2018 6/1/2018 (Unknown)
3 222 Shoes 9/1/2018 15/1/2018 Austin
4 222 T-Shirt 9/1/2018 18/1/2018 (Unknown)
What I need to retrieve is :
Here is my code
SELECT t1.OrderRef, t1.Product, t1.OrderDate, t1.Client, t4.max_date
FROM table1 as t1
RIGHT JOIN (SELECT t2.OrderRef, max(t2.ShipmentDate) as max_date
FROM table1 as t2
WHERE t2.OrderRef in(
SELECT t3.OrderRef
FROM table1 as t3
WHERE t3.Client='Georges')
GROUP BY t2.OrderRef) as t4 on t2.OrderRef=t1.OrderRef
The right join is there to retrieve only the OrderReference linked to Georges, and the subquery to calculate the the Latest ShipmentDate
Thanks in advance for your help, Largo
Upvotes: 0
Views: 69
Reputation: 37477
I'd try to use (correlated) subqueries.
SELECT t11.orderref,
t11.product,
t11.orderdate,
t11.client,
(SELECT max(t13.shipmentdate)
FROM table1 t13
WHERE t13.orderref = t11.orderref) max_date
FROM table1 t11
WHERE t11.orderref = (SELECT t12.orderref
FROM table1 t12
WHERE t12.client = 'Georges');
To support the subquery getting the orderref
an index like
CREATE INDEX table1_c_or
ON table1 (client ASC,
orderref ASC);
should help. That subquery is "pointy" anyway.
For getting the maximum shipmentdate
CREATE INDEX table1_or_sd
ON table1 (orderref ASC,
shipmentdate DESC);
should help. The optimizer should notice that the maximum needs to be retrieved only once as the orderref
is always the same. That index can also be used to support the outer query as orderref
is its first key.
Optionally creating an additional index like
CREATE INDEX table1_or_p_od_c
ON table1 (orderref ASC,
product ASC,
orderdate ASC,
client ASC);
might even better support the outer query. But that's not a necessity I guess.
Upvotes: 0
Reputation: 48850
You seem to be retrieving a few rows only on each call. This should be pretty fast even if you have billions of rows. I would do something like:
with
a as (
select max(orderref) as orderref
from t
where client = 'Georges'
),
b as (
select t.*
from t
join a on a.orderref = t.orderref
),
c as (
select max(shipmentdate) as maxshipmentdate from b
)
select b.*, c.maxshipmentdate
from b
cross join c
The query above should be pretty fast is you have the following indexes:
create index ix1 on t (client);
create index ix2 on t (orderref);
Upvotes: 2