Reputation: 76057
I have some tables that are more or less like this (I'll use a simpler domain so the explanation is clearer):
Trades
----------
ID
Seller_ID
Trade_Date
Sellers
-------
ID
Department_ID
And I want to get the latest trade made by each department. The query is something like:
SELECT Sellers.Department_ID, MAX(Trades.Trade_Date)
FROM Trades
JOIN Sellers ON Trades.Seller_ID = Sellers.ID
GROUP BY Sellers.Department_ID
The Trades table has an index on the date, so it can be used to speed up these queries, but I noticed that the query runs fast for some departments (hardcoding the ID) and very slowly for others.
I've deduced that this happens because of the huge difference in trade volume between departments. The database is doing a sequential scan over the sorted index to find the first occurrence, and for departments whose latest sale happened a long time ago it has to walk very far through the index.
My current solution is to store the latest query results in an auxiliary table and make the new queries incremental (filtering by the latest date already present in the auxiliary table). This solves the problem: the query runs very frequently, so the index scan now only needs to cover a few seconds' worth of trades.
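For concreteness, a minimal sketch of that auxiliary-table approach (the table and column names are made up, and the upsert syntax assumes PostgreSQL 9.5+ for ON CONFLICT):

CREATE TABLE Department_Latest_Trade (
    Department_ID     integer PRIMARY KEY,
    Latest_Trade_Date timestamp NOT NULL
);

-- Incremental refresh: only scan trades newer than anything already recorded.
INSERT INTO Department_Latest_Trade (Department_ID, Latest_Trade_Date)
SELECT Sellers.Department_ID, MAX(Trades.Trade_Date)
FROM Trades
JOIN Sellers ON Trades.Seller_ID = Sellers.ID
WHERE Trades.Trade_Date > (SELECT COALESCE(MAX(Latest_Trade_Date),
                                           '-infinity'::timestamp)
                           FROM Department_Latest_Trade)
GROUP BY Sellers.Department_ID
ON CONFLICT (Department_ID)
    DO UPDATE SET Latest_Trade_Date = EXCLUDED.Latest_Trade_Date;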
But I think there should be a more elegant solution to this. I know that if the aggregation were made by Seller rather than Department, a compound index would definitely help, but I don't think it is possible to build indexes that span different tables...
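This is the per-seller index I mean (the index name is just illustrative); the problem is that there is no way to build its cross-table equivalent covering Sellers.Department_ID and Trades.Trade_Date:

-- Lets MAX(Trade_Date) per seller be answered with a direct index lookup
-- instead of a scan of the date-sorted index.
CREATE INDEX idx_trades_seller_date ON Trades (Seller_ID, Trade_Date);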
Upvotes: 0
Views: 198
Reputation: 1714
Have you considered using a materialized view, or, if you use Postgres, building something like that on your own? If the insert operations are not as time-critical as the selects, then I would consider going this way.
There is an article about materialized views in Postgres:
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
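As a sketch, assuming PostgreSQL 9.3+ where materialized views are built in (the article above covers rolling your own on older versions):

CREATE MATERIALIZED VIEW department_latest_trade AS
SELECT Sellers.Department_ID, MAX(Trades.Trade_Date) AS latest_trade_date
FROM Trades
JOIN Sellers ON Trades.Seller_ID = Sellers.ID
GROUP BY Sellers.Department_ID;

-- Refresh as often as the data is allowed to be stale; inserts into Trades
-- stay cheap, and selects hit the small precomputed table.
REFRESH MATERIALIZED VIEW department_latest_trade;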
Upvotes: 1