Reputation: 327
+-------+----------------------+----------+------------------+
| isbn | book_container_id | shelf_id | update_time |
+-------+----------------------+----------+------------------+
| 555 | 6 | shelf100 | 11/15/2015 19:10 |
| 123 | 1 | shelf1 | 11/28/2015 8:00 |
| 555 | 4 | shelf5 | 11/28/2015 9:10 |
| 212 | 2 | shelf2 | 11/29/2015 8:10 |
| 555 | 6 | shelf9 | 11/30/2015 22:10 |
| 321 | 8 | shelf7 | 11/30/2015 8:10 |
| 555 | 4 | shelf33 | 12/1/2015 7:00 |
+-------+----------------------+----------+------------------+
Let's say I have a table (PostgreSQL) like the above called bookshelf_configuration
. If I'm given an ISBN and a timestamp, I want to be able to find the closest (before only) records for each unique combination of isbn
and book_container_id
.
So if I'm looking at isbn
'555', with a timestamp of '12/1/2015 7:00', I should get back:
+-------+----------------------+----------+------------------+
| isbn | book_container_id | shelf_id | update_time |
+-------+----------------------+----------+------------------+
| 555 | 6 | shelf9 | 11/30/2015 22:10 |
| 555 | 4 | shelf33 | 12/1/2015 7:00 |
+-------+----------------------+----------+------------------+
My knowledge of SQL is extremely basic. I've got a query that would work if I only had to factor in isbn, but I need some help understanding how to do this for the combination (isbn, book_container_id)
.
Upvotes: 1
Views: 1361
Reputation: 656882
A typical use case for DISTINCT ON
:
SELECT DISTINCT ON (book_container_id)
isbn, book_container_id, shelf_id, update_time
FROM bookshelf_configuration
WHERE isbn = 555
AND update_time <= '2015-12-01 07:00' -- ISO 8601 format
ORDER BY book_container_id, update_time DESC;
Assuming update_time
is defined NOT NULL
, or you have to add NULLS LAST
. Detailed explanation:
Depending on cardinalities and value frequencies there may be even faster query styles:
Either way, a multicolumn index on (isbn, book_container_id, update_time DESC)
is the key to make this fast for tables of non-trivial size. Sort order should match the query (or be it's complete inversion). If you add NULLS LAST
to the query, add it to the index as well.
Aside: It's better to use ISO 8601 format for all date / time constants, since that is unambiguous with any locale or datestyle setting. Related:
Upvotes: 2
Reputation: 13949
There is a thing called Row_Number
that can help you here.
Select *
From (
Select *,
row_number() OVER (partition by isbn, book_container_id order by update_time desc) rn
From bookshelf_configuration
Where isbn = 555 and update_time <= '12/1/2015 7:00'
) q
Where q.rn = 1
Upvotes: 2