Reputation: 48
I have a query like:
SELECT foo_id, bar, timestamp, ROW_NUMBER() OVER (ORDER BY timestamp ASC)
FROM foo_table
WHERE (foo_id = '1' OR related_foo_id = '1')
AND foo_body -> 'type' = 'My_foo_type';
And I get this:
╔═════════╦══════════╦══════════════╦══════════════╦
║ foo_id ║ bar ║ timestamp ║ row_number ║
╠═════════╬══════════╬══════════════╬══════════════╬
║ 1 ║ 1 ║ 10 ║ 1 ║
║ 1 ║ 1 ║ 11 ║ 2 ║
║ 2 ║ 1 ║ 15 ║ 3 ║
║ 1 ║ 2 ║ 25 ║ 4 ║
║ 1 ║ 2 ║ 26 ║ 5 ║
╚═════════╩══════════╩══════════════╩══════════════╩
I want to group by 'foo_id' and 'bar' values to get something like this:
╔═════════╦══════════╦══════════════╦══════════════╦
║ foo_id ║ bar ║ timestamp ║ row_number ║
╠═════════╬══════════╬══════════════╬══════════════╬
║ 1 ║ 1 ║ 10 ║ 1 ║
║ 2 ║ 1 ║ 15 ║ 2 ║
║ 1 ║ 2 ║ 25 ║ 3 ║
╚═════════╩══════════╩══════════════╩══════════════╩
If I group by foo_id and bar, I have to drop the timestamp column, but I still need to order by it. I don't really care whether I get the first or the second row of each group, as long as (foo_id, bar) is unique in the result.
I tried ordering and grouping the result:
SELECT A.foo_id, A.bar, ROW_NUMBER() OVER ()
FROM (
SELECT foo_id, bar FROM foo_table
WHERE (foo_id = '1' OR related_foo_id = '1')
AND foo_body -> 'type' = 'My_foo_type'
ORDER BY timestamp ASC) A
GROUP BY foo_id, bar;
But it doesn't respect the order:
╔═════════╦══════════╦══════════════╦══════════════╦
║ foo_id ║ bar ║ timestamp ║ row_number ║
╠═════════╬══════════╬══════════════╬══════════════╬
║ 1 ║ 1 ║ 10 ║ 1 ║
║ 1 ║ 2 ║ 25 ║ 2 ║
║ 2 ║ 1 ║ 15 ║ 3 ║
╚═════════╩══════════╩══════════════╩══════════════╩
I also tried different joins, following the answers to other questions, but I didn't find the right one. I always get the same result as the first select.
Is there a way to get what I want without hurting performance too much? Thank you
Upvotes: 1
Views: 288
Reputation: 1270391
I think you want DISTINCT ON:
SELECT DISTINCT ON (foo_id, bar) foo_id, bar, timestamp, ROW_NUMBER() OVER (ORDER BY timestamp ASC)
FROM foo_table
WHERE (foo_id = '1' OR related_foo_id = '1') AND
foo_body -> 'type' = 'My_foo_type'
ORDER BY foo_id, bar, timestamp asc;
This will return one row per foo_id/bar combination: the one with the lowest timestamp. This is based on the ORDER BY and DISTINCT ON clauses.
DISTINCT ON is a Postgres extension that is quite handy and can be the most efficient way to execute such a query.
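One caveat, as far as I can tell: Postgres evaluates window functions before it applies DISTINCT ON, so a ROW_NUMBER() in the same SELECT is computed over all matching rows and can leave gaps (for example 1, 3, 4) once duplicates are dropped. If you need consecutive numbers in timestamp order, a minimal sketch is to deduplicate in a subquery and number the surviving rows outside it. The alias first_per_group is just an illustrative name, and the quoted '1' literals follow the question:

```sql
-- Sketch: pick one row per (foo_id, bar) first, then number the survivors.
SELECT foo_id,
       bar,
       timestamp,
       ROW_NUMBER() OVER (ORDER BY timestamp ASC) AS row_number
FROM (
    -- DISTINCT ON keeps the first row of each (foo_id, bar) group;
    -- "first" is decided by the inner ORDER BY, i.e. the lowest timestamp.
    SELECT DISTINCT ON (foo_id, bar) foo_id, bar, timestamp
    FROM foo_table
    WHERE (foo_id = '1' OR related_foo_id = '1')
      AND foo_body -> 'type' = 'My_foo_type'
    ORDER BY foo_id, bar, timestamp ASC
) AS first_per_group
-- The outer ORDER BY restores timestamp order for the final result.
ORDER BY timestamp ASC;
```

The inner ORDER BY has to start with the DISTINCT ON expressions, which is why the timestamp ordering is applied again in the outer query.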
Upvotes: 1