Reputation: 2716
Sample data from my table test_table
:
date symbol value created_time
2010-01-09 symbol1 101 3847474847
2010-01-10 symbol1 102 3847474847
2010-01-10 symbol1 102.5 3847475500
2010-01-10 symbol2 204 3847474847
2010-01-11 symbol1 109 3847474847
2010-01-12 symbol1 105 3847474847
2010-01-12 symbol2 206 3847474847
Given the table above, I am trying to find the optimal index to put on the table (date, symbol, value and created_time should combined be unique) and the query to go along with it to return the following:
date symbol value created_time
2010-01-09 symbol1 101 3847474847
2010-01-10 symbol1 102.5 3847475500
2010-01-10 symbol2 204 3847474847
2010-01-11 symbol1 109 3847474847
2010-01-12 symbol1 105 3847474847
2010-01-12 symbol2 206 3847474847
I am looking for date, symbol, value columns of data for each group of those three with the maximum created_time column (essentially row 1, 3, 4, 5, 6, 7 in the example above returned).
Currently I have tried this index...
CREATE UNIQUE INDEX "test_table_date_symbol_value_created_time"
ON "test_table" USING btree (date, symbol, value, created_time)
And am using this query. Not sure if it is the most effective way, it still seems pretty slow.
select *
from(
select date,
symbol,
value,
created_time,
max(created_time) over (partition by date, symbol) as max_created_time
from "test_table"
) t
where symbol in ('symbol1', 'symbol2') and created_time = max_created_time
Upvotes: 1
Views: 2677
Reputation: 425418
Postgres supports window functions that suit this situation:
select date, symbol, value, created_time
from (select *,
rank() over (partition by date, symbol order by created_time desc) as rownum
from test_table) x
where rownum = 1
For every combination of date
, symbol
, this query returns the value
and created_time
from the row with the highest (ie last) created_time
of that date
and symbol
.
I would suggest this index:
CREATE UNIQUE INDEX test_table_idx
ON test_table (date, symbol, created_time, value)
It's a covering index (has all values you need for the query, obviating the need to access the actual table, and which you already had), but note that created_time
comes before value
, so data is already in its partition order, and value
is the least important attribute, because it doesn't participate in any determination of which row to return.
Upvotes: 1
Reputation: 125534
Postgresql has the very handy distinct on
select distinct on (symbol, date) *
from t
order by symbol, date, created_time desc
https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT
Upvotes: 1
Reputation: 44250
Alternative:
SELECT * FROM test_table tt
WHERE NOT EXISTS (
SELECT * FROM test_table nx
WHERE nx."date" = tt."date"
AND nx.symbol = tt.symbol
AND nx.created_time > tt.created_time
);
Upvotes: 0