Trevor Nederlof
Trevor Nederlof

Reputation: 2716

Return rows that are max of one column in Postgresql

Sample data from my table test_table:

date           symbol      value      created_time
2010-01-09     symbol1     101        3847474847
2010-01-10     symbol1     102        3847474847
2010-01-10     symbol1     102.5      3847475500
2010-01-10     symbol2     204        3847474847
2010-01-11     symbol1     109        3847474847
2010-01-12     symbol1     105        3847474847
2010-01-12     symbol2     206        3847474847

Given the table above, I am trying to find the optimal index to put on the table (date, symbol, value and created_time should combined be unique) and the query to go along with it to return the following:

date           symbol      value      created_time
2010-01-09     symbol1     101        3847474847
2010-01-10     symbol1     102.5      3847475500
2010-01-10     symbol2     204        3847474847
2010-01-11     symbol1     109        3847474847
2010-01-12     symbol1     105        3847474847
2010-01-12     symbol2     206        3847474847

I am looking for date, symbol, value columns of data for each group of those three with the maximum created_time column (essentially row 1, 3, 4, 5, 6, 7 in the example above returned).

Currently I have tried this index...

CREATE UNIQUE INDEX "test_table_date_symbol_value_created_time" 
    ON "test_table" USING btree (date, symbol, value, created_time)

And am using this query. Not sure if it is the most effective way, it still seems pretty slow.

select *
  from(
    select date,
           symbol,
           value,
           created_time,
           max(created_time) over (partition by date, symbol) as max_created_time
     from "test_table"
  ) t
where symbol in ('symbol1', 'symbol2') and created_time = max_created_time

Upvotes: 1

Views: 2677

Answers (3)

Bohemian
Bohemian

Reputation: 425418

Postgres supports window functions that suit this situation:

select date, symbol, value, created_time
from (select *,
    rank() over (partition by date, symbol order by created_time desc) as rownum
  from test_table) x
where rownum = 1

For every combination of date, symbol, this query returns the value and created_time from the row with the highest (ie last) created_time of that date and symbol.

I would suggest this index:

CREATE UNIQUE INDEX test_table_idx 
  ON test_table (date, symbol, created_time, value)

It's a covering index (has all values you need for the query, obviating the need to access the actual table, and which you already had), but note that created_time comes before value, so data is already in its partition order, and value is the least important attribute, because it doesn't participate in any determination of which row to return.

Upvotes: 1

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125534

Postgresql has the very handy distinct on

select distinct on (symbol, date) *
from t
order by symbol, date, created_time desc

https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT

Upvotes: 1

wildplasser
wildplasser

Reputation: 44250

Alternative:


SELECT * FROM test_table tt
WHERE NOT EXISTS (
    SELECT * FROM test_table nx
    WHERE nx."date" = tt."date"
    AND nx.symbol = tt.symbol
    AND nx.created_time > tt.created_time
    );

Upvotes: 0

Related Questions