darkpool
darkpool

Reputation: 14641

Using WHERE clause and DISTINCT ON

I have the following two postgresql tables:

table: daily

id   date         close   symbol_id
1    2016-05-01   80      65
2    2016-05-01   75      67
3    2016-05-01   95      45
4    2016-05-02   11      65
5    2016-05-02   48      67
6    2016-05-02   135     45
7    2016-05-03   18      65
8    2016-05-03   82      67
9    2016-05-03   107     45
10   2016-05-04   29      65

table: symbol

id   symbol
65   abc
67   xyz
45   jkl

I need to select all symbols where the close value is less than 100 for the latest date for each symbol. As per the example, not all symbols will have the same latest date.

The following query gives me correct data when I do not use the WHERE clause:

SELECT DISTINCT ON (daily.symbol_id) symbol.symbol, daily.close, daily.date
FROM daily JOIN symbol ON daily.symbol_id = symbol.id
--WHERE daily.close < 100
ORDER BY daily.symbol_id, daily.date DESC

Result:

symbol   close   date
abc      29      2016-05-04
xyz      82      2016-05-03
jkl      107     2016-05-03

The problem comes when I uncomment the WHERE clause. The desired result is for the symbol jkl to be removed from the list because the value for close for that symbol on its latest date is not < 100. However this is what happens:

symbol   close   date
abc      29      2016-05-04
xyz      82      2016-05-03
jkl      95      2016-05-01

Upvotes: 2

Views: 69

Answers (2)

sgeddes
sgeddes

Reputation: 62831

You can move your existing query to a subquery and then filter with where criteria.

select * 
from (
    select distinct on (d.symbol_id) s.symbol, d.close, d.date
    from daily d 
        join symbol s on d.symbol_id = s.id
    order by daily.symbol_id, daily.date desc
) t
where close < 100

Here's another similar option using a windows function such as row_number:

select *
from (
    select d.symbol_id, s.symbol, d.close, d.date,
           row_number() over (partition by d.symbol_id order by d.date desc) rn
    from daily d 
        join symbol s on d.symbol_id = s.id
    ) t
where rn = 1 and close < 100

Upvotes: 1

Steve
Steve

Reputation: 11963

Code not tested, just to demonstrate idea

First you make a query to get the latest date of every symbol. Then make a join to filter out rows that are not latest which you can safely apply the close < 100 where clause.

SELECT DISTINCT ON(symbol) * FROM (
SELECT MAX(d1.date) latest FROM daily d1 GROUP BY d1.symbol_id
INNER JOIN daily d2 ON latest = d2.date AND d1.symbol_id = d2.symbol_id) t
WHERE close <100

Upvotes: 0

Related Questions