SELECT DISTINCT ON in Postgresql returns multiple rows

Question

When I run

SELECT DISTINCT ON (ts) * 
FROM tbl 
WHERE clause=100 
AND ts <= '2018-04-02 15:11:18.819000' 
AND ts > '2018-04-02 15:06:18.819000' 
ORDER BY ts, version ASC;

I expect a single row corresponding to the largest ts that meets the condition, with ties being broken by taking the lowest version.

The query returns

         ts                     version
'2018-04-02 15:07:04.828'   ...   1
'2018-04-02 15:07:05.706'   ...   1

I don't understand why two non-distinct ts are being returned. Is there a similar query that will return the desired result?

Gordon Linoff · Accepted Answer

Distinct on returns one row for each combination of the keys following the clause. In this case, each ts would appear once. Which one? That is determined by the order by. The keys following the distinct on keys determine the first, so that would be the lowest version number.

Your query appears to be producing reasonable results.

If you want one row in the result set with the largest ts, then use order by and limit/fetch first 1 row only:

SELECT * 
FROM tbl 
WHERE clause = 100 AND
      ts <= '2018-04-02 15:11:18.819000' AND
      ts > '2018-04-02 15:06:18.819000' 
ORDER BY ts DESC, version ASC
FETCH FIRST 1 ROW ONLY;

Note the ts DESC in the ORDER BY, to get the most recent value of ts.

SELECT DISTINCT ON in Postgresql returns multiple rows

Answers (2)

Related Questions