martinarroyo

Reputation: 9701

Retrieve additional columns on aggregation and date operator

I have the following PostgreSQL table structure, which gathers temperature records for every second:

+----+--------+-------------------------------+---------+
| id | value  |             date              | station |
+----+--------+-------------------------------+---------+
|  1 |      0 | 2017-08-22 14:01:09.314625+02 |       1 |
|  2 |      0 | 2017-08-22 14:01:09.347758+02 |       1 |
|  3 | 25.187 | 2017-08-22 14:01:10.315413+02 |       1 |
|  4 | 24.937 | 2017-08-22 14:01:10.322528+02 |       1 |
|  5 | 25.187 | 2017-08-22 14:01:11.347271+02 |       1 |
|  6 | 24.937 | 2017-08-22 14:01:11.355005+02 |       1 |
| 18 | 24.875 | 2017-08-22 14:01:17.35265+02  |       1 |
| 19 | 25.187 | 2017-08-22 14:01:18.34673+02  |       1 |
| 20 | 24.875 | 2017-08-22 14:01:18.355082+02 |       1 |
| 21 | 25.187 | 2017-08-22 14:01:19.361491+02 |       1 |
| 22 | 24.875 | 2017-08-22 14:01:19.371154+02 |       1 |
| 23 | 25.187 | 2017-08-22 14:01:20.354576+02 |       1 |
| 30 | 24.937 | 2017-08-22 14:01:23.372612+02 |       1 |
| 31 |      0 | 2017-08-22 15:58:53.576238+02 |       1 |
| 32 |      0 | 2017-08-22 15:58:53.590872+02 |       1 |
| 33 | 26.625 | 2017-08-22 15:58:54.59986+02  |       1 |
| 38 | 26.375 | 2017-08-22 15:58:56.593205+02 |       1 |
| 39 |      0 | 2017-08-21 15:59:40.181317+02 |       1 |
| 40 |      0 | 2017-08-21 15:59:40.190221+02 |       1 |
| 41 | 26.562 | 2017-08-21 15:59:41.182622+02 |       1 |
| 42 | 26.375 | 2017-08-21 15:59:41.18905+02  |       1 |
+----+--------+-------------------------------+---------+

I now want to retrieve the maximum value for every hour, along with the data associated with that entry (id, date). To do so, I tried the following:

select max(value) as m, (date_trunc('hour', date)) as d
from temperature
where station='1'
group by (date_trunc('hour', date));

This works fine (fiddle), but the result contains only the columns m and d. If I add the date or id columns to the SELECT list, I get the usual error: column "temperature.id" must appear in the GROUP BY clause or be used in an aggregate function.

I have already tried approaches such as the ones described here, to no avail; for instance, I seem to be unable to join on the columns generated by date_trunc.

The result I am aiming for is this:

+----+--------+-------------------------------+---------+
| id | value  |             date              | station |
+----+--------+-------------------------------+---------+
|  3 | 25.187 | 2017-08-22 14:01:10.315413+02 |       1 |
| 33 | 26.625 | 2017-08-22 15:58:54.59986+02  |       1 |
| 41 | 26.562 | 2017-08-21 15:59:41.182622+02 |       1 |
+----+--------+-------------------------------+---------+

It does not matter which record was retrieved in case two or more entries have the same value.

Upvotes: 1

Views: 40

Answers (1)

Clodoaldo Neto

Reputation: 125344

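Use distinct on, which keeps only the first row of each group as determined by the order by clause. Ordering by value desc within each hour makes that first row the hourly maximum: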

select distinct on (date_trunc('hour', date)) *
from temperature
where station = '1'
order by date_trunc('hour', date), value desc
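
distinct on is PostgreSQL-specific. As a rough sketch of a more portable alternative, the same result can be expressed with the row_number() window function. The alias names t, ranked and rn are made up for illustration, and breaking ties by the lowest id is an assumption on my part, since the question says any of the tied rows is acceptable:

```sql
-- Rank the rows within each hour, highest value first.
-- Ties are broken by id (an assumption; any tied row would do).
select id, value, date, station
from (
    select t.*,
           row_number() over (
               partition by date_trunc('hour', date)
               order by value desc, id
           ) as rn
    from temperature t
    where station = '1'
) ranked
where rn = 1          -- keep only the top-ranked row per hour
order by date;
```

The same tie-break can be added to the distinct on query by appending id to its order by list, which makes the chosen row deterministic.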


Upvotes: 1
