Sven Keinath

Reputation: 381

Select individual value in addition to range in group by clause

My table looks like this

   id   | job_id   | measurement_id   |      value       | timestamp
--------+----------+------------------+------------------+-----------
      1 | myJobId  | myMeasurementId  | 55.6111297607422 |    49212
      2 | myJobId  | myMeasurementId  | 55.6607780456543 |    49412
  

I'd like to get the max value in certain time ranges, however I also need the timestamp associated with the individual measurement.

  range   |         max         |     timestamp
----------+---------------------+------------------
        0 | 68.7833782357044665 |       20535552
   377613 | 67.9283221509389440 |       20535552
   755226 | 67.2932168855194184 |       20535552
 

I tried the following statement but ran into an error:

select (timestamp/377613)*377613 as range, max(value), timestamp
from measurements
where job_id = 'myJobId' and measurement_id = 'myMeasurementId'
group by timestamp/377613
order by 1;


ERROR:  column "measurements.timestamp" must appear in the GROUP BY clause or be used in an aggregate function

Without the second timestamp column the statement works, but of course it doesn't deliver the timestamp associated with the max value. How can I get that value?

Upvotes: 2

Views: 895

Answers (2)

Gordon Linoff

Reputation: 1269973

In Postgres, you should use distinct on for this operation. For convenience, you can also calculate the range using a lateral join:

select distinct on (v.range) v.range, m.value, m.timestamp
from measurements m cross join lateral
     (values ((m.timestamp/377613) * 377613)) as v(range)
where m.job_id = 'myJobId' and m.measurement_id = 'myMeasurementId'
order by v.range, m.value desc;

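distinct on is a Postgres extension, but one that happens to be quite helpful. It sorts the data according to the order by clause and keeps the first row that it encounters for each value of the distinct on expression, so sorting by value in descending order within each range yields the row that holds the maximum.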

In general, distinct on is faster than the corresponding query using window functions.
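
To see how the order by clause decides which row distinct on keeps, here is a minimal self-contained sketch. The sample rows, the column name ts, the CTE names sample and bucketed, and the bucket width of 100 are all made up for illustration and are not from the question:

```sql
-- Hypothetical sample data; bucket width 100 chosen only for readability.
with sample (value, ts) as (
    values (55.6, 120), (57.1, 180), (54.2, 230), (58.9, 260)
),
bucketed as (
    -- Integer division truncates, so 120 and 180 both land in bucket 100.
    select (ts / 100) * 100 as bucket, value, ts
    from sample
)
select distinct on (bucket) bucket, value, ts
from bucketed
-- The leading order by expression must match distinct on; "value desc"
-- then puts the maximum first in each bucket, and ts breaks ties.
order by bucket, value desc, ts;
```

With these rows the query should return (100, 57.1, 180) and (200, 58.9, 260). Changing "value desc" to "ts" would instead keep the earliest row of each bucket, regardless of its value.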

Upvotes: 2

Fahmi

Reputation: 37473

You can use row_number()

select *
from (
    select (timestamp/377613)*377613 as range, value, timestamp,
           row_number() over (partition by (timestamp/377613)*377613 order by value desc) as rn
    from measurements
    where job_id = 'myJobId' and measurement_id = 'myMeasurementId'
) A
where rn = 1;
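
Note that row_number() picks an arbitrary row when several measurements in a range share the same maximum. If every tied measurement should be returned (an assumption, since the question does not say how ties should be handled), rank() can be swapped in. A small sketch with made-up sample rows, a hypothetical ts column, and a bucket width of 100:

```sql
-- Hypothetical sample data with a tie at 57.1 inside the first bucket.
with sample (value, ts) as (
    values (55.6, 120), (57.1, 180), (57.1, 190), (58.9, 260)
),
ranked as (
    select (ts / 100) * 100 as bucket, value, ts,
           -- rank() assigns 1 to every row tied for the highest value
           rank() over (partition by (ts / 100) * 100 order by value desc) as rnk
    from sample
)
select bucket, value, ts
from ranked
where rnk = 1
order by bucket, ts;
```

With these rows, both timestamps 180 and 190 should come back for bucket 100, plus 260 for bucket 200. With row_number() only one of the two tied rows would survive, unless a tie-breaker such as ts is added to the window's order by.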

Upvotes: 2
