uzla
uzla

Reputation: 525

How to query just the last record of every second within a period of time in postgres

I have a table with hundreds of millions of records in 'prices' table with only four columns: uid, price, unit, dt. dt is a datetime in standard format like '2017-05-01 00:00:00.585'.

I can quite easily to select a period using

SELECT uid, price, unit from prices
WHERE dt > '2017-05-01 00:00:00.000' 
AND dt < '2017-05-01 02:59:59.999'

What I can't understand how to select price for every last record in each second. (I also need a very first one of each second too, but I guess it will be a similar separate query). There are some similar example (here), but they did not work for me when I try to adapt them to my needs generating errors.

Could some please help me to crack this nut?

Upvotes: 1

Views: 64

Answers (1)

krokodilko
krokodilko

Reputation: 36107

Let say that there is a table which has been generated with a help of this command:

CREATE TABLE test AS
SELECT timestamp '2017-09-16 20:00:00' + x * interval '0.1' second As my_timestamp
from generate_series(0,100) x

This table contains an increasing series of timestamps, each timestamp differs by 100 milliseconds (0.1 second) from neighbors, so that there are 10 records within each second.

|           my_timestamp |
|------------------------|
|   2017-09-16T20:00:00Z |
| 2017-09-16T20:00:00.1Z |
| 2017-09-16T20:00:00.2Z |
| 2017-09-16T20:00:00.3Z |
| 2017-09-16T20:00:00.4Z |
| 2017-09-16T20:00:00.5Z |
| 2017-09-16T20:00:00.6Z |
| 2017-09-16T20:00:00.7Z |
| 2017-09-16T20:00:00.8Z |
| 2017-09-16T20:00:00.9Z |
|   2017-09-16T20:00:01Z |
| 2017-09-16T20:00:01.1Z |
| 2017-09-16T20:00:01.2Z |
| 2017-09-16T20:00:01.3Z |
.......

The below query determines and prints the first and the last timestamp within each second:

SELECT my_timestamp,
       CASE
           WHEN rn1 = 1 THEN 'First'
           WHEN rn2 = 1 THEN 'Last'
           ELSE 'Somwhere in the middle'
        END as Which_row_within_a_second
FROM (
   select *,
       row_number() over( partition by date_trunc('second', my_timestamp)
                          order by my_timestamp
       ) rn1,
       row_number() over( partition by date_trunc('second', my_timestamp)
                          order by my_timestamp DESC
       ) rn2       
   from test
) xx
WHERE 1 IN (rn1, rn2 )
ORDER BY my_timestamp
;

|           my_timestamp | which_row_within_a_second |
|------------------------|---------------------------|
|   2017-09-16T20:00:00Z |                     First |
| 2017-09-16T20:00:00.9Z |                      Last |
|   2017-09-16T20:00:01Z |                     First |
| 2017-09-16T20:00:01.9Z |                      Last |
|   2017-09-16T20:00:02Z |                     First |
| 2017-09-16T20:00:02.9Z |                      Last |
|   2017-09-16T20:00:03Z |                     First |
| 2017-09-16T20:00:03.9Z |                      Last |
|   2017-09-16T20:00:04Z |                     First |
| 2017-09-16T20:00:04.9Z |                      Last |
|   2017-09-16T20:00:05Z |                     First |
| 2017-09-16T20:00:05.9Z |                      Last |

A working demo you can find here

Upvotes: 1

Related Questions