Georgi Bonchev
Georgi Bonchev

Reputation: 300

PostgreSQL group by with interval

Well, I have a seemingly simple set of data but it gives me a lot of trouble.

This is an example of what my data look like:

quantity  price1  price2 date
100       1       0      2018-01-01 10:00:00
200       1       0      2018-01-02 10:00:00
50        5       0      2018-01-02 11:00:00
100       1       1      2018-01-03 10:00:00
100       1       1      2018-01-03 11:00:00
300       1       0      2018-01-03 12:00:00

I need to sum up "quantity" column grouped by "price1" and "price2" and it would be very easy but I need to take into account time changes of "price1" and "price2". Data is sorted by "date".

What I need is the last row to be not grouped with the first two although it has the same values for "price1" and "price2". Also I need to get minimal and maximal date of each interval.

The end result should looks like this:

quantity price1 price2 dateStart            dateEnd
300      1      0      2018-01-01 10:00:00  2018-01-02 10:00:00 
50       5      0      2018-01-02 11:00:00  2018-01-02 11:00:00
200      1      1      2018-01-03 10:00:00  2018-01-03 11:00:00
300      1      0      2018-01-03 12:00:00  2018-01-03 12:00:00

Any suggestions for a SQL query?

Upvotes: 0

Views: 1125

Answers (2)

Georgi Bonchev
Georgi Bonchev

Reputation: 300

I changed a little bit the accepted answer to catch the cases when "date" column of two rows next to each other are exactly the same. I added second parameter so they will be ordered in correct order (my table has "oid" column)

select sum(quantity), price1, price2, min(date) dateStart, max(date) dateend 
from
(
    select *,
           row_number() over (order by date, oid) -
           row_number() over (partition by price1, price2 order by date, oid) grp
    from data
) t
group by price1, price2, grp
order by dateStart

Upvotes: 0

Radim Bača
Radim Bača

Reputation: 10701

It is a gap and island problem. Use the following code:

select sum(quantity), price1, price2, min(date) dateStart, max(date) dateend 
from
(
    select *,
           row_number() over (order by date) -
           row_number() over (partition by price1, price2 order by date) grp
    from data
) t
group by price1, price2, grp
order by dateStart

dbfiddle demo

The solution is based on an identification of consecutive sequences of price1 and price2, which is done by a creation of the grp column. Once you isolate the consecutive sequences then you do a simple group by using grp as well.

Upvotes: 1

Related Questions