Axel Siebert
Axel Siebert

Reputation: 29

Sum a Column of a Timeseries by an Order Number when the Ordernumber is not unique

I have a table like this: demo at db<>fiddle

CREATE TABLE test(id, order_id, start, end1, count) AS VALUES
(1, 1, '2023-12-19 10:00:00'::timestamp, '2023-12-19 11:00:00'::timestamp, 15),
(2, 1, '2023-12-19 11:00:00', '2023-12-19 12:00:00', 1),
(3, 2, '2023-12-19 12:00:00', '2023-12-19 13:00:00', 2),
(4, 3, '2023-12-19 13:00:00', '2023-12-19 14:00:00', 10),
(5, 1, '2023-12-19 14:00:00', '2023-12-19 15:00:00', 4),
(6, 1, '2023-12-19 15:00:00', '2023-12-19 16:00:00', 7),
(7, 1, '2023-12-19 16:00:00', '2023-12-19 17:00:00', 3),
(8, 3, '2023-12-19 17:00:00', '2023-12-19 18:00:00', 21),
(9, 1, '2023-12-19 18:00:00', '2023-12-19 19:00:00', 5);

The result I like to get, merging adjacent orders together, summing up their count and extending their time range:

order_id start end count
1 merged from rows 1 and 2 10:00 12:00 16
2 12:00 13:00 2
3 13:00 14:00 10
1 merged from rows 5, 6 and 7 14:00 17:00 14
3 17:00 18:00 21
1 18:00 19:00 5

Last thing I tried:

WITH lp AS (
    SELECT *,LEAD(order_id) OVER(ORDER BY start) AS next_id
    FROM test
)
SELECT order_id,
    MIN(start) AS start,
    MAX(end1) AS end,
    SUM(count) AS count
   FROM lp
    WHERE order_id = next_id
  GROUP BY order_id
  ORDER BY MIN(start);

I know that the query is incorrect. How to include the first line of the table?
Any ideas how to fix?

Upvotes: 1

Views: 66

Answers (1)

samhita
samhita

Reputation: 2830

  1. You can get current order_id and the previous one using lag() window function which is part of ordered_data CTE.
  2. Create a new group every time a new order is found, otherwise contiguous block of the same order_id gets a unique grp value in grouped_data CTE.
  3. In final select based on order_id and grp, get the sum() of count:

Fiddle

WITH ordered_data AS(
    SELECT*, order_id<>LAG(order_id,1,order_id)OVER(ORDER BY start)AS is_diff
    FROM test
),grouped_data AS(
    SELECT*, count(*)filter(where is_diff)OVER(ORDER BY start)AS grp
    FROM ordered_data)
SELECT order_id
     , MIN(start) AS start
     , MAX(end1) AS end1
     , SUM(count) AS count
FROM grouped_data
GROUP BY order_id, grp
ORDER BY MIN(start);
order_id start end1 count
1 2023-12-19 10:00:00 2023-12-19 12:00:00 16
2 2023-12-19 12:00:00 2023-12-19 13:00:00 2
3 2023-12-19 13:00:00 2023-12-19 14:00:00 10
1 2023-12-19 14:00:00 2023-12-19 17:00:00 14
3 2023-12-19 17:00:00 2023-12-19 18:00:00 21
1 2023-12-19 18:00:00 2023-12-19 19:00:00 5

Upvotes: 1

Related Questions