Reputation: 193
I have a table like shown below where I want to use the start and end date to evenly distribute the value for each row to the 3 months in each quarter to all of the quarters in between start and end date (last two columns).
I am familiar with generate series and intervals in Postgres but I am having hard time to get what I want.
My table has and ID
column that groups rows together, a quarter
column that indicates which quarter the row references for the ID
, a value
column that is the value for the whole quarter (and every quarter in the date range), and start_date
and end_date
columns indicating the date range. Here is a sample:
ID quarter value start_date end_date
1 2 152 2019-11-07 2050-12-30
1 1 785 2019-11-07 2050-12-30
2 2 152 2019-03-05 2050-12-30
2 1 785 2019-03-05 2050-12-30
3 4 41 2018-06-12 2050-12-30
3 3 50 2018-06-12 2050-12-30
3 2 88 2018-06-12 2050-12-30
3 1 29 2018-06-12 2050-12-30
4 2 1607 2018-12-17 2050-12-30
4 1 4803 2018-12-17 2050-12-30
Here is my desired output (for ID
1):
ID quarter value start_date end_date
1 2 152/3 2020-04-01 2020-07-01
1 1 785/3 2020-01-01 2020-04-01
1 2 152/3 2021-04-01 2021-07-01
1 1 785/3 2021-01-01 2021-04-01
start_date
in the output will be the next quarter on first table. I need the series to be generated from the start_date
to the end_date
of the first table.
Upvotes: 1
Views: 1820
Reputation: 792
This is one way to go about it. Showing an example based on the output you've outlined. You can then add more conditions to the CASE/WHEN
for additional quarters.
SELECT
ID,
Quarter,
Value/3 AS "Value",
CASE
WHEN Quarter = 1 THEN '2020-01-01'
WHEN Quarter = 2 THEN '2020-04-01'
END AS "Start_Date",
CASE
WHEN Quarter = 1 THEN '2020-04-01'
WHEN Quarter = 2 THEN '2020-07-01'
END AS "End_Date"
FROM
Table
Upvotes: 1
Reputation: 15732
You can do this by using the GENERATE_SERIES
function and passing in the start and end date for each unique (by ID
) row and setting the interval to 3 months. Then join the result back with your original table on both ID
and quarter.
Here's an example (note original_data
is what I've called your first table):
WITH
quarters_table AS (
SELECT
t.ID,
(EXTRACT('month' FROM t.quarter_date) - 1)::INT / 3 + 1 AS quarter,
t.quarter_date::DATE AS start_date,
COALESCE(
LEAD(t.quarter_date) OVER (),
DATE_TRUNC('quarter', t.original_end_date) + INTERVAL '3 months'
)::DATE AS end_date
FROM (
SELECT
original_record.ID,
original_record.end_date AS original_end_date,
GENERATE_SERIES(
DATE_TRUNC('quarter', original_record.start_date),
DATE_TRUNC('quarter', original_record.end_date),
INTERVAL '3 months'
) AS quarter_date
FROM (
SELECT DISTINCT ON (original_data.ID)
original_data.ID,
original_data.start_date,
original_data.end_date
FROM
original_data
ORDER BY
original_data.ID
) AS original_record
) AS t
)
SELECT
quarters_table.ID,
quarters_table.quarter,
original_data.value::DOUBLE PRECISION / 3 AS value,
quarters_table.start_date,
quarters_table.end_date
FROM
quarters_table
INNER JOIN
original_data
ON
quarters_table.ID = original_data.ID
AND quarters_table.quarter = original_data.quarter;
Sample output:
id | quarter | value | start_date | end_date
----+---------+------------------+------------+------------
1 | 1 | 261.666666666667 | 2020-01-01 | 2020-04-01
1 | 2 | 50.6666666666667 | 2020-04-01 | 2020-07-01
1 | 1 | 261.666666666667 | 2021-01-01 | 2021-04-01
1 | 2 | 50.6666666666667 | 2021-04-01 | 2021-07-01
For completeness, here's the original_data
table I've used in testing:
WITH
original_data AS (
SELECT
1 AS ID,
2 AS quarter,
152 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
1 AS ID,
1 AS quarter,
785 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
2 AS ID,
2 AS quarter,
152 AS value,
'2019-03-05'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
-- ...
)
Upvotes: 2