Reputation: 101
I am working with data where I need to use a calculation that is supposed to reference a previously calculated value in the previous row.
For example, take this dataset:
SELECT
generate_series('2015-01-01', '2019-12-01', '1 month'::interval)::date AS dates,
generate_series(1,60) AS nums;
There's are NULL
values starting at 2019-03-01
.
I'd like to write a calculation on another column that fills it in based off the previous row, that is derived from that same calculation. So I tried to use some lag()
functions. But after a while it turns to NULL
, probably because the previous calculation is also null.
with
mynumbers AS (
SELECT
generate_series('2015-01-01', '2025-12-01', '1 month'::interval)::date AS dates,
generate_series(1,50) AS nums),
mynumbers_lag AS (
SELECT *, lag(nums) OVER (ORDER BY dates ASC) AS previous1
FROM mynumbers)
SELECT dates, nums, previous1, (coalesce(nums,previous1)+lag(coalesce(nums,previous1), 5) OVER (ORDER BY dates ASC))*4 AS moving_calculation FROM mynumbers_lag;
The result starts to deviate from what I'd like it to be at 2019-03-01
. I'd like my calculation to continue all the way through the table. Anyone know how I can accomplish this?
Edit: borrowing unutbu's table.. I want to yield this:
| dates | nums | previous1 | moving_calculation |
|------------+------+-----------+--------------------|
| 2015-01-01 | 1 | | |
| 2015-02-01 | 2 | 1 | |
| 2015-03-01 | 3 | 2 | |
| 2015-04-01 | 4 | 3 | |
| 2015-05-01 | 5 | 4 | |
| 2015-06-01 | 6 | 5 | 28 |
| 2015-07-01 | 7 | 6 | 36 |
| 2015-08-01 | 8 | 7 | 44 |
| 2015-09-01 | 9 | 8 | 52 |
| 2015-10-01 | 10 | 9 | 60 |
...
| 2018-12-01 | 50 | 49 | 364 |
| 2019-01-01 | 50 | 49 | 372 |
| 2019-02-01 | 50 | 49 | 380 |
| 2019-03-01 | 50 | 49 | 388 |
| 2019-04-01 | 50 | 49 | 1744 |
| 2019-05-01 | 50 | 49 | 7172 |
| 2019-06-01 | | | 28888 |
| 2019-07-01 | | | 117104 |
| 2019-08-01 | | | 475392 |
| 2019-09-01 | | | 1930256 |
On 2019-04-01
, the 1744
is calculated from (388+48)*4
. The 388
is one
cell from the previously calculated value because nums
is NULL. Eventually,
starting on 2018-07-01
, both nums
are NULL, so it will calculating using
only from moving_calculations
(values 380
and 7172
)
Upvotes: 0
Views: 186
Reputation: 879321
The values in the moving_calculation
column (denoted m0
below) depend on
prior values in the same column. They are defined by a recurrence
relation. There might even be a closed-form formula for m0
. You might want to
ask a question on Mathematics stackexchange if you wish to find a closed-form
formula. If we knew the closed-form formula, clearly computing values in
Postgresql would be a breeze.
However, if we regard this problem as a programming problem, then I believe the calculation -- if it is to be done in Postgresql -- is most easily expressed using WITH RECURSIVE. The calculation sort of feels like the calculation of Fibonacci numbers.
WITH RECURSIVE r(a, b) AS (
SELECT 0::int, 1::int
UNION ALL
SELECT b, a + b FROM r WHERE b < 50
)
SELECT a, b FROM r;
yields
| a | b |
|----+----|
| 0 | 1 |
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 5 |
| 5 | 8 |
| 8 | 13 |
| 13 | 21 |
| 21 | 34 |
| 34 | 55 |
If you understand the use of WITH RECURSIVE
in that Fibonacci example, then I believe you'll see the solution below
is merely an extension of the same idea.
WITH RECURSIVE r(dates, nums, prev, m0, m1, m2, m3) AS (
SELECT * FROM (VALUES ('2019-02-01'::date, 50::numeric, 47::numeric, 388::numeric, NULL::numeric, NULL::numeric, NULL::numeric)) AS t1
UNION ALL
SELECT (dates + '1 month'::interval)::date
, m0
, coalesce(m3, prev+1)
, (m0+coalesce(m3, prev+1))*4
, m0
, m1
, m2
FROM r
WHERE dates <= '2020-01-01'
)
SELECT * FROM r
yields
| dates | nums | prev | m0 | m1 | m2 | m3 |
|------------+------------+----------+------------+------------+-----------+-----------|
| 2019-02-01 | 50 | 47 | 388 | | | |
| 2019-03-01 | 388 | 48 | 1744 | 388 | | |
| 2019-04-01 | 1744 | 49 | 7172 | 1744 | 388 | |
| 2019-05-01 | 7172 | 50 | 28888 | 7172 | 1744 | 388 |
| 2019-06-01 | 28888 | 388 | 117104 | 28888 | 7172 | 1744 |
| 2019-07-01 | 117104 | 1744 | 475392 | 117104 | 28888 | 7172 |
| 2019-08-01 | 475392 | 7172 | 1930256 | 475392 | 117104 | 28888 |
| 2019-09-01 | 1930256 | 28888 | 7836576 | 1930256 | 475392 | 117104 |
| 2019-10-01 | 7836576 | 117104 | 31814720 | 7836576 | 1930256 | 475392 |
| 2019-11-01 | 31814720 | 475392 | 129160448 | 31814720 | 7836576 | 1930256 |
| 2019-12-01 | 129160448 | 1930256 | 524362816 | 129160448 | 31814720 | 7836576 |
| 2020-01-01 | 524362816 | 7836576 | 2128797568 | 524362816 | 129160448 | 31814720 |
| 2020-02-01 | 2128797568 | 31814720 | 8642449152 | 2128797568 | 524362816 | 129160448 |
To combine this table with the original table, use UNION
:
WITH mytable AS (
SELECT *, (nums+prev)*4 AS m0, NULL::numeric AS m1, NULL::numeric AS m2, NULL::numeric AS m3
FROM (
SELECT *
, lag(nums, 3) OVER (ORDER BY dates ASC) AS prev
FROM (
SELECT
generate_series('2015-01-01', '2025-12-01', '1 month'::interval)::date AS dates,
generate_series(1,50)::numeric AS nums) t
) t2
WHERE nums IS NOT NULL
), last_row AS (
SELECT * FROM mytable
WHERE nums IS NOT NULL
ORDER BY dates DESC
LIMIT 1
)
SELECT * FROM mytable
UNION (
WITH RECURSIVE r(dates, nums, prev, m0, m1, m2, m3) AS (
SELECT * FROM last_row
UNION ALL
SELECT (dates + '1 month'::interval)::date
, m0
, coalesce(m3, prev+1)
, (m0+coalesce(m3, prev+1))*4
, m0
, m1
, m2
FROM r
WHERE dates <= '2020-01-01')
SELECT * FROM r)
ORDER BY dates
which yields
| dates | nums | prev | m0 | m1 | m2 | m3 |
|------------+------------+----------+------------+------------+-----------+-----------|
| 2015-01-01 | 1 | | | | | |
| 2015-02-01 | 2 | | | | | |
| 2015-03-01 | 3 | | | | | |
| 2015-04-01 | 4 | 1 | 20 | | | |
| 2015-05-01 | 5 | 2 | 28 | | | |
| 2015-06-01 | 6 | 3 | 36 | | | |
| 2015-07-01 | 7 | 4 | 44 | | | |
| 2015-08-01 | 8 | 5 | 52 | | | |
| 2015-09-01 | 9 | 6 | 60 | | | |
| 2015-10-01 | 10 | 7 | 68 | | | |
| 2015-11-01 | 11 | 8 | 76 | | | |
| 2015-12-01 | 12 | 9 | 84 | | | |
| 2016-01-01 | 13 | 10 | 92 | | | |
| 2016-02-01 | 14 | 11 | 100 | | | |
| 2016-03-01 | 15 | 12 | 108 | | | |
| 2016-04-01 | 16 | 13 | 116 | | | |
| 2016-05-01 | 17 | 14 | 124 | | | |
| 2016-06-01 | 18 | 15 | 132 | | | |
| 2016-07-01 | 19 | 16 | 140 | | | |
| 2016-08-01 | 20 | 17 | 148 | | | |
| 2016-09-01 | 21 | 18 | 156 | | | |
| 2016-10-01 | 22 | 19 | 164 | | | |
| 2016-11-01 | 23 | 20 | 172 | | | |
| 2016-12-01 | 24 | 21 | 180 | | | |
| 2017-01-01 | 25 | 22 | 188 | | | |
| 2017-02-01 | 26 | 23 | 196 | | | |
| 2017-03-01 | 27 | 24 | 204 | | | |
| 2017-04-01 | 28 | 25 | 212 | | | |
| 2017-05-01 | 29 | 26 | 220 | | | |
| 2017-06-01 | 30 | 27 | 228 | | | |
| 2017-07-01 | 31 | 28 | 236 | | | |
| 2017-08-01 | 32 | 29 | 244 | | | |
| 2017-09-01 | 33 | 30 | 252 | | | |
| 2017-10-01 | 34 | 31 | 260 | | | |
| 2017-11-01 | 35 | 32 | 268 | | | |
| 2017-12-01 | 36 | 33 | 276 | | | |
| 2018-01-01 | 37 | 34 | 284 | | | |
| 2018-02-01 | 38 | 35 | 292 | | | |
| 2018-03-01 | 39 | 36 | 300 | | | |
| 2018-04-01 | 40 | 37 | 308 | | | |
| 2018-05-01 | 41 | 38 | 316 | | | |
| 2018-06-01 | 42 | 39 | 324 | | | |
| 2018-07-01 | 43 | 40 | 332 | | | |
| 2018-08-01 | 44 | 41 | 340 | | | |
| 2018-09-01 | 45 | 42 | 348 | | | |
| 2018-10-01 | 46 | 43 | 356 | | | |
| 2018-11-01 | 47 | 44 | 364 | | | |
| 2018-12-01 | 48 | 45 | 372 | | | |
| 2019-01-01 | 49 | 46 | 380 | | | |
| 2019-02-01 | 50 | 47 | 388 | | | |
| 2019-03-01 | 388 | 48 | 1744 | 388 | | |
| 2019-04-01 | 1744 | 49 | 7172 | 1744 | 388 | |
| 2019-05-01 | 7172 | 50 | 28888 | 7172 | 1744 | 388 |
| 2019-06-01 | 28888 | 388 | 117104 | 28888 | 7172 | 1744 |
| 2019-07-01 | 117104 | 1744 | 475392 | 117104 | 28888 | 7172 |
| 2019-08-01 | 475392 | 7172 | 1930256 | 475392 | 117104 | 28888 |
| 2019-09-01 | 1930256 | 28888 | 7836576 | 1930256 | 475392 | 117104 |
| 2019-10-01 | 7836576 | 117104 | 31814720 | 7836576 | 1930256 | 475392 |
| 2019-11-01 | 31814720 | 475392 | 129160448 | 31814720 | 7836576 | 1930256 |
| 2019-12-01 | 129160448 | 1930256 | 524362816 | 129160448 | 31814720 | 7836576 |
| 2020-01-01 | 524362816 | 7836576 | 2128797568 | 524362816 | 129160448 | 31814720 |
| 2020-02-01 | 2128797568 | 31814720 | 8642449152 | 2128797568 | 524362816 | 129160448 |
Upvotes: 2