Reputation: 23
I have the following problem...
Time | A | B | C -- Sum should be
1 a1 b1 c1 a1 + b1 + c1
2 a2 b2 x a2 + b1 + c1
3 a3 x x a3 + b2 + c1
4 x b3 c2 a3 + b3 + c2
Essentially, the sum needs to be across the most recent value in time for each of the three rows. Each data column doesn't necessarily have a value for the current time.
I have tried several approaches using window functions and have been unsuccessful. I have written a stored procedure that does what I need, but it is SLOW.
CREATE OR REPLACE FUNCTION timeseries.combine_series(id int[], startTime timestamp, endTime timestamp)
RETURNS setof RECORD AS $$
DECLARE
retval double precision = 0;
row_data timeseries.total_active_energy%ROWTYPE;
maxCount integer = 0;
sz integer = 0;
lastVal double precision[];
v_rec RECORD;
BEGIN
SELECT INTO sz array_length($1,1);
FOR row_data IN SELECT * FROM timeseries.total_active_energy WHERE time >= startTime AND time < endTime AND device_id = ANY($1) ORDER BY time
LOOP
retval = row_data.active_power;
for i IN 1..sz LOOP
IF $1[i]=row_data.device_id THEN
lastVal[i] = row_data.active_power;
ELSE
retval = retVal + COALESCE(lastVal[i],0);
END IF;
END LOOP;
SELECT row_data.time, retval into v_rec;
return next v_rec;
END LOOP;
return ;
END;
$$ LANGUAGE plpgsql;
Call:
select * from timeseries.combine_series('{552,553,554}'::int[], '2013-05-01'::timestamp, '2013-05-02'::timestamp)
AS (t timestamp with time zone, val double precision);
Upvotes: 1
Views: 785
Reputation: 656221
SELECT ts, a, b, c
, COALESCE(max(a) OVER (PARTITION BY grp_a), 0)
+ COALESCE(max(b) OVER (PARTITION BY grp_b), 0)
+ COALESCE(max(c) OVER (PARTITION BY grp_c), 0) AS special_sum
FROM (
SELECT *
,count(a) OVER w AS grp_a
,count(b) OVER w AS grp_b
,count(c) OVER w AS grp_c
FROM t
WINDOW w AS (ORDER BY ts)
) sub
ORDER BY ts;
First, put actual values and following NULL
values in a group with the aggregate window function count()
: it does not increment with NULL
values.
Then take max()
from every group, arriving at what you are looking for. At this point you could just as well use min()
or sum()
, since there is only one non-null value per group.
COALESCE()
catches NULL
values if the overall first value in time is NULL
.
Note how I picked ts
as column name, since I don't use base type names like time
as identifiers.
That's also how you everyone should provide sample data in the first place!
CREATE TEMP TABLE t (ts int, a int, b int, c int);
INSERT INTO t VALUES
(1, 11, 21, NULL)
,(2, 12, 22, NULL)
,(3, 13, NULL, NULL)
,(4, NULL, 23, 32);
Upvotes: 1