Reputation: 23
I am using PostgreSQL and I have a table that tracks when a system is up or down. There are three columns: interval_date
with datatype DATE
, interval_time_utc
with datatype VARCHAR
, and status
with datatype INTEGER
. A status of 0 is up and 1 is down. I am trying to calculate the average uptime on a daily basis. Below is a table example.
interval_date | interval_time_utc | status |
---|---|---|
2022-07-08 | 04:00 | 0 |
2022-07-07 | 20:00 | 1 |
2022-07-07 | 04:00 | 0 |
2022-07-06 | 17:45 | 1 |
2022-07-06 | 01:15 | 0 |
2022-07-05 | 18:40 | 1 |
So based on the data below:
System was up from 2022-07-05 00:00 to 2022-07-05 18:39 then down from 2022-07-05 18:40 to 2022-07-05 23:59.
System was down from 2022-07-06 00:00 to 2022-07-06 01:14 then the system was up from 2022-07-06 01:15 to 2022-07-06 17:44 then the system was down from 2022-07-06 17:45 to 2022-07-06 23:59.
System was down from 2022-07-07 00:00 to 2022-07-07 03:59 then the system was up from 2022-07-07 04:00 to 2022-07-07 19:59 then the system was down from 2022-07-07 20:00 to 2022-07-07 23:59.
System was down from 2022-07-08 00:00 to 2022-07-08 03:59 then the system was up from 2022-07-08 04:00 to the current UTC time.
Link to fiddle containing schema: https://www.db-fiddle.com/f/v7RCBBNowhv7DmfBoor64X/0
I would expect the following results (assuming my calculations are correct):
interval_date | average_uptime |
---|---|
2022-07-05 | 0.78 |
2022-07-06 | 0.69 |
2022-07-07 | 0.69 |
2022-07-08 | 0.83 |
Upvotes: 0
Views: 174
Reputation: 2746
Here is my take on it - get daily down-time and subtract from 24 hours, except for today, where the subtraction is from current utc time.
select interval_date,
case when interval_date = current_date then
round(cast((case when sum(int_time) < 0 then trunc(EXTRACT(EPOCH FROM current_time at time zone 'utc')/60 ) - sum(int_time)
else sum(int_time) end)/trunc(EXTRACT(EPOCH FROM current_time at time zone 'utc')/60 ) as numeric), 2)
else
round(cast((case when sum(int_time) < 0 then 1440 + sum(int_time)
else sum(int_time) end)/1440 as numeric), 2)
end up_time
from
(
select interval_date, interval_time_utc,
case when status = 0 then (-1) * EXTRACT(EPOCH FROM interval_time_utc::time)/60
else EXTRACT(EPOCH FROM interval_time_utc::time)/60 - 1 end int_time,
status
from system_power_history
) t
group by interval_date
order by interval_date;
Fiddle here.
Upvotes: 0
Reputation: 658767
Assuming:
A current version of Postgres.
A UNIQUE
constraint on the timestamp.
The leading fraction of the first day that is not covered by actual entry is the opposite status of the first entry.
Other than that, every row is allowed to switch on or off (redundantly).
The result shall contain all days between the first and last entry - including days without any change (all up or all down).
And I converted the table to a sane schema before working with it:
CREATE TABLE fixed AS
SELECT interval_date + interval_time_utc::time AS interval_ts_utc, NOT status::bool AS up
FROM system_power_history;
Then this does the job:
SELECT the_day, round(extract('epoch' FROM sum(uptime)) / 86400, 2) AS avg_uptime
FROM (
-- add start interval for 1st day (if up)
SELECT interval_ts_utc::date AS the_day, interval_ts_utc::time::interval AS uptime
FROM (SELECT * FROM fixed ORDER BY interval_ts_utc LIMIT 1) sub
WHERE NOT up
UNION ALL
SELECT the_day, upper(range) - lower(range) AS uptime
FROM (
SELECT the_day::date
, tsrange(the_day, the_day + interval '1 day') * tsrange(ts, next_ts) AS range -- intersect with day
FROM (
SELECT interval_ts_utc AS ts
, lead(interval_ts_utc, 1, date_trunc('day', interval_ts_utc) + interval '1 day') -- default to start of next day
OVER (ORDER BY interval_ts_utc) AS next_ts
, up
FROM fixed
) sub
CROSS JOIN LATERAL generate_series(date_trunc('day', ts)
, next_ts - interval '1 us'
, interval '1 day') the_day -- get 1 row per intersecting day
WHERE up
) sub
) sub2
GROUP BY 1
ORDER BY 1;
db<>fiddle here
Using the range intersection operator *
for convenience.
Except that days with no uptime at all are missing in the result. Could be amended, but rather consider the next solution.
The nature of the problem lends itself to a procedural solution. So this is one of the rare cases where a loop over the set beats the performance of a set-based solution:
Create function once:
CREATE OR REPLACE FUNCTION f_daily_uptimes()
RETURNS TABLE (the_day date, uptime interval)
LANGUAGE plpgsql STABLE AS
$func$
DECLARE
_ts timestamp;
_ts0 timestamp;
_up bool;
_up0 bool;
BEGIN
FOR _ts, _up IN
SELECT interval_ts_utc, up FROM fixed ORDER BY interval_ts_utc
LOOP
-- only for 1st iteration
IF the_day IS NULL THEN
the_day := _ts::date;
_ts0 := the_day;
_up0 = NOT _up;
END IF;
LOOP
IF _up0 THEN
uptime := LEAST(_ts, the_day + 1) - GREATEST(_ts0, the_day);
RETURN NEXT;
ELSIF the_day > _ts0::date AND _ts::date > the_day THEN -- empty days
uptime := '0:0';
RETURN NEXT;
END IF;
EXIT WHEN the_day >= _ts::date;
the_day = the_day + 1;
END LOOP;
_ts0 := _ts; -- remember last row
_up0 := _up; -- remember last row
END LOOP;
-- final interval
IF _up THEN
uptime := (the_day + 1) - _ts0;
RETURN NEXT;
END IF;
END
$func$;
Call:
SELECT the_day, round(extract('epoch' FROM sum(uptime)) / 86400, 2) AS avg_uptime
FROM f_daily_uptimes()
GROUP BY 1
ORDER BY 1;
db<>fiddle here
Also includes days with no uptime at all - which are yet missing in my pure-SQL solution above.
Related:
Upvotes: 2