Reputation: 2600
I have a group of machines. They run and stop occasionally. These run times are recorded into a postgres table automatically with timestamps starttime
and endtime
.
I need to find the run time per machine in a 6 hour period. This is what I have so far:
SELECT machine, SUM(EXTRACT(EPOCH FROM (endtime - starttime)))
FROM table
WHERE
starttime >= '2016-01-27 12:00:00'
AND starttime <= '2016-01-27 18:00:00'
GROUP BY machine
ORDER BY machine
So this works, I get the run time in seconds by machine over that time period. BUT it has a flaw - any run times that started before 12:00 do not get counted. And any run times that started in my time frame but don't end until after it have time counted that shouldn't be there.
Is there a solution to ONLY extract the time that is inside the time frame? My initial thought would be to select all rows where:
endtime >= '2016-01-27 12:00:00'
and somehow, in memory, set all the start times at '2016-01-27 12:00:00'
where the start time is earlier than that and:
starttime <='2016-01-27 18:00:00'
and, again in memory without updating the database, set all end times to '2016-01-27 18:00:00'
where the end time is later than that. and then run the extraction/summation query.
But I'm struggling on how to implement something like this. I have a working solution in Java/Python this data is returned to, but they are iterative and take more time than I'd like. I'd really like to find an SQL solution to this if possible.
Edit: To clarify, I need to count ALL run time that occurs within the time frame - if a run starts before the time frame only the portion of that run that occurs after the time frame should be counted.
Upvotes: 0
Views: 198
Reputation: 40491
EDIT : this is what you need.
When it start between the range, and ends after, it will calc it as ended at 18:00:00.
Same goes for when ended between the range and start before, it will calc as it started at 12:00:00
SELECT machine, SUM(EXTRACT(EPOCH FROM (endtime - starttime)))
FROM (SELECT machine,
case when starttime <= '2016-01-27 12:00:00' then '2016-01-27 12:00:00' else starttime end as starttime,
case when endtime>= '2016-01-27 18:00:00' then '2016-01-27 18:00:00' else endtime end as endtime, FROM table
WHERE
(endtime>= '2016-01-27 12:00:00'
AND endtime <= '2016-01-27 18:00:00')
OR (starttime>= '2016-01-27 12:00:00'
AND starttime<= '2016-01-27 18:00:00')
GROUP BY machine
ORDER BY machine
Upvotes: 1
Reputation:
You can use the overlaps operator:
SELECT machine, SUM(EXTRACT(EPOCH FROM (endtime - starttime)))
FROM table
where (starttime, endtime) overlaps (timestamp '2016-01-27 12:00:00', timestamp '2016-01-27 18:00:00')
GROUP BY machine
ORDER BY machine
Upvotes: 1
Reputation: 121889
Use the type tsrange and its intersection operator.
Example data:
create table machines (machine int, starttime timestamp, endtime timestamp);
insert into machines values
(1, '2016-01-27 10:00:00', '2016-01-27 14:00:00'),
(2, '2016-01-27 15:00:00', '2016-01-27 16:00:00'),
(3, '2016-01-27 17:00:00', '2016-01-27 20:00:00');
This query returns time frames rounded to the given period:
select
machine,
tsrange(starttime, endtime)*
'[2016-01-27 12:00:00, 2016-01-27 18:00:00)'::tsrange t
from machines;
machine | t
---------+-----------------------------------------------
1 | ["2016-01-27 12:00:00","2016-01-27 14:00:00")
2 | ["2016-01-27 15:00:00","2016-01-27 16:00:00")
3 | ["2016-01-27 17:00:00","2016-01-27 18:00:00")
(3 rows)
Get the aggregate from the above dataset:
select
machine,
sum(extract(epoch from (upper(t) - lower(t))))
from (
select
machine,
tsrange(starttime, endtime)*
'[2016-01-27 12:00:00, 2016-01-27 18:00:00)'::tsrange t
from machines
) sub
group by 1
order by 1;
machine | sum
---------+------
1 | 7200
2 | 3600
3 | 3600
(3 rows)
Upvotes: 1