Arcane
Arcane

Reputation: 1228

Finding runs of a particular value

I have a table in Oracle 10 that is defined like this:

LOCATION   HOUR              STATUS
--------------------------------------
10         12/10/09 5:00PM      1
10         12/10/09 6:00PM      1
10         12/10/09 7:00PM      2
10         12/10/09 8:00PM      1
10         12/10/09 9:00PM      3
10         12/10/09 10:00PM     3
10         12/10/09 11:00PM     3

This table continues for various locations and for a small number of status values. Each row covers one hour for one location. Data is collected from a particular location over the course of that hour, and processed in chunks. Sometimes the data is available, sometimes it isn't, and that information is encoded in the status. I am trying to find runs of a particular status, so that I could convert the above table into something like:

LOCATION   STATUS     START               END
-----------------------------------------------------------
10         1          12/10/09 5:00PM     12/10/09 7:00PM
10         2          12/10/09 7:00PM     12/10/09 8:00PM  
10         1          12/10/09 8:00PM     12/10/09 9:00PM
10         3          12/10/09 9:00PM     12/11/09 12:00AM 

Basically condensing the table into rows that define each stretch of a particular status. I have tried various tricks, like using lead/lag to figure out where starts and ends are and such, but all of them have met with failure. The only trick that works so far is going one by one through the values programatically, which is slow. Any ideas for doing it directly in Oracle? Thanks!

Upvotes: 1

Views: 114

Answers (3)

Roland Bouman
Roland Bouman

Reputation: 31981

Here's an ANSI SQL solution:

select      t1.location
,           t1.status
,           min(t1.hour)                                      AS "start" -- first of stretch of same status
,           coalesce(t2.hour, max(t1.hour) + INTERVAL '1' HOUR) AS "end"
from        t_intervals t1             -- base table, this is what we are condensing
left join   t_intervals t2             -- finding the first datetime after a stretch of t1
on          t1.location = t2.location  -- demand same location
and         t1.hour     < t2.hour      -- demand t1 before t2
and         t1.status  != t2.status    -- demand different status 
left join   t_intervals t3             -- finding rows not like t1, with hour between t1 and t2
on          t1.location = t3.location
and         t1.status  != t3.status
and         t1.hour     < t3.hour
and         t2.hour     > t3.hour
where       t3.status is null          -- demand that t3 does not exist, in other words, t2 marks a status transition
group by    t1.location                -- condense on location, status.
,           t1.status
,           t2.hour                    -- this pins the status transition
order by    t1.location
,           t1.status
,           min(t1.hour)

Upvotes: 2

DVK
DVK

Reputation: 129491

OK, I apologize for not knowing Oracle syntax, but I hope that the below Sybase one is clear enough (I split it into 3 queries creating 2 temp tables for readbility but you can just re-unit as sub-queries. I don't know how to add/subtract 1 hour in Oracle, dateadd(hh...) does it in Sybase

SELECT * FROM T
INTO   #START_OF_PERIODS
WHERE NOT EXISTS (
    SELECT 1 FROM T_BEFORE
    WHERE T.LOCATION = T_BEFORE.LOCATION
      AND T.STATUS   = T_BEFORE.STATUS
      AND T.HOUR     = dateadd(hh, T_BEFORE.HOUR, 1)
   )

SELECT * FROM T
INTO   #END_OF_PERIODS
WHERE NOT EXISTS (
    SELECT 1 FROM T_AFTER
    WHERE T.LOCATION = T_AFTER.LOCATION
      AND T.STATUS   = T_AFTER.STATUS
      AND T.HOUR     = dateadd(hh, T_AFTER.HOUR, -1)
   )

SELECT T1.LOCATION, T1.STATUS, T1.HOUR AS 'START', MIN(T2.HOUR) AS 'END'
FROM   #START_OF_PERIODS 'T1', #END_OF_PERIODS 'T2'
WHERE  T1.LOCATION = T2.LOCATION
  AND  T1.STATUS   = T2.STATUS
  AND  T1.HOUR    <= T2.HOUR
GROUP BY T1.LOCATION, T1.STATUS, T1.HOUR 
    -- May need to add T2.LOCATION, T2.STATUS to GROUP BY???

Upvotes: 1

Peter Schuetze
Peter Schuetze

Reputation: 16305

Ever thought about a stored procedure? I think that would be the most readable solution.

Basic Idea:

  1. run a select statement that gives you the rown in the right order for one building
  2. iterate over the result line by line and write a new 'run'-record every time the status changes and when reaching the end of the result set.

You need to test if it is also the fastest way. Depending on the number of records, this might not be an issue at all.

Upvotes: 0

Related Questions