Thomas

Reputation: 45

SQL - Convert Time Series Events into On/Off Pairs (handling potential missing On's or Off's)

In SQL Server, I have a set of time series on/off events which look like this (for simplicity I have only shown one Alarm Number, but there are many in the same table):

'Alarms' Table:
AlarmNumber   Time                      AlarmState
1592          2020-01-02 01:52:02       1
1592          2020-01-02 01:58:07       0
1592          2020-04-28 03:46:49       1
1592          2020-04-28 06:19:10       0
1592          2020-06-04 00:25:22       1
1592          2020-08-27 01:57:03       1
1592          2020-08-27 05:16:32       0
1592          2020-09-17 02:51:57       0

I am trying to convert these into On/Off pairs.

Output I am trying to achieve, ideally as an SQL View:
AlarmNumber   StartTime                 EndTime
1592          2020-01-02 01:52:02       2020-01-02 01:58:07
1592          2020-04-28 03:46:49       2020-04-28 06:19:10
1592          2020-06-04 00:25:22       NULL
1592          2020-08-27 01:57:03       2020-08-27 05:16:32
1592          NULL                      2020-09-17 02:51:57

If I had a clean data set, with no missing 'On' or 'Off' events, I could achieve this with:

select tOn.AlarmNumber, tOn.Time StartTime, tOff.Time EndTime
from (
select AlarmNumber, Time, 
       ROW_NUMBER() Over(Partition by AlarmNumber order by Time) EventID
from Alarms where AlarmState = 1
) tOn
LEFT JOIN (
select AlarmNumber, Time, 
       ROW_NUMBER() Over(Partition by AlarmNumber order by Time) EventID
from Alarms where AlarmState = 0
) tOff
on (tOn.AlarmNumber = tOff.AlarmNumber and tOn.EventID = tOff.EventID)

(Code modified from Adriano Carneiro's answer to "T-SQL Start and end date times from a single column".)

My question: Can anyone think of an efficient way of processing the 'Alarms' table to achieve my sample output, one which handles missing on/off events (shown as the NULLs in the sample output)?

My backup is to use a cursor and a WHILE loop, but I was hoping there might be a way of doing it by grouping pairs of On/Off events together; I just haven't been able to get it working. I have 500k+ events, so it is a large data set to iterate over.

Any ideas welcome!

Thank you, Thomas

------ UPDATE 1st Nov 2020 ------

Two great solutions have been provided. Both work correctly and give identical results on a sample of 80,000 rows of messy real-world data.

Upvotes: 4

Views: 448

Answers (2)

gotqn

Reputation: 43656

Once the rows are numbered in time order, SELECT them in parts and combine the results using UNION ALL:

DECLARE @DataSource TABLE
(
    [AlarmNumber] INT
   ,[Time] DATETIME2(0)
   ,[AlarmState] INT
);

INSERT INTO @DataSource ([AlarmNumber], [Time], [AlarmState])
VALUES (1592, '2020-01-02 01:52:02', 1)
      ,(1592, '2020-01-02 01:58:07', 0)
      ,(1592, '2020-04-28 03:46:49', 1)
      ,(1592, '2020-04-28 06:19:10', 0)
      ,(1592, '2020-06-04 00:25:22', 1)
      ,(1592, '2020-08-27 01:57:03', 1)
      ,(1592, '2020-08-27 05:16:32', 0)
      ,(1592, '2020-09-17 02:51:57', 0);

-- Add a rowID column to the data
WITH DataSource AS
(
    SELECT * ,ROW_NUMBER() Over(Partition by AlarmNumber order by [Time]) rowID
    FROM @DataSource
)

-- This is just here so we can sort the result at the end
SELECT * FROM (

-- Select rows of DataSource where there is an ON and subsequent OFF event (DS1 Alarm is ON and DS2 Alarm is OFF)
-- This also catches where there is an ON, but no subsequent OFF (DS2.Time will be NULL)
    SELECT DS1.AlarmNumber
            ,DS1.Time As StartTime
            ,DS2.Time As EndTime
    FROM DataSource DS1
    LEFT JOIN DataSource DS2
        ON DS1.[rowID] = DS2.[rowID] - 1
        AND DS1.AlarmNumber = DS2.AlarmNumber
        AND DS2.[AlarmState] = 0
    WHERE DS1.[AlarmState] = 1

    UNION ALL

    -- Select rows of DataSource where there is an OFF and there is no matching ON (aka it turned OFF without ever turning ON)
    SELECT DS2.AlarmNumber
            ,NULL As StartTime
            ,DS2.Time As EndTime
    FROM DataSource DS2

    INNER JOIN DataSource DS1
        ON DS2.[rowID] -1 = DS1.[rowID]
        AND DS1.[AlarmState] = 0
        AND DS2.AlarmNumber = DS1.AlarmNumber
    
    WHERE DS2.[AlarmState] = 0

    UNION ALL

    -- Select rows of DataSource where the first event for this alarm number is an OFF (it would otherwise be missed by the above)
    SELECT DS1.AlarmNumber
            ,NULL As StartTime
            ,DS1.Time As EndTime
    FROM DataSource DS1
    WHERE DS1.[AlarmState] = 0 AND DS1.rowID = 1
) z
ORDER BY COALESCE(StartTime,EndTime), AlarmNumber
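
As a possible variation on the query above (not part of the original answer), the three UNION ALL branches can probably be folded into a single pass with LAG and LEAD, which need SQL Server 2012 or later. The column names PrevState, NextState and NextTime are made up for this sketch, and it assumes [Time] is unique within each AlarmNumber:

```sql
-- Sketch only: single-pass variant using LAG/LEAD instead of self-joins.
-- PrevState, NextState and NextTime are illustrative names.
DECLARE @DataSource TABLE
(
    [AlarmNumber] INT
   ,[Time] DATETIME2(0)
   ,[AlarmState] INT
);

INSERT INTO @DataSource ([AlarmNumber], [Time], [AlarmState])
VALUES (1592, '2020-01-02 01:52:02', 1)
      ,(1592, '2020-01-02 01:58:07', 0)
      ,(1592, '2020-04-28 03:46:49', 1)
      ,(1592, '2020-04-28 06:19:10', 0)
      ,(1592, '2020-06-04 00:25:22', 1)
      ,(1592, '2020-08-27 01:57:03', 1)
      ,(1592, '2020-08-27 05:16:32', 0)
      ,(1592, '2020-09-17 02:51:57', 0);

WITH Ordered AS
(
    -- Look one event back and one event ahead within each alarm
    SELECT AlarmNumber
          ,[Time]
          ,AlarmState
          ,LAG(AlarmState)  OVER (PARTITION BY AlarmNumber ORDER BY [Time]) AS PrevState
          ,LEAD(AlarmState) OVER (PARTITION BY AlarmNumber ORDER BY [Time]) AS NextState
          ,LEAD([Time])     OVER (PARTITION BY AlarmNumber ORDER BY [Time]) AS NextTime
    FROM @DataSource
)
SELECT AlarmNumber
      -- Only ON rows have a start time
      ,CASE WHEN AlarmState = 1 THEN [Time] END AS StartTime
      -- Orphan OFF rows end at their own time; ON rows end at the next
      -- event only if that event is an OFF, otherwise NULL
      ,CASE WHEN AlarmState = 0 THEN [Time]
            WHEN NextState = 0 THEN NextTime
       END AS EndTime
FROM Ordered
WHERE AlarmState = 1                                   -- every ON starts a row
   OR (AlarmState = 0
       AND (PrevState = 0 OR PrevState IS NULL))       -- OFF with no ON directly before it
ORDER BY [Time], AlarmNumber;
```

On the sample data this should return the same five rows as the desired output in the question. Every ON event produces a row, while an OFF event only produces its own row when the event directly before it was not an ON.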


Upvotes: 2

GMB

Reputation: 222622

A group is made of two consecutive rows, where the first has status 1 and the second has status 0. I would approach this with window functions, like this:

select 
    alarmnumber,
    max(case when alarmstate = 1 then time end) start_time,
    max(case when alarmstate = 0 then time end) end_time
from (
    select a.*, 
        sum(case when alarmstate = 0 and lag_alarmstate = 1 then 0 else 1 end)
            over(partition by alarmnumber order by time) grp
    from (
        select a.*, 
            lag(alarmstate) over(partition by alarmnumber order by time) lag_alarmstate
        from alarms a
    ) a
) a
group by alarmnumber, grp

This uses lag() to retrieve the "previous" state, and a cumulative sum to define the groups. The last step is conditional aggregation.
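
To see how the groups form, it may help to run only the two inner steps and look at the intermediate columns. The sketch below loads the question's sample rows into a table variable (@Alarms is just a stand-in for the real table) and lists lag_alarmstate and grp per event:

```sql
-- Sketch only: expose the intermediate columns of the query above.
-- @Alarms is a stand-in for the real Alarms table.
DECLARE @Alarms TABLE
(
    AlarmNumber INT
   ,[Time] DATETIME2(0)
   ,AlarmState INT
);

INSERT INTO @Alarms (AlarmNumber, [Time], AlarmState)
VALUES (1592, '2020-01-02 01:52:02', 1)
      ,(1592, '2020-01-02 01:58:07', 0)
      ,(1592, '2020-04-28 03:46:49', 1)
      ,(1592, '2020-04-28 06:19:10', 0)
      ,(1592, '2020-06-04 00:25:22', 1)
      ,(1592, '2020-08-27 01:57:03', 1)
      ,(1592, '2020-08-27 05:16:32', 0)
      ,(1592, '2020-09-17 02:51:57', 0);

SELECT AlarmNumber
      ,[Time]
      ,AlarmState
      ,lag_alarmstate
       -- The counter stays put only for an OFF that directly follows an ON,
       -- so that OFF shares a group number with its ON; anything else
       -- starts a new group.
      ,SUM(CASE WHEN AlarmState = 0 AND lag_alarmstate = 1 THEN 0 ELSE 1 END)
           OVER (PARTITION BY AlarmNumber ORDER BY [Time]) AS grp
FROM (
    SELECT a.AlarmNumber
          ,a.[Time]
          ,a.AlarmState
          ,LAG(a.AlarmState) OVER (PARTITION BY a.AlarmNumber ORDER BY a.[Time]) AS lag_alarmstate
    FROM @Alarms a
) a
ORDER BY AlarmNumber, [Time];

-- For the eight sample rows, in time order:
--   AlarmState      1     0  1  0  1  1  0  0
--   lag_alarmstate  NULL  1  0  1  0  1  1  0
--   grp             1     1  2  2  3  4  4  5
```

Groups 1, 2 and 4 hold an ON followed by its OFF. Group 3 holds a lone ON and group 5 a lone OFF, which is where the NULLs in the final result come from.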

Demo on DB Fiddle:

alarmnumber | start_time              | end_time               
:---------- | :---------------------- | :----------------------
1592        | 2020-01-02 01:52:02.000 | 2020-01-02 01:58:07.000
1592        | 2020-04-28 03:46:49.000 | 2020-04-28 06:19:10.000
1592        | 2020-06-04 00:25:22.000 | null                   
1592        | 2020-08-27 01:57:03.000 | 2020-08-27 05:16:32.000
1592        | null                    | 2020-09-17 02:51:57.000
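
Since the question asks for a view, the same query could be wrapped in one. This is a sketch under assumptions: the view name dbo.AlarmOnOffPairs is made up, and the Alarms table is assumed to live in the dbo schema with the column names shown in the question.

```sql
-- Sketch only: hypothetical view name; assumes dbo.Alarms(AlarmNumber, [Time], AlarmState).
CREATE VIEW dbo.AlarmOnOffPairs
AS
SELECT AlarmNumber
      ,MAX(CASE WHEN AlarmState = 1 THEN [Time] END) AS StartTime
      ,MAX(CASE WHEN AlarmState = 0 THEN [Time] END) AS EndTime
FROM (
    SELECT a.AlarmNumber
          ,a.[Time]
          ,a.AlarmState
           -- Running count of group starts (see the explanation above)
          ,SUM(CASE WHEN a.AlarmState = 0 AND a.lag_alarmstate = 1 THEN 0 ELSE 1 END)
               OVER (PARTITION BY a.AlarmNumber ORDER BY a.[Time]) AS grp
    FROM (
        SELECT AlarmNumber
              ,[Time]
              ,AlarmState
              ,LAG(AlarmState) OVER (PARTITION BY AlarmNumber ORDER BY [Time]) AS lag_alarmstate
        FROM dbo.Alarms
    ) a
) a
GROUP BY AlarmNumber, grp;
GO  -- batch separator for SSMS/sqlcmd; CREATE VIEW must be alone in its batch

-- Example usage: a view has no guaranteed order, so sort in the outer query
SELECT AlarmNumber, StartTime, EndTime
FROM dbo.AlarmOnOffPairs
WHERE AlarmNumber = 1592
ORDER BY COALESCE(StartTime, EndTime);
```

The ORDER BY is kept outside the view because SQL Server does not allow it in a view definition without TOP or OFFSET, and would not guarantee the order anyway.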

Upvotes: 3
