Atique
Atique

Reputation: 3

SQL Query based on complex business logic

I have a table with structure:

id, timestamp, deviceId, datatype, measure

The value of column measure represents value of the datatype. For example, when processing starts then datatype is 19 and measure 1. When it is completed then datatype is still 19 with value 0 and a new row inserted with same timestamp, datatype 54 and value as some value. This means on completion the system is calling some trigger to update this table. Example data below

1001, 2013-01-02 09:20:00, 501, 19, 1
1005, 2013-01-02 10:00:00, 501, 19, 0
1006, 2013-01-02 10:00:00, 501, 54, 65

Timestamp of 1005 & 1006 is same, timestamp of 1001 is always less than that of 1005

1011, 2013-01-02 09:20:00, 601, 19, 1
1015, 2013-01-02 10:00:00, 601, 19, 0
1016, 2013-01-02 10:00:00, 601, 54, 105

Timestamp of 1015 & 1016 is same, timestamp of 1011 is always less than that of 1015

1021, 2013-01-02 09:20:00, 701, 19, 1
1022, 2013-01-02 10:00:00, 701, 19, 0
1023, 2013-01-02 10:00:00, 701, 54, 81

Timestamp of 1022 & 1023 is same, timestamp of 1021 is always less than that of 1022

This same process can be happening simultaneously for multiple devices.

Now the requirement is to find the start and end time for each completed transaction like

1006, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 501, 65
1016, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 601, 105
1023, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 701, 81

I am writing SQL queries after some 5 years and completely stuck. Any pointers/suggestions will be highly appreciated.

Thanks in advance

Upvotes: 0

Views: 412

Answers (3)

GarethD
GarethD

Reputation: 69789

It is possible I am massively over simplifying the issue here, but I don't see any reason why for each record with a datatype 54 you can't just access the previous record for that device where the datatype is 19 and the measure is 1:

SELECT  result.ID, 
        result.DeviceID, 
        MAX(start.Timestamp) StartTime, 
        result.Timestamp EndTime, 
        result.Measure
FROM    T result
        INNER JOIN T start
            ON start.DeviceID = result.DeviceID
            AND start.Timestamp < result.Timestamp
            AND start.DataType = 19
            AND start.Measure = 1
WHERE   result.DataType = 54
GROUP BY result.ID, result.DeviceID, result.Timestamp, result.Measure

The only real difference is that rather than trying to solve the problem by starting at the beginning and working forward to the result, i have started with the result and worked backwards to the start. This will fail if the process runs simultaneously for the same device (i.e one transaction starts before the previous one has ended)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270623

My logic for this is a simple aggregation. However, the aggregation key is the "next" record that has datatype 54, with the same device id.

To get this next record, I'm using a correlated subquery in the where clause:

select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(device_id) as device_id,
       MAX(case when id = next54 then measure end)
from (select t.*,
             (select MIN(id) from t t2 where t2.id >= t.id and t2.datatype = 54 and t2.device_id = t.device_id) as next54
      from t
     ) t
group by next54

The rest is aggregation.

Because I personally am not a big fan of correlated subqueries, you can also write this using the window functions (sometimes called analytic functions in Oracle):

select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(device_id) as device_id,
       MAX(case when id = next54 then measure end)
from (select t.*,
             min(id54) over (partition by device_id order by id desc) as next54
       from (select t.*,
                    (case when datatype = 54 then id end) as id54
             from t
            ) t
     ) t
group by next54

The min function with an order by clause does a "cumulative" minimum. The result should be the same as the correlated subquery.

Upvotes: 0

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125444

SQL Fiddle

CREATE TABLE t
        (id int, ts timestamp, deviceId int, datatype int, measure int)
;

INSERT INTO t
        (id, ts, deviceId, datatype, measure)
VALUES
        (1001, '2013-01-02 09:20:00', 501, 19, 1),
        (1005, '2013-01-02 10:00:00', 501, 19, 0),
        (1006, '2013-01-02 10:00:00', 501, 54, 65),
        (1007, '2013-01-02 10:20:00', 501, 19, 1),
        (1008, '2013-01-02 11:00:00', 501, 19, 0),
        (1009, '2013-01-02 11:00:00', 501, 54, 65),
        (1011, '2013-01-02 09:20:00', 601, 19, 1),
        (1015, '2013-01-02 10:00:00', 601, 19, 0),
        (1016, '2013-01-02 10:00:00', 601, 54, 105),
        (1021, '2013-01-02 09:20:00', 701, 19, 1),
        (1022, '2013-01-02 10:00:00', 701, 19, 0),
        (1023, '2013-01-02 10:00:00', 701, 54, 81)
;

with parted as (
    select floor((rn - 1) / 2.0) p, *
    from (
        select
            row_number() over (partition by deviceId order  by ts, datatype) rn,
            id, ts, deviceId, dataType, measure
        from t
        where not(datatype = 19 and measure = 0)
    ) s
)
select
    p1.id, p0.ts "start", p1.ts "end", p1.deviceId, p1.measure
from
    parted p0
    inner join
    parted p1 on
        p0.deviceId = p1.deviceId
        and p0.p = p1.p
        and p0.datatype = 19 and p1.datatype = 54
order by p1.id
;
  id  |        start        |         end         | deviceid | measure 
------+---------------------+---------------------+----------+---------
 1006 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      501 |      65
 1009 | 2013-01-02 10:20:00 | 2013-01-02 11:00:00 |      501 |      65
 1016 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      601 |     105
 1023 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      701 |      81

Upvotes: 2

Related Questions