Reputation: 3
I have a table with structure:
id, timestamp, deviceId, datatype, measure
The value of column measure represents value of the datatype. For example, when processing starts then datatype is 19 and measure 1. When it is completed then datatype is still 19 with value 0 and a new row inserted with same timestamp, datatype 54 and value as some value. This means on completion the system is calling some trigger to update this table. Example data below
1001, 2013-01-02 09:20:00, 501, 19, 1
1005, 2013-01-02 10:00:00, 501, 19, 0
1006, 2013-01-02 10:00:00, 501, 54, 65
Timestamp of 1005 & 1006 is same, timestamp of 1001 is always less than that of 1005
1011, 2013-01-02 09:20:00, 601, 19, 1
1015, 2013-01-02 10:00:00, 601, 19, 0
1016, 2013-01-02 10:00:00, 601, 54, 105
Timestamp of 1015 & 1016 is same, timestamp of 1011 is always less than that of 1015
1021, 2013-01-02 09:20:00, 701, 19, 1
1022, 2013-01-02 10:00:00, 701, 19, 0
1023, 2013-01-02 10:00:00, 701, 54, 81
Timestamp of 1022 & 1023 is same, timestamp of 1021 is always less than that of 1022
This same process can be happening simultaneously for multiple devices.
Now the requirement is to find the start and end time for each completed transaction like
1006, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 501, 65
1016, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 601, 105
1023, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 701, 81
I am writing SQL queries after some 5 years and completely stuck. Any pointers/suggestions will be highly appreciated.
Thanks in advance
Upvotes: 0
Views: 412
Reputation: 69789
It is possible I am massively over simplifying the issue here, but I don't see any reason why for each record with a datatype 54 you can't just access the previous record for that device where the datatype is 19 and the measure is 1:
SELECT result.ID,
result.DeviceID,
MAX(start.Timestamp) StartTime,
result.Timestamp EndTime,
result.Measure
FROM T result
INNER JOIN T start
ON start.DeviceID = result.DeviceID
AND start.Timestamp < result.Timestamp
AND start.DataType = 19
AND start.Measure = 1
WHERE result.DataType = 54
GROUP BY result.ID, result.DeviceID, result.Timestamp, result.Measure
The only real difference is that rather than trying to solve the problem by starting at the beginning and working forward to the result, i have started with the result and worked backwards to the start. This will fail if the process runs simultaneously for the same device (i.e one transaction starts before the previous one has ended)
Upvotes: 0
Reputation: 1270623
My logic for this is a simple aggregation. However, the aggregation key is the "next" record that has datatype 54, with the same device id.
To get this next record, I'm using a correlated subquery in the where
clause:
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(device_id) as device_id,
MAX(case when id = next54 then measure end)
from (select t.*,
(select MIN(id) from t t2 where t2.id >= t.id and t2.datatype = 54 and t2.device_id = t.device_id) as next54
from t
) t
group by next54
The rest is aggregation.
Because I personally am not a big fan of correlated subqueries, you can also write this using the window functions (sometimes called analytic functions in Oracle):
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(device_id) as device_id,
MAX(case when id = next54 then measure end)
from (select t.*,
min(id54) over (partition by device_id order by id desc) as next54
from (select t.*,
(case when datatype = 54 then id end) as id54
from t
) t
) t
group by next54
The min
function with an order by
clause does a "cumulative" minimum. The result should be the same as the correlated subquery.
Upvotes: 0
Reputation: 125444
CREATE TABLE t
(id int, ts timestamp, deviceId int, datatype int, measure int)
;
INSERT INTO t
(id, ts, deviceId, datatype, measure)
VALUES
(1001, '2013-01-02 09:20:00', 501, 19, 1),
(1005, '2013-01-02 10:00:00', 501, 19, 0),
(1006, '2013-01-02 10:00:00', 501, 54, 65),
(1007, '2013-01-02 10:20:00', 501, 19, 1),
(1008, '2013-01-02 11:00:00', 501, 19, 0),
(1009, '2013-01-02 11:00:00', 501, 54, 65),
(1011, '2013-01-02 09:20:00', 601, 19, 1),
(1015, '2013-01-02 10:00:00', 601, 19, 0),
(1016, '2013-01-02 10:00:00', 601, 54, 105),
(1021, '2013-01-02 09:20:00', 701, 19, 1),
(1022, '2013-01-02 10:00:00', 701, 19, 0),
(1023, '2013-01-02 10:00:00', 701, 54, 81)
;
with parted as (
select floor((rn - 1) / 2.0) p, *
from (
select
row_number() over (partition by deviceId order by ts, datatype) rn,
id, ts, deviceId, dataType, measure
from t
where not(datatype = 19 and measure = 0)
) s
)
select
p1.id, p0.ts "start", p1.ts "end", p1.deviceId, p1.measure
from
parted p0
inner join
parted p1 on
p0.deviceId = p1.deviceId
and p0.p = p1.p
and p0.datatype = 19 and p1.datatype = 54
order by p1.id
;
id | start | end | deviceid | measure
------+---------------------+---------------------+----------+---------
1006 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 | 501 | 65
1009 | 2013-01-02 10:20:00 | 2013-01-02 11:00:00 | 501 | 65
1016 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 | 601 | 105
1023 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 | 701 | 81
Upvotes: 2