Reputation: 5
I have machine output data in the form:
DATETIME          ID     VALUE
8-28-20 20:55:10  part1  13
8-28-20 20:56:60  part1  20
8-28-20 20:57:22  part1  25
8-28-20 20:59:39  part2  9
8-28-20 21:10:55  part3  33
8-28-20 21:14:30  part1  14
and I need to generate a new table by removing some rows:
DATETIME          ID     VALUE
8-28-20 20:57:22  part1  25
8-28-20 20:59:39  part2  9
8-28-20 21:10:55  part3  33
8-28-20 21:14:30  part1  14
The machine sometimes collects multiple VALUEs for each run, but I only need the last one (it's cumulative). However, I may have multiple runs of the same ID per shift, and it's not impossible to have 2 consecutive runs of the same ID.
Is it possible with SQL to filter out every row that is followed by a row with the same ID and a greater VALUE, so that only the last row of each run remains?
There are some similar questions posted here, but they all group rows and take a max value, which would capture only one run per ID for each time period.
Upvotes: 0
Views: 238
Reputation: 6749
A slightly more generic approach, which also shows how to derive a session id without a dedicated OLAP function for it:
WITH
-- your input
input(dttm,id,value) AS (
            SELECT TIMESTAMP '2020-08-28 20:55:10','part1',13
  UNION ALL SELECT TIMESTAMP '2020-08-28 20:56:60','part1',20
  UNION ALL SELECT TIMESTAMP '2020-08-28 20:57:22','part1',25
  UNION ALL SELECT TIMESTAMP '2020-08-28 20:59:39','part2',9
  UNION ALL SELECT TIMESTAMP '2020-08-28 21:10:55','part3',33
  UNION ALL SELECT TIMESTAMP '2020-08-28 21:14:30','part1',14
)
,
-- add a counter that is at 1 whenever the id changes over time
with_chg AS (
  SELECT
    CASE
      WHEN LAG(id) OVER(ORDER BY dttm) <> id THEN 1
      ELSE 0
    END AS chg_count
  , *
  FROM input
)
,
-- use the running sum of that change counter to get a session id
with_session AS (
  SELECT
    SUM(chg_count) OVER(ORDER BY dttm) AS session_id
  , dttm
  , id
  , value
  FROM with_chg
)
,
-- partition by the session id, order by datetime descending to get
-- the row number of 1 for the right row
with_rownum AS (
  SELECT
    ROW_NUMBER() OVER(PARTITION BY session_id ORDER BY dttm DESC) AS rownum
  , dttm
  , id
  , value
  FROM with_session
)
-- finally, filter by row number 1 and order back by datetime
SELECT
  dttm
, id
, value
FROM with_rownum
WHERE rownum = 1
ORDER BY 1
;
-- out dttm | id | value
-- out ---------------------+-------+-------
-- out 2020-08-28 20:57:22 | part1 | 25
-- out 2020-08-28 20:59:39 | part2 | 9
-- out 2020-08-28 21:10:55 | part3 | 33
-- out 2020-08-28 21:14:30 | part1 | 14
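To make the change-counter and running-sum steps easier to follow, here is a minimal sketch that runs just those two CTEs against the sample data and returns the intermediate session_id per row. It is not the answer's own setup: it assumes Python's bundled sqlite3 module with SQLite 3.25 or later (needed for window functions), and it stores the timestamps as text, which still sorts chronologically for this format. The function name session_rows is hypothetical.

```python
import sqlite3

# Sample rows from the question, stored as text so "20:56:60" is kept verbatim
# (assumption: these strings sort in chronological order, which holds here).
SAMPLE = [
    ("2020-08-28 20:55:10", "part1", 13),
    ("2020-08-28 20:56:60", "part1", 20),
    ("2020-08-28 20:57:22", "part1", 25),
    ("2020-08-28 20:59:39", "part2", 9),
    ("2020-08-28 21:10:55", "part3", 33),
    ("2020-08-28 21:14:30", "part1", 14),
]

# Only the first two steps of the query: flag id changes, then sum the flags.
SESSION_SQL = """
WITH with_chg AS (
  SELECT dttm, id, value,
         CASE WHEN LAG(id) OVER (ORDER BY dttm) <> id THEN 1 ELSE 0 END AS chg_count
  FROM input
)
SELECT SUM(chg_count) OVER (ORDER BY dttm) AS session_id, dttm, id, value
FROM with_chg
ORDER BY dttm
"""


def session_rows(rows):
    """Return (session_id, dttm, id, value) tuples in time order."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE input (dttm TEXT, id TEXT, value INTEGER)")
        conn.executemany("INSERT INTO input VALUES (?, ?, ?)", rows)
        return conn.execute(SESSION_SQL).fetchall()
    finally:
        conn.close()


for row in session_rows(SAMPLE):
    print(row)
```

The first three part1 rows share session_id 0, and each later change of id starts a new session (1, 2, 3), so the final part1 row lands in its own session rather than being merged with the first run.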
Upvotes: 1
Reputation: 1270793
You seem to want the last row of each run, that is, the rows where the next row has a different id or a smaller value:
select t.*
from (select t.*,
             lead(id) over (order by datetime) as next_id,
             lead(value) over (order by datetime) as next_value
      from t
     ) t
where next_id is null or next_id <> id or
      (next_id = id and next_value < value)
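As a quick check of the lead() query, the sketch below runs it against the question's sample rows and against a second, made-up data set in which two runs of the same id follow each other directly. It assumes Python's sqlite3 module with SQLite 3.25 or later and timestamps stored as text; the helper name last_row_per_run and the BACK_TO_BACK rows are hypothetical. Detecting the second case relies on the question's statement that VALUE is cumulative, so a drop marks a new run.

```python
import sqlite3

# Sample rows from the question (timestamps kept as sortable text).
SAMPLE = [
    ("2020-08-28 20:55:10", "part1", 13),
    ("2020-08-28 20:56:60", "part1", 20),
    ("2020-08-28 20:57:22", "part1", 25),
    ("2020-08-28 20:59:39", "part2", 9),
    ("2020-08-28 21:10:55", "part3", 33),
    ("2020-08-28 21:14:30", "part1", 14),
]

# Hypothetical rows: two consecutive runs of part1, the value resets at 20:10.
BACK_TO_BACK = [
    ("2020-08-28 20:00:00", "part1", 13),
    ("2020-08-28 20:05:00", "part1", 20),
    ("2020-08-28 20:10:00", "part1", 5),
    ("2020-08-28 20:15:00", "part1", 9),
]

LEAD_SQL = """
SELECT datetime, id, value
FROM (SELECT t.*,
             LEAD(id)    OVER (ORDER BY datetime) AS next_id,
             LEAD(value) OVER (ORDER BY datetime) AS next_value
      FROM t) AS t
WHERE next_id IS NULL OR next_id <> id
   OR (next_id = id AND next_value < value)
ORDER BY datetime
"""


def last_row_per_run(rows):
    """Keep rows whose successor has another id, a smaller value, or does not exist."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (datetime TEXT, id TEXT, value INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(LEAD_SQL).fetchall()
    finally:
        conn.close()


print(last_row_per_run(SAMPLE))
print(last_row_per_run(BACK_TO_BACK))
```

On the sample data this returns the four rows the question asks for; on the back-to-back rows it keeps the values 20 and 9, one per run.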
Upvotes: 0
Reputation: 37483
You can try the following, which uses row_number() partitioned by hour and id:
select * from
(
    select *,
           row_number() over(partition by dateadd(hour, datediff(hour, 0, DATETIME), 0), id order by DATETIME desc) as rn
    from tablename
) A where rn=1
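To see how partitioning by hour behaves, here is a rough sketch of the same idea run through Python's sqlite3 module (SQLite 3.25 or later assumed). SQLite has no dateadd/datediff, so substr(datetime, 1, 13) is swapped in as the hour bucket; that substitution, the helper name last_row_per_hour_and_id, and the TWO_RUNS_ONE_HOUR rows are assumptions made for illustration only.

```python
import sqlite3

# Sample rows from the question (timestamps kept as sortable text).
SAMPLE = [
    ("2020-08-28 20:55:10", "part1", 13),
    ("2020-08-28 20:56:60", "part1", 20),
    ("2020-08-28 20:57:22", "part1", 25),
    ("2020-08-28 20:59:39", "part2", 9),
    ("2020-08-28 21:10:55", "part3", 33),
    ("2020-08-28 21:14:30", "part1", 14),
]

# Hypothetical rows: part1 runs twice within the same hour, with part2 in between.
TWO_RUNS_ONE_HOUR = [
    ("2020-08-28 20:05:00", "part1", 13),
    ("2020-08-28 20:20:00", "part2", 9),
    ("2020-08-28 20:40:00", "part1", 4),
]

# substr(datetime, 1, 13) yields 'YYYY-MM-DD HH', standing in for the
# dateadd/datediff hour truncation used in the SQL Server version.
HOURLY_SQL = """
SELECT datetime, id, value
FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY substr(datetime, 1, 13), id
           ORDER BY datetime DESC
         ) AS rn
  FROM tablename
) AS a
WHERE rn = 1
ORDER BY datetime
"""


def last_row_per_hour_and_id(rows):
    """Keep the latest row for each (hour, id) pair."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE tablename (datetime TEXT, id TEXT, value INTEGER)")
        conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", rows)
        return conn.execute(HOURLY_SQL).fetchall()
    finally:
        conn.close()


print(last_row_per_hour_and_id(SAMPLE))
print(last_row_per_hour_and_id(TWO_RUNS_ONE_HOUR))
```

On the question's sample data this gives the four expected rows. On the second data set it returns only two rows, because both part1 runs fall into the same (hour, id) partition, so this approach fits only if an id never runs more than once per hour.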
Upvotes: 0