user14201215

Reputation: 63

Postgres - select only rows with changed column value

In PostgreSQL I need to select only the rows where the column "state" has changed (for each "device_id"). For example, when device1 has the states OK, OK, OK, ERR, OK in the source table, I want to select only OK, ERR, OK (adjacent duplicate states are not selected). Example below:

Source table:
ROW_ID  TIMESTAMP           DEVICE_ID   State
150683  2020-08-30 17:03:47 DEV1        OK
150683  2020-08-30 17:03:47 DEV2        OK
150574  2020-08-30 13:57:39 DEV1        OK
150573  2020-08-30 13:56:46 DEV2        ERR
150573  2020-08-30 13:56:46 DEV1        ERR
150566  2020-08-30 13:52:39 DEV1        OK
150306  2020-08-30 06:41:46 DEV2        OK
150280  2020-08-30 05:57:18 DEV2        OK
150280  2020-08-30 05:57:18 DEV1        OK
150257  2020-08-30 05:56:53 DEV1        OK 

Result of the select that I need:
ROW_ID  TIMESTAMP           DEVICE_ID   State
150683  2020-08-30 17:03:47 DEV2        OK
150574  2020-08-30 13:57:39 DEV1        OK
150573  2020-08-30 13:56:46 DEV2        ERR
150573  2020-08-30 13:56:46 DEV1        ERR
150280  2020-08-30 05:57:18 DEV2        OK
150257  2020-08-30 05:56:53 DEV1        OK

Upvotes: 6

Views: 1906

Answers (2)

You want to filter out every row that has the same state as its neighbors:

with s as (
  select timestamp, device_id, state
       , lag(state) over (partition by device_id order by timestamp) as predecessor
       , lead(state) over (partition by device_id order by timestamp) as successor
  from t
)
select timestamp, device_id, state
from s
where state is distinct from predecessor or state is distinct from successor

(just off the top of my head, I haven't tested it)

EDIT: I replaced the plain comparison with the is distinct from operator, since a plain comparison would behave incorrectly at the margins of the result set, where lag and lead return NULL (thanks to @LaurenzAlbe's answer)
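
To illustrate why the operator matters at the margins, here is a rough Python sketch that emulates the two SQL comparisons, with None standing in for NULL. The helper names sql_not_equal and is_distinct_from are made up for this illustration and are not part of any library.

```python
from typing import Optional


def sql_not_equal(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """Emulates SQL `a <> b`: the result is NULL (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a != b


def is_distinct_from(a: Optional[str], b: Optional[str]) -> bool:
    """Emulates SQL `a IS DISTINCT FROM b`: NULL is compared like an ordinary value."""
    # In Python, None != "OK" is True and None != None is False,
    # which matches the null-safe SQL semantics.
    return a != b


# First row of a partition: lag(state) has no predecessor, so it is NULL.
state, predecessor = "OK", None

# A WHERE clause keeps a row only when the condition is true; NULL is not true.
kept_with_plain_comparison = sql_not_equal(state, predecessor) is True
kept_with_distinct_from = is_distinct_from(state, predecessor) is True
```

With the plain comparison the first row of each device would silently drop out, because the condition evaluates to NULL rather than true. With is distinct from it is kept.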

Upvotes: 2

Laurenz Albe

Reputation: 246163

Use the power of the lag window function:

SELECT timestamp, device_id, state
FROM (SELECT timestamp, device_id, state,
             lag(state) OVER (PARTITION BY device_id
                              ORDER BY timestamp)
                AS prev_state
      FROM mytable) AS q
WHERE state IS DISTINCT FROM prev_state;
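
As a sanity check of the lag logic, here is a small Python sketch that walks the sample rows from the question the same way the window function would: partition by device, order by timestamp, and keep a row only when its state differs from the previous one. It is only an emulation for illustration, not how PostgreSQL executes the query, and the function name changed_rows is made up here.

```python
from itertools import groupby
from operator import itemgetter

# (row_id, timestamp, device_id, state), copied from the question's source table
rows = [
    (150683, "2020-08-30 17:03:47", "DEV1", "OK"),
    (150683, "2020-08-30 17:03:47", "DEV2", "OK"),
    (150574, "2020-08-30 13:57:39", "DEV1", "OK"),
    (150573, "2020-08-30 13:56:46", "DEV2", "ERR"),
    (150573, "2020-08-30 13:56:46", "DEV1", "ERR"),
    (150566, "2020-08-30 13:52:39", "DEV1", "OK"),
    (150306, "2020-08-30 06:41:46", "DEV2", "OK"),
    (150280, "2020-08-30 05:57:18", "DEV2", "OK"),
    (150280, "2020-08-30 05:57:18", "DEV1", "OK"),
    (150257, "2020-08-30 05:56:53", "DEV1", "OK"),
]


def changed_rows(rows):
    """Keep only rows whose state differs from the previous row of the same device."""
    result = []
    # PARTITION BY device_id ORDER BY timestamp
    # (ISO-formatted timestamps sort correctly as strings)
    ordered = sorted(rows, key=lambda r: (r[2], r[1]))
    for _, group in groupby(ordered, key=itemgetter(2)):
        prev_state = None  # lag() yields NULL for the first row of a partition
        for row in group:
            # None != "OK" is True, mirroring IS DISTINCT FROM
            if row[3] != prev_state:
                result.append(row)
            prev_state = row[3]
    return result


# Newest first, like the listing in the question
kept = sorted(changed_rows(rows), key=lambda r: (r[1], r[2]), reverse=True)
for row in kept:
    print(*row)
```

On the sample data this keeps the six rows listed as the desired result in the question, including the first row of each device, whose previous state is NULL.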

Upvotes: 7
