kimi
kimi

Reputation: 525

How to get first row of each status?

I want to get the first row of each status for each id.

There can be multiple rows for each status. So I want to get the first occurrence of each status based on the previous status.

e.g. info_required first occurs at row 2, then it changes to another status pending at row 4, and then info_required again at row 6. Likewise, status pending first at row 4, then at row 8 since the status changed after row4, it needs to be in the resultset.

Hence below I want to get the row number 1, 2, 4, 6, and 8.

enter image description here

WITH t1 AS (
SELECT 1 AS row, 'A' AS id, 'created' AS status, '2021-05-18 18:30:00'::timestamp AS created_at UNION ALL
SELECT 2 AS row, 'A' AS id, 'info_required' AS status, '2021-05-19 11:30:00'::timestamp AS created_at UNION ALL
SELECT 3 AS row, 'A' AS id, 'info_required' AS status, '2021-05-19 12:00:00'::timestamp AS created_at UNION ALL
SELECT 4 AS row, 'A' AS id, 'pending' AS status, '2021-05-19 12:30:00'::timestamp AS created_at UNION ALL
SELECT 5 AS row, 'A' AS id, 'pending' AS status, '2021-05-20 13:30:00'::timestamp AS created_at UNION ALL
SELECT 6 AS row, 'A' AS id, 'info_required' AS status, '2021-05-20 14:30:00'::timestamp AS created_at UNION ALL
SELECT 7 AS row, 'A' AS id, 'info_required' AS status, '2021-05-20 15:30:00'::timestamp AS created_at UNION ALL
SELECT 8 AS row, 'A' AS id, 'pending' AS status, '2021-05-20 16:30:00'::timestamp AS created_at
    )
SELECT *
FROM t1

Upvotes: 0

Views: 305

Answers (2)

Lukasz Szozda
Lukasz Szozda

Reputation: 175556

Using CONDITIONAL_CHANGE_EVENT

WITH cte AS (
  SELECT *, CONDITIONAL_CHANGE_EVENT(status) over (partition by id 
                                                   order by created_at) AS cce
  FROM t1
)
SELECT *
FROM cte
QUALIFY ROW_NUMBER() OVER(PARTITION BY id, cce ORDER BY created_at) = 1;

enter image description here


Data preparation:

CREATE TABLE t1 AS 
WITH t1 AS (
SELECT 1 AS row_, 'A' AS id, 'created' AS status, '2021-05-18 18:30:00'::timestamp AS created_at UNION ALL
SELECT 2 AS row_, 'A' AS id, 'info_required' AS status, '2021-05-19 11:30:00'::timestamp AS created_at UNION ALL
SELECT 3 AS row_, 'A' AS id, 'info_required' AS status, '2021-05-19 12:00:00'::timestamp AS created_at UNION ALL
SELECT 4 AS row_, 'A' AS id, 'pending' AS status, '2021-05-19 12:30:00'::timestamp AS created_at UNION ALL
SELECT 5 AS row_, 'A' AS id, 'pending' AS status, '2021-05-20 13:30:00'::timestamp AS created_at UNION ALL
SELECT 6 AS row_, 'A' AS id, 'info_required' AS status, '2021-05-20 14:30:00'::timestamp AS created_at UNION ALL
SELECT 7 AS row_, 'A' AS id, 'info_required' AS status, '2021-05-20 15:30:00'::timestamp AS created_at UNION ALL
SELECT 8 AS row_, 'A' AS id, 'pending' AS status, '2021-05-20 16:30:00'::timestamp AS created_at
)
SELECT *
FROM t1;

Cte part:

SELECT *, CONDITIONAL_CHANGE_EVENT(status) over (partition by id 
                                              order by created_at) AS cce
FROM t1;

enter image description here

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269443

You can use lag() and qualify():

select t.*
from t
qualify lag(status) over (partition by id order by created_at) is distinct from status;

Upvotes: 1

Related Questions