Larx
Larx

Reputation: 111

How to select the pair of rows associated with status change

   Date           ID       X or Y 
-------------------------------
01.01.2016       1234        Y
01.01.2017       1234        X
01.01.2018       1234        Y
01.01.2019       1234        Y
01.01.2020       1234        Y
01.01.2021       1234        X
01.01.2016       4321        X
01.01.2017       4321        X
01.01.2018       4321        X
01.01.2019       4321        Y
01.01.2020       4321        Y

The above table shows the structure of the data I'm using. What I want to do is reducing it to another table where I only have the rows associated with the change in X/Y status; however, I do not only need the first observation after X becomes Y (or vice versa), but also the last observation before the change. How can I achieve the output that looks exactly like the below table with SQL running on Oracle database?

   Date           ID       X or Y 
-------------------------------
01.01.2016       1234        Y
01.01.2017       1234        X
01.01.2018       1234        Y
01.01.2020       1234        Y
01.01.2021       1234        X
01.01.2018       4321        X
01.01.2019       4321        Y

Upvotes: 0

Views: 59

Answers (2)

Barbaros Özhan
Barbaros Özhan

Reputation: 65373

Seems you need to use LEAD() and LAG() function together to filter them out :

WITH t2 AS
(
SELECT t.*,
       LAG(x_y,1,x_y) OVER (PARTITION BY id ORDER BY id, dt) AS lg_xy,
       LEAD(x_y,1,x_y) OVER (PARTITION BY id ORDER BY id, dt) AS ld_xy
  FROM t
 ORDER BY id, dt 
) 
SELECT dt, id, x_y
  FROM t2
 WHERE NOT ( x_y = lg_xy AND x_y = ld_xy )

Demo

Upvotes: 2

Littlefoot
Littlefoot

Reputation: 143023

Here's one option:

  • sample data from line #1 - 14
  • TEMP CTE: previous (LAG) and next (LEAD) x/y values per ID, sorted by date value
  • final select retrieves the result

SQL> with test (datum, id, xy) as
  2    (select date '2016-01-01', 1234, 'y' from dual union all
  3     select date '2017-01-01', 1234, 'x' from dual union all
  4     select date '2018-01-01', 1234, 'y' from dual union all
  5     select date '2019-01-01', 1234, 'y' from dual union all
  6     select date '2020-01-01', 1234, 'y' from dual union all
  7     select date '2021-01-01', 1234, 'x' from dual union all
  8     --
  9     select date '2016-01-01', 4321, 'x' from dual union all
 10     select date '2017-01-01', 4321, 'x' from dual union all
 11     select date '2018-01-01', 4321, 'x' from dual union all
 12     select date '2019-01-01', 4321, 'y' from dual union all
 13     select date '2020-01-01', 4321, 'y' from dual
 14    ),
 15  temp as
 16    (select datum, id, xy,
 17       lag(xy)  over (partition by id order by datum) laxy,
 18       lead(xy) over (partition by id order by datum) lexy
 19     from test
 20    )
 21  --
 22  select datum, id, xy
 23  from temp
 24  where xy <> laxy or xy <> lexy
 25  order by id, datum;

DATUM              ID X
---------- ---------- -
01.01.2016       1234 y
01.01.2017       1234 x
01.01.2018       1234 y
01.01.2020       1234 y
01.01.2021       1234 x
01.01.2018       4321 x
01.01.2019       4321 y

7 rows selected.

SQL>

Upvotes: 3

Related Questions