Hey StackExchange
Hey StackExchange

Reputation: 2125

Update with lag

I would like to set the ACTIVE value of a table as follow:

and so on and so forth.

Example

ID | FLAG | ACTIVE
---+------+-------
1  | E    | 1
2  | V    | 1
3  | H    | 0
4  | V    | 0
5  | E    | 1
6  | S    | 1
7  | V    | 1
8  | D    | 1
9  | H    | 0

The value are ordered by date. For simplicity, I added an ID column to get the column order.

Question

What can be the SQL update statement ?

Note:

The business rule can also be expressed as follow:

If for a given row, the count of preceding E - the count of preceding H is 1, then ACTIVE is 1 for this row, 0 otherwise.

Upvotes: 0

Views: 1228

Answers (1)

Alex Poole
Alex Poole

Reputation: 191425

You can get the active value with the last_value() analytic function:

select id, flag,
  last_value(case when flag = 'E' then 1 when flag = 'H' then 0 end) ignore nulls
    over (order by id) as active
from your_table;

As a demo:

create table your_table (id, flag) as
          select 1, 'E' from dual
union all select 2, 'V' from dual
union all select 3, 'H' from dual
union all select 4, 'V' from dual
union all select 5, 'E' from dual
union all select 6, 'S' from dual
union all select 7, 'V' from dual
union all select 8, 'D' from dual
union all select 9, 'H' from dual;

select id, flag,
  last_value(case when flag = 'E' then 1 when flag = 'H' then 0 end) ignore nulls
    over (order by id) as active
from your_table;

        ID F     ACTIVE
---------- - ----------
         1 E          1
         2 V          1
         3 H          0
         4 V          0
         5 E          1
         6 S          1
         7 V          1
         8 D          1
         9 H          0

You can use the same thing for an update, though a merge is probably going to be simpler:

alter table your_table add active number;

merge into your_table
using (
  select id,
    last_value(case when flag = 'E' then 1 when flag = 'H' then 0 end) ignore nulls
      over (order by id) as active
  from your_table
) tmp
on (your_table.id = tmp.id)
when matched then update set active = tmp.active;

9 rows merged.

select * from your_table;

        ID F     ACTIVE
---------- - ----------
         1 E          1
         2 V          1
         3 H          0
         4 V          0
         5 E          1
         6 S          1
         7 V          1
         8 D          1
         9 H          0

db<>fiddle demo.


You said your real data is actually ordered by a date, and I guess there are multiple flags for each of multiple IDs, so something like this is probably more realistic:

create table your_table (id, flag_time, flag) as
          select 1, timestamp '2018-07-04 00:00:00', 'E' from dual
union all select 1, timestamp '2018-07-04 00:00:01', 'V' from dual
union all select 1, timestamp '2018-07-04 00:00:02', 'H' from dual
union all select 1, timestamp '2018-07-04 00:00:03', 'V' from dual
union all select 1, timestamp '2018-07-04 00:00:04', 'E' from dual
union all select 1, timestamp '2018-07-04 00:00:05', 'S' from dual
union all select 1, timestamp '2018-07-04 00:00:06', 'V' from dual
union all select 1, timestamp '2018-07-04 00:00:07', 'D' from dual
union all select 1, timestamp '2018-07-04 00:00:08', 'H' from dual;

alter table your_table add active number;

merge into your_table
using (
  select id, flag_time,
    last_value(case when flag = 'E' then 1 when flag = 'H' then 0 end) ignore nulls
      over (partition by id order by flag_time) as active
  from your_table
) tmp
on (your_table.id = tmp.id and your_table.flag_time = tmp.flag_time)
when matched then update set active = tmp.active;

select * from your_table;

        ID FLAG_TIME               F     ACTIVE
---------- ----------------------- - ----------
         1 2018-07-04 00:00:00.000 E          1
         1 2018-07-04 00:00:01.000 V          1
         1 2018-07-04 00:00:02.000 H          0
         1 2018-07-04 00:00:03.000 V          0
         1 2018-07-04 00:00:04.000 E          1
         1 2018-07-04 00:00:05.000 S          1
         1 2018-07-04 00:00:06.000 V          1
         1 2018-07-04 00:00:07.000 D          1
         1 2018-07-04 00:00:08.000 H          0

The main difference is the partition by id and changing the ordering to use flag_time - or whatever your real columns are called.

db<>fiddle demo.


There is potentially an issue if two flags can share a time; with a timestamp column that's hopefully very unlikely, but with a date the precision of the column may allow it. There isn't much you can do about that though, except maybe get into some logic to break ties by assuming flags should arrive in a certain order, and give them a weighting based on that. Rather off-topic though.

Upvotes: 1

Related Questions