Alex Bartsmon
Alex Bartsmon

Reputation: 550

SQL Conditional Look Ahead

I would like to write a query that identifies the "next" value in ordered set that satisfies a condition. LEAD/LAG analytic functions don't seem applicable here as the number of rows to look ahead is variable (not fixed) based upon the condition. The below example shows the desired result (column gnme) from the sample table (tbl), but the solution seems non-ideal. Was hoping someone here may have a more elegant solution for such a problem. Thanks in advance.

Notice how in this example rows 1-3 identify the nme mike in row 4, and rows 6-7 identify the nme michael in row 8.

create table tbl (
  id number
  ,nme varchar(255)
)
;

insert into tbl (id, nme) values (1,'unknown');
insert into tbl (id, nme) values (2,'unknown');
insert into tbl (id, nme) values (3,'unknown');
insert into tbl (id, nme) values (4,'mike');
insert into tbl (id, nme) values (5,'mike');
insert into tbl (id, nme) values (6,'unknown');
insert into tbl (id, nme) values (7,'unknown');
insert into tbl (id, nme) values (8,'michael');
insert into tbl (id, nme) values (9,'michael');
insert into tbl (id, nme) values (10,'michael');
insert into tbl (id, nme) values (11,'unknown');
 
select
  id
  ,nme
  ,CASE WHEN nme = 'unknown' THEN 
          NVL
          (
           (SELECT b.nme 
            FROM tbl b 
            WHERE 
              b.nme <> 'unknown'
              AND a.id < b.id 
            ORDER BY id 
            OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY
           )
           , nme
          ) 
        ELSE nme 
        END AS gnme
FROM
  tbl a
;

+----+---------+---------+
| id | nme     | gnme    |
+----+---------+---------+
| 1  | unknown | mike    |
+----+---------+---------+
| 2  | unknown | mike    |
+----+---------+---------+
| 3  | unknown | mike    |
+----+---------+---------+
| 4  | mike    | mike    |
+----+---------+---------+
| 5  | mike    | mike    |
+----+---------+---------+
| 6  | unknown | michael |
+----+---------+---------+
| 7  | unknown | michael |
+----+---------+---------+
| 8  | michael | michael |
+----+---------+---------+
| 9  | michael | michael |
+----+---------+---------+
| 10 | michael | michael |
+----+---------+---------+
| 11 | unknown | unknown |
+----+---------+---------+

Upvotes: 3

Views: 838

Answers (3)

marcothesane
marcothesane

Reputation: 6749

You can also just use LAST_VALUE() with IGNORE NULLS:

WITH
-- your input
tbl(id,nme) AS (
          SELECT 1,'unknown'
UNION ALL SELECT 2,'unknown'
UNION ALL SELECT 3,'unknown'
UNION ALL SELECT 4,'mike'
UNION ALL SELECT 5,'mike'
UNION ALL SELECT 6,'unknown'
UNION ALL SELECT 7,'unknown'
UNION ALL SELECT 8,'michael'
UNION ALL SELECT 9,'michael'
UNION ALL SELECT 10,'michael'
UNION ALL SELECT 11,'unknown'
)
SELECT
  *
, NVL(
    LAST_VALUE(NULLIF(nme,'unknown') IGNORE NULLS) OVER(
      ORDER BY id DESC
    )
  , 'unknown'
  ) AS gnme
FROM tbl
ORDER BY id;
-- out  id |   nme   |  gnme   
-- out ----+---------+---------
-- out   1 | unknown | mike
-- out   2 | unknown | mike
-- out   3 | unknown | mike
-- out   4 | mike    | mike
-- out   5 | mike    | mike
-- out   6 | unknown | michael
-- out   7 | unknown | michael
-- out   8 | michael | michael
-- out   9 | michael | michael
-- out  10 | michael | michael
-- out  11 | unknown | unknown

Upvotes: 1

Sayan Malakshinov
Sayan Malakshinov

Reputation: 8655

You can use first_value analytic function:

select
  id
  ,nme
  ,nvl(
        first_value(nullif(nme,'unknown') ignore nulls)over(order by id ROWS between current row and unbounded following) 
        ,'unknown')
        AS gnme
FROM
  tbl a
;

Full example to compare:

select
  id
  ,nme
  ,CASE WHEN nme = 'unknown' THEN 
          NVL
          (
           (SELECT b.nme 
            FROM tbl b 
            WHERE 
              b.nme <> 'unknown'
              AND a.id < b.id 
            ORDER BY id 
            OFFSET 0 ROWS FETCH NEXT 1 ROW ONLY
           )
           , nme
          ) 
        ELSE nme 
        END AS gnme
    ,nvl(
        first_value(nullif(nme,'unknown') ignore nulls)over(order by id ROWS between current row and unbounded following) 
        ,'unknown')
        AS gnme_2
FROM
  tbl a
;

Results:

        ID NME        GNME       GNME_2
---------- ---------- ---------- ----------
         1 unknown    mike       mike
         2 unknown    mike       mike
         3 unknown    mike       mike
         4 mike       mike       mike
         5 mike       mike       mike
         6 unknown    michael    michael
         7 unknown    michael    michael
         8 michael    michael    michael
         9 michael    michael    michael
        10 michael    michael    michael
        11 unknown    unknown    unknown

11 rows selected.

Upvotes: 1

GMB
GMB

Reputation: 222582

When a name is unknown, you want the next non-unknow name.

Oracle is one of the rare databases that support the ignore nulls options to window functions lead() and lag(). This is a powerful feature, that comes handy for your use case:

select 
   id,
   nme,
   case when nme = 'unknown'
       then lead(nullif(nme,'unknown') ignore nulls, 1, 'unknown') over(order by id)
       else nme
   end gnme
from tbl

The case expression within lead() turns value 'unknow' to null, then the function brings the next non null value (an defaults to unknown if there is none available).

Upvotes: 1

Related Questions