Yoorizz
Yoorizz

Reputation: 217

How to transform a range of records to the values of the record after that range in SQL?

I am trying to replace some bad input records within a specific date range with correct records. However, I'm not sure if there is an efficient way to do so. Therefore my question is how to transform a (static) range of records to the values of the record after that range in SQL? Below you will find an example to clarify what I try to achieve.

enter image description here

In this example you can see that customer number 1 belongs to group number 0 (None) in the period from 25-06-2020 to 29-06-2020. From 30-06-2020 to 05-07-2020 this group number changes from 0 to 11 for customer number 1. This static period contains the wrong records, and should be changed to the values that are valid on 06-07-2020 (group number == 10). Is there a way to do this?

Upvotes: 1

Views: 54

Answers (3)

George Joseph
George Joseph

Reputation: 5932

You can do the following as an example. Here i have choosen the criteria that if role='Leader' its a bad record and therefore you would be applying the next available group_number --> in column group_number1, and role1.

I have used a smaller subset of the rows you have in your excel example.

  select date1
        ,customer_number
        ,group_number
        ,case when role='Leader' then 
                  (select t1.group_number
                     from t t1
                    where t1.date1>t.date1
                      and t1.role<>'Leader'
                   order by t1.date1 asc
                   limit 1
                  ) 
             else group_number 
         end as group_number1
        ,role
       ,case when role='Leader' then 
                  (select t1.role
                     from t t1
                    where t1.date1>t.date1
                      and t1.role<>'Leader'
                   order by t1.date1 asc
                   limit 1
                  ) 
             else role 
         end as role1
   from t
order by 1   

+------------+-----------------+--------------+---------------+--------+--------+
|   DATE1    | CUSTOMER_NUMBER | GROUP_NUMBER | GROUP_NUMBER1 |  ROLE  | ROLE1  |
+------------+-----------------+--------------+---------------+--------+--------+
| 2020-06-25 |               1 |            0 |             0 | None   | None   |
| 2020-06-26 |               1 |            0 |             0 | None   | None   |
| 2020-06-27 |               1 |            0 |             0 | None   | None   |
| 2020-06-28 |               1 |            0 |             0 | None   | None   |
| 2020-06-29 |               1 |            0 |             0 | None   | None   |
| 2020-06-30 |               1 |           11 |            10 | Leader | Member |
| 2020-07-01 |               1 |           11 |            10 | Leader | Member |
| 2020-07-06 |               1 |           10 |            10 | Member | Member |
+------------+-----------------+--------------+---------------+--------+--------+

db fiddle link https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=c95d12ced067c1df94947848b5a94c14

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270463

If I understand correctly, you can use window functions to get the data on that particular date and case logic to assign it to the specific date range:

select t.*,
       (case when date >= '2020-07-01' and date <= '2020-07-05'
             then max(case when date = '2020-07-06' then group_number end) over (partition by customer_number)
             else group_number
        end) as imputed_group_number,
       (case when date >= '2020-07-01' and date <= '2020-07-05'
             then max(case when date = '2020-07-06' then role end) over (partition by customer_number)
             else role
        end) as imputed_role
from t;

If you want to update the values, you can use JOIN:

update t
    set group_number = tt.group_number,
        role = tt.role
    from tt
    where tt.customer_number = t.customer_number and tt.date = '2020-07-06'

Upvotes: 1

GMB
GMB

Reputation: 222582

I think that window function first_value() does what you want:

select 
    date,
    customer_number,
    first_value(group_number) over(partition by customer_number order by date) group_number,
    first_value(role)         over(partition by customer_number order by date) role
from mytable

Upvotes: 1

Related Questions