Hawtinjr
Hawtinjr

Reputation: 49

TSQL - SQL Server fill Value with Value of previous filled date

I am trying to da a tricky thing in my fact table. My customer wants to fill every empty value in the fact table with the value of the previous filled date.

Fill Values with previous dates

It gets complicated, when I need to consinder, that it should only update the Values, if the context of the dimensions is the same.

This is the expected result:

expected Result

We're Using Microsoft Analysis Services so maybe this can also be done with MDX and LastNonEmpty, but for now it would be finde to have a working update Statement for the fact table. (Fact table with 0 values is generated by using the actual fact table cross joining all possible day combinations).

Thank you guys!

Upvotes: 0

Views: 52

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270773

You can do this with apply:

select t.dim1, t.dim2, t.dimdate, t2.value
from t outer apply
     (select top 1 t2.*
      from t t2
      where t2.dim1 = t.dim1 and t2.dim2 = t.dim2 and
            t2.dimdate <= t.dimdate and t2.value <> 0
      order by t2.dimdate desc
     ) t2;

SQL does offer functionality to do what you want using window functions. It is the IGNORE NULLS option on LAG(). Unfortunately, SQL Server does not implement this option.

You can do this with window functions, but it is a bit more cumbersome:

select t.dim1, t.dim2, t.dimdate,
       max(value) over (partition by dim1, dim2, effectivedimdate) as value
from (select t.*,
             (case when value = 0
                   then max(case when value <> 0 then dimdate end) over (partition by dim1, dim2 order by dimdate)
                   else dimdate
              end) as effective_dimdate
      from t
     ) t;

The max() scan gets the most recent dimdate that has a non-zero value. The value is then "spread" over all the values using another max() scan.

Note: This assumes that the value that you want to apply is always greater than zero. The code is easily modified to handle this, but the additional case expression just complicates the logic.

Either of these can easily be turned into an update. The second is particularly easy:

with toupdate as (
      select t.*,
             max(value) over (partition by dim1, dim2, effectivedimdate) as new_value
      from (select t.*,
                   (case when value = 0
                         then max(case when value <> 0 then dimdate end) over (partition by dim1, dim2 order by dimdate)
                         else dimdate
                    end) as effective_dimdate
            from t
           ) t
     )
update toupdate
    set value = newvalue
    where value = 0;

Upvotes: 1

Related Questions