Trevor D
Trevor D

Reputation: 63

Last_value with IGNORE NULLS in SQL Server

I have a time series that with null values. I want to be replace each null value with the most recent non-non value. From what I've researched, Oracle SQL can easily accomplish this using Last_value with IGNORE NULLS. Is there a similar way to accomplish this using SQL Server 2016? Otherwise I'm just going to code it using C#, but felt using SQL would be faster, cleaner, and easier.

Sec SCORE
1   Null
2   Null
3   5
4   Null
5   8
6   7
7   Null

Should be replaced with:

Sec SCORE
1   Null
2   Null
3   5
4   5
5   8
6   7
7   7

Upvotes: 6

Views: 17371

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

You can do this with two cumulative operations:

select t.*,
       coalesce(score, max(score) over (partition by maxid)) as newscore
from (select t.*,
             max(case when score is not null then id end) over (order by id) as maxid
      from t
     ) t;

The innermost subquery gets the most recent id where there is a value. The outermost one "spreads" that value to the subsequent rows.

If you actually want to update the table, you can incorporate this easily into an update. But, Oracle cannot do that (easily), so I'm guessing this is not necessary....

Upvotes: 11

Apneal
Apneal

Reputation: 583

If performance is an issue, I suggest the solution from this article:

The Last non NULL Puzzle

His final solution, while dense, does perform excellently with a linear query plan without any joins. Here is an example implementation I've used which carries the last customer name through a type2 scd staging table. In this staging table, NULL represents no update, and '*** DELETED ***' represents an explicit set to NULL. The following cleans this up to resemble an actual SCD record excellently:

WITH [SampleNumbered] AS (
    SELECT  *, ROW_NUMBER() OVER ( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [RowNumber]
    FROM [dbo].[SampleDimStage]
), [SamplePrep] AS (
    SELECT  [SampleId]
        ,   [StartDatetime]
        ,   CAST([RowNumber] AS BINARY(8)) + CAST([SampleGroupId] AS VARBINARY(255)) AS [BinarySampleGroupId]
        ,   CAST([RowNumber] AS BINARY(8)) + CAST([SampleStatusCode] AS VARBINARY(255)) AS [BinarySampleStatusCode]
    FROM [SampleNumbered]
), [SampleCleanUp] AS (
    SELECT  [SampleId]
        ,   [StartDatetime]
        ,   CAST(SUBSTRING(MAX([BinarySampleGroupId]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
                , 9, 255) AS VARCHAR(255)) AS [LastSampleGroupId]
        ,   CAST(SUBSTRING(MAX([BinarySampleStatusCode]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
                , 9, 255) AS VARCHAR(255)) AS [LastSampleStatusCode]
        ,   LEAD([StartDatetime]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [EndDatetime]
    FROM [SamplePrep]
)
SELECT  CAST([SampleId] AS NUMERIC(18)) AS [SampleId]
    ,   CAST(NULLIF([sc].[LastSampleGroupId],'*** DELETED ***') AS NUMERIC(18)) AS [GroupId]
    ,   CAST(NULLIF([sc].[LastSampleStatusCode],'*** DELETED ***') AS CHAR(3)) AS [SampleStatusCode]
    ,   [StartDatetime]
    ,   [sc].[EndDatetime]
FROM [SampleCleanUp] [sc];

If your sort key is some sort of integer, you can completely skip the first CTE and cast that directly to binary.

Upvotes: 3

Related Questions