Reputation: 63
I have a time series that with null values. I want to be replace each null value with the most recent non-non value. From what I've researched, Oracle SQL can easily accomplish this using Last_value with IGNORE NULLS. Is there a similar way to accomplish this using SQL Server 2016? Otherwise I'm just going to code it using C#, but felt using SQL would be faster, cleaner, and easier.
Sec SCORE
1 Null
2 Null
3 5
4 Null
5 8
6 7
7 Null
Should be replaced with:
Sec SCORE
1 Null
2 Null
3 5
4 5
5 8
6 7
7 7
Upvotes: 6
Views: 17371
Reputation: 1271003
You can do this with two cumulative operations:
select t.*,
coalesce(score, max(score) over (partition by maxid)) as newscore
from (select t.*,
max(case when score is not null then id end) over (order by id) as maxid
from t
) t;
The innermost subquery gets the most recent id where there is a value. The outermost one "spreads" that value to the subsequent rows.
If you actually want to update the table, you can incorporate this easily into an update
. But, Oracle cannot do that (easily), so I'm guessing this is not necessary....
Upvotes: 11
Reputation: 583
If performance is an issue, I suggest the solution from this article:
His final solution, while dense, does perform excellently with a linear query plan without any joins. Here is an example implementation I've used which carries the last customer name through a type2 scd staging table. In this staging table, NULL represents no update, and '*** DELETED ***'
represents an explicit set to NULL. The following cleans this up to resemble an actual SCD record excellently:
WITH [SampleNumbered] AS (
SELECT *, ROW_NUMBER() OVER ( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [RowNumber]
FROM [dbo].[SampleDimStage]
), [SamplePrep] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleGroupId] AS VARBINARY(255)) AS [BinarySampleGroupId]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleStatusCode] AS VARBINARY(255)) AS [BinarySampleStatusCode]
FROM [SampleNumbered]
), [SampleCleanUp] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST(SUBSTRING(MAX([BinarySampleGroupId]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleGroupId]
, CAST(SUBSTRING(MAX([BinarySampleStatusCode]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleStatusCode]
, LEAD([StartDatetime]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [EndDatetime]
FROM [SamplePrep]
)
SELECT CAST([SampleId] AS NUMERIC(18)) AS [SampleId]
, CAST(NULLIF([sc].[LastSampleGroupId],'*** DELETED ***') AS NUMERIC(18)) AS [GroupId]
, CAST(NULLIF([sc].[LastSampleStatusCode],'*** DELETED ***') AS CHAR(3)) AS [SampleStatusCode]
, [StartDatetime]
, [sc].[EndDatetime]
FROM [SampleCleanUp] [sc];
If your sort key is some sort of integer, you can completely skip the first CTE and cast that directly to binary.
Upvotes: 3