Reputation: 13
I am using SQL Server and I have a table like this:
CREATE TABLE #Test
(
ID varchar(9),
[year] varchar(7),
Code varchar(9)
)
INSERT INTO #test (ID, [year], code)
VALUES ('8479', '2006', 'A'), ('8479', '2007', 'A'),
('8479', '2008', 'A'), ('8479', '2009', NULL),
('8479', '2010', NULL), ('8479', '2011_H1', NULL),
('8479', '2011_H2', NULL), ('8479', '2012_H1', 'B'),
('8479', '2012_H2', 'B'), ('8479', '2013_H1', 'B'),
('8479', '2013_H2', 'B'),
('5477', '2008', 'Z'), ('5477', '2009', 'Z'),
('5477', '2010', 'Z'), ('5477', '2011_H1', NULL),
('5477', '2011_H2', 'R')
I would like to populate the NULL code values with the nearest value available for that particular ID (nearest in order of year, and in any direction). In cases where the NULL is equidistant from values, the later year should take priority.
I would therefore want this result to be:
ID year Code
-------------------
8479 2006 A
8479 2007 A
8479 2008 A
8479 2009 A
8479 2010 A
8479 2011_H1 B
8479 2011_H2 B
8479 2012_H1 B
8479 2012_H2 B
8479 2013_H1 B
8479 2013_H2 B
5477 2008 Z
5477 2009 Z
5477 2010 Z
5477 2011_H1 R
5477 2011_H2 R
Some years are split into two - H1 followed by H2.
The number of NULL entries will vary for each ID.
I got as far as trying something along the lines of:
Update #test
set Code = (select top 1 b.Code
from #Test b
where LEFT(#test.[year],4) > LEFT(b.[year],4)
and #test.ID = b.ID
and b.Code IS NOT NULL)
Where Code IS NULL
but this obviously will not work and highlights the two main issues I'm having:
What can I do to be able to assign order to 'year' entries with H1 and H2 suffixes?
And how can I incorporate functionality that will find the nearest value regardless of direction?
I'm fairly new to this, so any help is much appreciated. I should add that I've inherited this database, and at the moment I'm not in a position to make any fundamental changes to the structure - I realise if my years were stored as dates then I would be half way there! Thanks.
Upvotes: 1
Views: 213
Reputation: 76787
The following code sets Code
to the desired code, selected inside a sub-query from records having similar ID
and Code
being not null
(very important), ordered by the year
difference (the smaller the difference the better) and similar records are ordered by year
descendingly, to choose the newer year
in case of similar difference. This update
is executed for each record which has a null
Code
:
update #Test
set Code = (select top(1) t.Code
from #Test t
where t.ID = #Test.ID and not (t.Code is null)
ORDER BY Abs(Cast(Left(#test.year, 4) as smallint) - Cast(Left(t.year, 4) as smallint)), t.year desc)
where Code is null
However, this is problematic, because in the case when a given ID
has only null
Code
values, the result will be null
as well. Let's suppose we have a 'default'
code for such cases. Then, you can do this:
update #Test
set Code = ifnull((select top(1) t.Code
from #Test t
where t.ID = #Test.ID and not (t.Code is null)
ORDER BY Abs(Cast(Left(#test.year, 4) as smallint) - Cast(Left(t.year, 4) as smallint)), t.year desc), 'default')
where Code is null
Upvotes: 0
Reputation: 140
Try something like this. Its quite slow method, but if you need to do it just once it should do.
UPDATE #test
SET
#test.code = (
SELECT
TOP(1)
ts.code
FROM #test ts
WHERE #test.ID = ts.ID
AND ts.code IS NOT NULL
ORDER BY Abs(Cast(Left(#test.[year], 4) as smallint) - Cast(Left(ts.[year], 4) as smallint))
)
WHERE #test.code IS NULL
Upvotes: 2
Reputation: 24783
try this
; WITH test_data as
(
SELECT ID, [year], [Code], rn = row_number() OVER (PARTITION BY ID ORDER BY [year])
FROM #Test
)
UPDATE t
SET Code = c.Code
FROM test_data t
CROSS APPLY
(
SELECT TOP 1 x.Code
FROM test_data x
WHERE x.ID = t.ID
AND x.Code is not null
ORDER BY ABS(t.rn - x.rn)
) c
WHERE t.Code IS NULL
Upvotes: 2