Reputation: 23833
I have the following table definition
CREATE TABLE _Table
(
[Pat] NVARCHAR(8),
[Codes] NVARCHAR(50),
[C1] NVARCHAR(6),
[C2] NVARCHAR(6),
[C3] NVARCHAR(6),
[C4] NVARCHAR(6),
[C5] NVARCHAR(6)
);
GO
INSERT INTO _Table ([Pat], [Codes], [C1], [C2], [C3], [C4], [C5])
VALUES
('Pat1', 'U212,Y973,Y982', null, null, null, null, null),
('Pat2', 'M653', null, null, null, null, null),
('Pat3', 'U212,Y973,Y983,Z924,Z926', null, null, null, null, null);
GO
Now, I would like to split the codes for each row and populate the Cn
columns so we end up with
Pat Codes C1 C2 C3 C4 C5
Pat1 'U212,Y973,Y982' U212 Y973 Y982 NULL NULL
Pat2 'M653' M653 NULL NULL NULL NULL
Pat3 'U212,Y973,Y983,Z924,Z926' U212 Y973 Y983 Z924 Z926
I am looking at dynamic SQL but is there a better way...
I have started down the CTE route, but I am weak here. I am essentially looping, removing the first comma separated code and using left to get that code and selected it as C1
.
;WITH tmp([Pat], [Codes], [C1], [C2], [C3], [C4], [C5]) AS
(
SELECT
Pat,
STUFF(Codes, 1, CHARINDEX(',', Codes + ','), ''),
LEFT(Codes, CHARINDEX(',', Codes + ',') - 1),
[C2],
[C3],
[C4],
[C5]
FROM _Table
UNION all
SELECT
Pat,
STUFF(Codes, 1, CHARINDEX(',', Codes + ','), ''),
LEFT(Codes, CHARINDEX(',', Codes + ',') - 1),
[C2],
[C3],
[C4],
[C5]
FROM _Table
WHERE
Codes > ''
)
SELECT Pat, Codes, [C1], [C2], [C3], [C4], [C5]
FROM tmp
ORDER BY Pat
This works for one code, but how do I do all 5? Note, in practice this could increase to N codes.
Upvotes: 6
Views: 332
Reputation: 67291
My suggestion uses an updatable CTE and XML as splitter. XML allows to address each fragment by its position:
WITH updatableCTE AS
(
SELECT *
,CAST('<x>' + REPLACE(Codes,',','</x><x>') + '</x>' AS XML) AS CastedToXml
FROM _Table
)
UPDATE updatableCTE SET C1=CastedToXml.value('/x[1]','nvarchar(6)')
,C2=CastedToXml.value('/x[2]','nvarchar(6)')
,C3=CastedToXml.value('/x[3]','nvarchar(6)')
,C4=CastedToXml.value('/x[4]','nvarchar(6)')
,C5=CastedToXml.value('/x[5]','nvarchar(6)');
SELECT * FROM _Table
This is very easy to scale to any needed number of fragments.
Upvotes: 1
Reputation: 82474
Though I really like Alan Burstein's answer, another option involves using Jeff Moden's DelimitedSplit8K, conditional aggregation and stacking a couple of common table expressions to get the values you want:
;WITH CTE1 AS
(
SELECT Pat, Codes, ItemNumber, Item
FROM _Table
CROSS APPLY [dbo].[DelimitedSplit8K](Codes, ',')
), CTE2 AS
(
SELECT Pat,
Codes,
MAX(CASE WHEN ItemNumber = 1 THEN Item END) As V1,
MAX(CASE WHEN ItemNumber = 2 THEN Item END) As V2,
MAX(CASE WHEN ItemNumber = 3 THEN Item END) As V3,
MAX(CASE WHEN ItemNumber = 4 THEN Item END) As V4,
MAX(CASE WHEN ItemNumber = 5 THEN Item END) As V5
FROM CTE1
GROUP BY [Pat], [Codes]
)
UPDATE t
SET C1 = V1,
C2 = V2,
C3 = V3,
C4 = V4,
C5 = V5
FROM _Table t
JOIN CTE2 ON t.Pat = CTE2.Pat
You can see an online sample or rextester.
Upvotes: 1
Reputation: 7918
If I understand the requirement correctly this is extremely simple. No splitting or other type of function, Dynamic SQL, recursive CTEs, PIVOTING or any other skulduggery is necessary.
To perform the "split" you can use CROSS APPLY
like so:
SELECT
Pat,
Codes,
C1 = SUBSTRING(Codes,1,ISNULL(d1.d-1,8000)),
C2 = SUBSTRING(Codes,d1.d+1, d2.d-d1.d-1),
C3 = SUBSTRING(Codes,d2.d+1, d3.d-d2.d-1),
C4 = SUBSTRING(Codes,d3.d+1, d4.d-d3.d-1),
C5 = SUBSTRING(Codes,d4.d+1, 8000)
FROM _Table
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes),0))) d1(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d1.d+1),0))) d2(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d2.d+1),0))) d3(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d3.d+1),0))) d4(d);
Returns
Pat Codes C1 C2 C3 C4 C5
-------- ----------------------------- ----- ------ ----- ----- ------
Pat1 U212,Y973,Y982 U212 Y973 NULL NULL NULL
Pat2 M653 M653 NULL NULL NULL NULL
Pat3 U212,Y973,Y983,Z924,Z926 U212 Y973 Y983 Z924 Z926
Note the super-simple and utltra-efficient execution plan:
You can simplify this even further if the codes are always four character like so:
SELECT
Pat,
Codes,
C1 = NULLIF(SUBSTRING(Codes,1,4),''),
C2 = NULLIF(SUBSTRING(Codes,6,4),''),
C3 = NULLIF(SUBSTRING(Codes,11,4),''),
C4 = NULLIF(SUBSTRING(Codes,16,4),''),
C2 = NULLIF(SUBSTRING(Codes,21,4),'')
FROM _Table;
To perform the update you would do this for first solution:
UPDATE _Table
SET
C1 = SUBSTRING(Codes,1,ISNULL(d1.d-1,8000)),
C2 = SUBSTRING(Codes,d1.d+1, d2.d-d1.d-1),
C3 = SUBSTRING(Codes,d2.d+1, d3.d-d2.d-1),
C4 = SUBSTRING(Codes,d3.d+1, d4.d-d3.d-1),
C5 = SUBSTRING(Codes,d4.d+1, 8000)
FROM _Table
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes),0))) d1(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d1.d+1),0))) d2(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d2.d+1),0))) d3(d)
CROSS APPLY (VALUES (NULLIF(CHARINDEX(',',Codes,d3.d+1),0))) d4(d);
If, again, the codes are only four characters long then the update is so easy it feels like cheating:
UPDATE _Table
SET C1 = NULLIF(SUBSTRING(Codes,1,4),''),
C2 = NULLIF(SUBSTRING(Codes,6,4),''),
C3 = NULLIF(SUBSTRING(Codes,11,4),''),
C4 = NULLIF(SUBSTRING(Codes,16,4),''),
C5 = NULLIF(SUBSTRING(Codes,21,4),'');
Upvotes: 6
Reputation: 24410
Here's one possible solution: SQL Fiddle
with cte (PAT, CNum, Indx) as
(
select PAT
, 1
, 1
from _table
union all
select a.PAT
, b.CNum + 1
, charindex(',', a.CODES, b.Indx+1)+1
from _table a
inner join cte b
on b.PAT = a.PAT
where charindex(',', a.CODES, b.Indx+1) > 0
)
select t.PAT
--, t.CODES --include to see the original codes value
, max(case when c1.CNUM = 1 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end) C1
, max(case when c1.CNUM = 2 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end) C2
, max(case when c1.CNUM = 3 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end) C3
, max(case when c1.CNUM = 4 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end) C4
, max(case when c1.CNUM = 5 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end) C5
from _Table t
left outer join cte c1
on c1.PAT = t.PAT
left outer join cte c2
on c2.PAT = c1.PAT
and c2.CNum = c1.CNum + 1
group by t.PAT
--, t.CODES --include to see the original codes value
This uses a recursive CTE to fetch the starting positions of each of the values in CODES. The first position is taken as 1; subsequent positions are the char index of the next comma plus one (so we get the character after the comma). We record these positions in value INDX
.
We also have CNum
in our recursive CTE to record which (C#
) field the related record will be related to; this is a simple counter for each result returned.
i.e. For the example data on the SQL Fiddle link, the result of our CTE looks like this:
PAT | CNum | Indx
------+---------+-------
Pat1 | 1 | 1
Pat1 | 2 | 6
Pat1 | 3 | 11
Pat1 | 4 | 16
Pat1 | 5 | 21
Pat2 | 1 | 1
Pat3 | 1 | 1
Pat3 | 2 | 6
Pat3 | 3 | 11
Pat3 | 4 | 16
Pat3 | 5 | 21
Pat3 | 6 | 26
Pat3 | 7 | 31
Pat3 | 8 | 36
Pat3 | 9 | 41
Pat3 | 10 | 46
Pat4 | 1 | 1
We then collapse this down using our group by
expression; essentially doing a pivot, putting each CNum value against its related column.
The max
is just to ensure that we ignore all the null values / taken the only result with a value for the given CNum.
The code case when c1.CNUM = 1 then substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) end
performs a substring between the first character of the current item, taking its length up to 1 before the next character; or if there is no next character the length of the CODES string.
Update
Here's an updated version which makes use of the pivot
function / all other logic is as above: SQL Fiddle
with cte (PAT, CNum, Indx) as
(
select PAT
, 1
, 1
from @table
where CODES != ''
and CODES is not null
union all
select a.PAT
, b.CNum + 1
, charindex(',', a.CODES, b.Indx+1)+1
from @table a
inner join cte b
on b.PAT = a.PAT
where charindex(',', a.CODES, b.Indx+1) > 0
)
select PAT
, CODES
, [1] C1
, [2] C2
, [3] C3
, [4] C4
, [5] C5
from
(
select t.PAT
, t.CODES
, c1.CNum CNum1
, substring(t.CODES,c1.INDX,coalesce(c2.INDX,LEN(t.CODES)+2)-c1.INDX-1) value
from @table t
left outer join cte c1
on c1.PAT = t.PAT
left outer join cte c2
on c2.PAT = c1.PAT
and c2.CNum = c1.CNum + 1
) x
pivot
(
max(value)
for CNum1 in ([1],[2],[3],[4],[5])
) pvt
order by PAT
Upvotes: 1
Reputation: 175596
I have started down the CTE route, but I am weak here. I am essentially looping, removing the first comma separated code and using left to get that code and selected it as C1.
There are many possible' CSV splitters in SQL Server. I just want to extend your idea with cte (recursive version combined with conditional aggregation):
;WITH tmp([Pat], [Codes],x, lvl) AS
(
SELECT
Pat,
Codes = CAST(STUFF(Codes,1,CHARINDEX(',', Codes + ','), '')AS NVARCHAR(MAX)),
x = CAST(LEFT(Codes, CHARINDEX(',', Codes + ',') - 1) AS NVARCHAR(MAX)),
lvl = 1
FROM _Table
-- WHERE c1 IS NULL AND c2 IS NULL AND ...
-- to avoid recalculating the same rows if run many times
UNION ALL
SELECT Pat,
Codes = STUFF(Codes,1,CHARINDEX(',', Codes + ','), ''),
x= LEFT(Codes, CHARINDEX(',', Codes + ',') - 1),
lvl = lvl+1
FROM tmp
WHERE Codes > ''
)
SELECT Pat
,c1 = MAX(IIF(lvl=1, x, NULL))
,c2 = MAX(IIF(lvl=2, x, NULL))
,c3 = MAX(IIF(lvl=3, x, NULL))
,c4 = MAX(IIF(lvl=4, x, NULL))
,c5 = MAX(IIF(lvl=5, x, NULL))
-- add more if necessary
FROM tmp
GROUP BY Pat
-- OPTION (MAXRECURSION 0);
And UPDATE
:
;WITH tmp([Pat], [Codes],x, lvl) AS
(
SELECT
Pat,
Codes=CAST(STUFF(Codes,1,CHARINDEX(',',Codes+','),'')AS NVARCHAR(MAX)),
x = CAST(LEFT(Codes, CHARINDEX(',', Codes + ',') - 1) AS NVARCHAR(MAX)),
lvl = 1
FROM _Table
UNION ALL
SELECT Pat,
Codes = STUFF(Codes, 1, CHARINDEX(',', Codes + ','), ''),
x= LEFT(Codes, CHARINDEX(',', Codes + ',') - 1),
lvl = lvl+1
FROM tmp
WHERE Codes > ''
), cte2 AS (
SELECT Pat
,c1 = MAX(IIF(lvl=1, x, NULL))
,c2 = MAX(IIF(lvl=2, x, NULL))
,c3 = MAX(IIF(lvl=3, x, NULL))
,c4 = MAX(IIF(lvl=4, x, NULL))
,c5 = MAX(IIF(lvl=5, x, NULL))
FROM tmp
GROUP BY Pat
)
UPDATE _Table
SET c1 = c.c1
,c2 = c.c2
,c3 = c.c3
,c4 = c.c4
,c5 = c.c5
FROM _Table t
JOIN cte2 c
ON t.Pat = c.Pat
OPTION (MAXRECURSION 0);
Output:
╔══════╦══════════════════════════╦══════╦══════╦══════╦══════╦══════╗
║ Pat ║ Codes ║ C1 ║ C2 ║ C3 ║ C4 ║ C5 ║
╠══════╬══════════════════════════╬══════╬══════╬══════╬══════╬══════╣
║ Pat1 ║ U212,Y973,Y982 ║ U212 ║ Y973 ║ Y982 ║ null ║ null ║
║ Pat2 ║ M653 ║ M653 ║ null ║ null ║ null ║ null ║
║ Pat3 ║ U212,Y973,Y983,Z924,Z926 ║ U212 ║ Y973 ║ Y983 ║ Z924 ║ Z926 ║
╚══════╩══════════════════════════╩══════╩══════╩══════╩══════╩══════╝
Final thought: The correct way is to normalize schema.
Upvotes: 2
Reputation: 1289
What MatBailie is trying to get you to see is that having a comma separated list as a column, or having C1 - C(n) columns is a bad idea.
Better to make a second table. It will have a foreign key to the parent table (_Table in your example).
To do this you need to do a couple of things.
First, add an identity column to _Table. This will give you a unique key to use in other tables. Identity is an easy solution without knowing the full scope of your data. You could make it a different, unique column.
Second make a child table, for example PatCodes. It's columns should included:
Id int not null (identity)
_TableId int not null (fk to _Table)
Code varchar(xx) not null
(optional that might be nice to have)
Description varchar(xxx) null
DateUpdated datetime null
Active bit not null default 1
In this way you can have zero-to-many codes for each entry in _Table. You will not need to know the maximum (which could even change in the future) number of Cx columns to add to _Table. You will not be flooded with a large number of NULL column values in _Table. You have a ton of flexibility for managing updates to changing codes for a given _Table entry. I could go on and on about the benefits of normalizing this correctly from the start.
Changing your code to populate this table instead of an unknown number of columns on _Table is also easy to do.
Just do this now, before you end up with a ton of nasty data in the database and/or an application with loads of clutter around your _Table objects, it will save you much time and effort.
Upvotes: 0