Reputation: 173
I need to split values in a column separated by the tilde (~) character that but for multiple columns in a single row. I can split one column using XML but I’m having problems figuring out how to split multiple columns.
This is what a row currently looks like this:
Column1 Column2 Column3 Column4
[JJ2222] [~BLUE~BROWN~BLACK] [~BB1234~BC2345~BD3456] [~BLUE, BABY (BB1234)~BROWN, COW (BC2345)~BLACK, DOG (BD3456)]
After the split, I'm expecting multiple rows with all values aligned on the anchor (JJ2222):
Column1 Column2 Column3 Column4
JJ2222 BLUE BB1234 BLUE, BABY (BB1234)
JJ2222 BROWN BC2345 BROWN, COW (BC2345)
JJ2222 BLACK BD3456 BLACK, DOG (BD3456)
In my limited knowledge, I could probably create 3 separate queries that splits and loads each column separately in to a temp table and then join each table on Column1 but I was hoping there was a way to do it in one query.
Upvotes: 3
Views: 1211
Reputation: 33581
This is one of those times where I really like Jeff Moden's splitter. http://www.sqlservercentral.com/articles/Tally+Table/72993/ It is one of the only splitters I know of that returns the ordinal position of each element that doesn't resort to looping. As stated in the comments not storing data like this is by far the best option if at all possible. To solve this you have to split each column.
Something this should work for you. It works fine with your sample data.
declare @Something table
(
Column1 varchar(20)
, Column2 varchar(50)
, Column3 varchar(50)
, Column4 varchar(500)
)
insert @Something
select 'JJ2222', '~BLUE~BROWN~BLACK', '~BB1234~BC2345~BD3456', '~BLUE, BABY (BB1234)~BROWN, COW (BC2345)~BLACK, DOG (BD3456)'
;
select s.Column1
, c2.Item
, c3.Item
, c4.Item
from @Something s
cross apply dbo.DelimitedSplit8K(s.Column2, '~') c2
cross apply dbo.DelimitedSplit8K(s.Column3, '~') c3
cross apply dbo.DelimitedSplit8K(s.Column4, '~') c4
where c2.Item > '' --this eliminates an empty row because you have the delimiter at the beginning of the string.
and c2.ItemNumber = c3.ItemNumber
and c2.ItemNumber = c4.ItemNumber
Upvotes: 0
Reputation: 3207
Since the screwup is already there it won't help to make it into a different (a bit lesser one). What you need is to split that data into different tables.
All but Column 1 needs to become foreign key to another table.
For each combination of items in a column in original table you use appropriate additional table. Each of those tables look like:
CombinationId, Value
When you got for example:
[~BLUE~BROWN~BLACK]
You break it down into additional table Combinations2 like:
CombinationId Value
1 BLUE
1 BROWN
1 BLACK
You do this for all columns of original table. When you process original table row by row for each column you check if in the new table you created CombinationsX there is already combination like that sharing the same id. If so you put that Id in the original table. If not you add that combination to a new index and put that number as a foreign key into original table.
You have now split your data into multiple tables and made it usable.
What you wanna do from now on differs but at least you can now rework it into something different. At least now you can query over that data, search, index etc. The things you expect out of your DB to offer you when you don't fill it with junk ;)
Upvotes: 1
Reputation: 82010
There are many examples on how to split a string. The trick here is to link or join on the sequence.
If open to a UDF
Example
Select A.Column1
,B.*
From YourTable A
Cross Apply (
Select Column2=B1.RetVal
,Column3=B2.RetVal
,Column4=B3.RetVal
From [dbo].[tvf-Str-Parse](A.Column2,'~') B1
Join [dbo].[tvf-Str-Parse](A.Column3,'~') B2 on B1.RetSeq=B2.RetSeq
Join [dbo].[tvf-Str-Parse](A.Column4,'~') B3 on B1.RetSeq=B3.RetSeq
Where B1.RetVal is not null
and B2.RetVal is not null
and B3.RetVal is not null
) B
Returns
Column1 Column2 Column3 Column4
JJ2222 BLUE BB1234 BLUE, BABY (BB1234)
JJ2222 BROWN BC2345 BROWN, COW (BC2345)
JJ2222 BLACK BD3456 BLACK, DOG (BD3456)
The UDF if Interested
CREATE FUNCTION [dbo].[tvf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[tvf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[tvf-Str-Parse]('John Cappelletti was here',' ')
Upvotes: 1