ronniej962
ronniej962

Reputation: 173

SQL Server - How to split strings on a character in multiple columns

I need to split values in a column separated by the tilde (~) character that but for multiple columns in a single row. I can split one column using XML but I’m having problems figuring out how to split multiple columns.

This is what a row currently looks like this:

Column1    Column2               Column3                  Column4     
[JJ2222]   [~BLUE~BROWN~BLACK]   [~BB1234~BC2345~BD3456]  [~BLUE, BABY (BB1234)~BROWN, COW (BC2345)~BLACK, DOG (BD3456)]

After the split, I'm expecting multiple rows with all values aligned on the anchor (JJ2222):

Column1  Column2  Column3  Column4
JJ2222   BLUE     BB1234   BLUE, BABY (BB1234)
JJ2222   BROWN    BC2345   BROWN, COW (BC2345)
JJ2222   BLACK    BD3456   BLACK, DOG (BD3456)

In my limited knowledge, I could probably create 3 separate queries that splits and loads each column separately in to a temp table and then join each table on Column1 but I was hoping there was a way to do it in one query.

Upvotes: 3

Views: 1211

Answers (3)

Sean Lange
Sean Lange

Reputation: 33581

This is one of those times where I really like Jeff Moden's splitter. http://www.sqlservercentral.com/articles/Tally+Table/72993/ It is one of the only splitters I know of that returns the ordinal position of each element that doesn't resort to looping. As stated in the comments not storing data like this is by far the best option if at all possible. To solve this you have to split each column.

Something this should work for you. It works fine with your sample data.

declare @Something table
(
    Column1 varchar(20)
    , Column2 varchar(50)
    , Column3 varchar(50)
    , Column4 varchar(500)
)

insert @Something
select 'JJ2222', '~BLUE~BROWN~BLACK', '~BB1234~BC2345~BD3456', '~BLUE, BABY (BB1234)~BROWN, COW (BC2345)~BLACK, DOG (BD3456)'
;

select s.Column1
    , c2.Item
    , c3.Item
    , c4.Item
from @Something s
cross apply dbo.DelimitedSplit8K(s.Column2, '~') c2
cross apply dbo.DelimitedSplit8K(s.Column3, '~') c3
cross apply dbo.DelimitedSplit8K(s.Column4, '~') c4
where c2.Item > ''  --this eliminates an empty row because you have the delimiter at the beginning of the string.
    and c2.ItemNumber = c3.ItemNumber
    and c2.ItemNumber = c4.ItemNumber

Upvotes: 0

CodeSmith
CodeSmith

Reputation: 3207

Since the screwup is already there it won't help to make it into a different (a bit lesser one). What you need is to split that data into different tables.

All but Column 1 needs to become foreign key to another table.

For each combination of items in a column in original table you use appropriate additional table. Each of those tables look like:

CombinationId, Value

When you got for example:

[~BLUE~BROWN~BLACK]

You break it down into additional table Combinations2 like:

CombinationId Value

1 BLUE

1 BROWN

1 BLACK

You do this for all columns of original table. When you process original table row by row for each column you check if in the new table you created CombinationsX there is already combination like that sharing the same id. If so you put that Id in the original table. If not you add that combination to a new index and put that number as a foreign key into original table.

You have now split your data into multiple tables and made it usable.

What you wanna do from now on differs but at least you can now rework it into something different. At least now you can query over that data, search, index etc. The things you expect out of your DB to offer you when you don't fill it with junk ;)

Upvotes: 1

John Cappelletti
John Cappelletti

Reputation: 82010

There are many examples on how to split a string. The trick here is to link or join on the sequence.

If open to a UDF

Example

Select A.Column1 
      ,B.* 
 From  YourTable A
 Cross Apply (
                Select Column2=B1.RetVal
                      ,Column3=B2.RetVal
                      ,Column4=B3.RetVal
                 From [dbo].[tvf-Str-Parse](A.Column2,'~') B1
                 Join [dbo].[tvf-Str-Parse](A.Column3,'~') B2 on B1.RetSeq=B2.RetSeq
                 Join [dbo].[tvf-Str-Parse](A.Column4,'~') B3 on B1.RetSeq=B3.RetSeq
                 Where B1.RetVal is not null
                   and B2.RetVal is not null
                   and B3.RetVal is not null
             ) B

Returns

Column1 Column2 Column3 Column4
JJ2222  BLUE    BB1234  BLUE, BABY (BB1234)
JJ2222  BROWN   BC2345  BROWN, COW (BC2345)
JJ2222  BLACK   BD3456  BLACK, DOG (BD3456)

The UDF if Interested

CREATE FUNCTION [dbo].[tvf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table 
As
Return (  
    Select RetSeq = Row_Number() over (Order By (Select null))
          ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
    From  (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A 
    Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[tvf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[tvf-Str-Parse]('John Cappelletti was here',' ')

Upvotes: 1

Related Questions