SBB

Reputation: 8970

T-SQL split on delimiter

I am working with an employee hierarchy string in the following format. The numbers are employee IDs, ordered to reflect how the employees are structured within the company, so the string lets you follow the chain of management.

123|456|789|012|345|320

I am trying to take this string and turn it into a temp table so I can work with each of the IDs as its own value.

I tried making a function to split the string:

ALTER FUNCTION [dbo].[SplitString]
    (@String NVARCHAR(4000),
     @Delimiter NCHAR(1))
RETURNS TABLE
AS
    RETURN
        (WITH Split(stpos, endpos) AS
         (
             SELECT 0 AS stpos, CHARINDEX(@Delimiter, @String) AS endpos
             UNION ALL
             SELECT endpos + 1, CHARINDEX(@Delimiter, @String, endpos+1)
             FROM Split
             WHERE endpos > 0
         )
         SELECT 
             'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
             'Data' = SUBSTRING(@String, stpos, COALESCE(NULLIF(endpos, 0), LEN(@String) + 1))
         FROM 
             Split
)

This however resulted in the following:

Id  Data 
-------------------
1   123
2   456|7893
3   7893|012|345|
4   012|345|320
5   345|320
6   320

Is there a better way to approach this? Can it be done without a function at all, or is a function required?
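
For reference, the growing values in the output above appear to come from the third argument of SUBSTRING: it is a length, but the function passes the end position, and the anchor row starts at position 0 rather than 1. A minimal sketch of the same recursive CTE with those two points adjusted, run as a standalone batch against the sample string (the variable values are the ones from the question; nothing else is assumed):

```sql
-- Sketch: the recursive CTE from the question with the SUBSTRING arguments adjusted.
DECLARE @String    NVARCHAR(4000) = N'123|456|789|012|345|320';
DECLARE @Delimiter NCHAR(1)       = N'|';

WITH Split (stpos, endpos) AS
(
    -- Anchor: the first token starts at position 1 (T-SQL strings are 1-based).
    SELECT 1, CHARINDEX(@Delimiter, @String)
    UNION ALL
    -- Each following token starts just past the previous delimiter.
    SELECT endpos + 1, CHARINDEX(@Delimiter, @String, endpos + 1)
    FROM Split
    WHERE endpos > 0
)
SELECT Id   = ROW_NUMBER() OVER (ORDER BY stpos),
       -- Length = (end position, or one past the end of the string) - start position.
       Data = SUBSTRING(@String, stpos,
                        COALESCE(NULLIF(endpos, 0), LEN(@String) + 1) - stpos)
FROM Split;
```

Two caveats with this approach: a recursive CTE stops after 100 levels by default, so a string with more than 100 delimiters needs OPTION (MAXRECURSION n) on the calling query, and LEN ignores trailing spaces, so a final token that ends in spaces would be trimmed.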

Upvotes: 5

Views: 2217

Answers (3)

John Cappelletti

Reputation: 81970

Without a Parse Function

Declare @YourTable table (ID int,IDList varchar(Max))
Insert Into @YourTable values
(1,'123|456|789|012|345|320'),
(2,'123|456')

Select A.ID
      ,B.*
 From @YourTable A
 Cross Apply (
                Select RetSeq = Row_Number() over (Order By (Select null))
                      ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
                From (Select x = Cast('<x>'+ replace((Select A.IDList as [*] For XML Path('')),'|','</x><x>')+'</x>' as xml).query('.')) as A  
                Cross Apply x.nodes('x') AS B(i)
             ) B

Returns

ID  RetSeq  RetVal
1   1       123
1   2       456
1   3       789
1   4       012
1   5       345
1   6       320
2   1       123
2   2       456

OR with the SUPER DUPER Parse function (original source listed below, with a couple of tweaks)

Select A.ID
      ,B.*
 From @YourTable A
 Cross Apply [dbo].[udf-Str-Parse-8K](A.IDList,'|') B

This returns the same results as above.

CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(10))
Returns Table 
As
Return (  
    with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
           cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
           cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)

    Select RetSeq = Row_Number() over (Order By A.N)
          ,RetVal = Substring(@String, A.N, A.L) 
    From   cte4 A
);
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Much faster than str-Parse, but limited to 8K
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')

Edit - Stand Alone

Declare @String varchar(max) = '123|456|789|012|345|320'
Declare @Delim  varchar(10)  = '|'

Select RetSeq = Row_Number() over (Order By (Select null))
      ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>'+ replace((Select @String as [*] For XML Path('')),@Delim,'</x><x>')+'</x>' as xml).query('.')) as A 
Cross Apply x.nodes('x') AS B(i)
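
Since the question asks for a temp table, the stand-alone query above can be pointed at one with Select ... Into. This is a sketch only: the table name #EmployeeChain is a hypothetical placeholder, and the values are kept as varchar so the leading zero in 012 survives.

```sql
Declare @String varchar(max) = '123|456|789|012|345|320'
Declare @Delim  varchar(10)  = '|'

-- #EmployeeChain is an illustrative name; drop any copy left over from an earlier run
If Object_ID('tempdb..#EmployeeChain') Is Not Null Drop Table #EmployeeChain

-- Same XML split as above, with the rows written into the temp table
Select RetSeq = Row_Number() over (Order By (Select null))
      ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
Into  #EmployeeChain
From (Select x = Cast('<x>'+ replace((Select @String as [*] For XML Path('')),@Delim,'</x><x>')+'</x>' as xml).query('.')) as A 
Cross Apply x.nodes('x') AS B(i)

-- Each ID is now its own row
Select * From #EmployeeChain Order By RetSeq
```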

Upvotes: 6

Alan Burstein

Reputation: 7918

If you need a string "splitter", the fastest one available for SQL Server 2012 (that is, pre-2016) is going to be found here. It will blow the doors off anything posted thus far. If your items/tokens are all the same size, then an even faster method would be this:

DECLARE @yourstring varchar(8000) = '123|456|789|012|345|320';

WITH E(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(v)),
iTally(N) AS (SELECT TOP ((LEN(@yourstring)/4)+1) ROW_NUMBER() OVER (ORDER BY (SELECT 1)) 
              FROM e a, e b, e c, e d)
SELECT itemNumber = ROW_NUMBER() OVER (ORDER BY N), item = SUBSTRING(@yourstring, ((N*4)-3), 3) 
FROM iTally;

Results:

itemNumber           item
-------------------- ----
1                    123
2                    456
3                    789
4                    012
5                    345
6                    320

I write more about this and provide examples of how to put this logic into a function here.
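
The "pre-2016" qualifier above matters because SQL Server 2016 added a built-in STRING_SPLIT function. A minimal sketch for readers on that version or later, assuming the database compatibility level is 130 or higher (the variable is the one from the example above):

```sql
-- Assumes SQL Server 2016+ with database compatibility level 130 or higher.
DECLARE @yourstring varchar(8000) = '123|456|789|012|345|320';

-- STRING_SPLIT returns one row per token in a column named value.
SELECT item = value
FROM STRING_SPLIT(@yourstring, '|');

-- On SQL Server 2022 and Azure SQL Database, a third argument of 1
-- adds an ordinal column that gives each token's position:
-- SELECT itemNumber = ordinal, item = value
-- FROM STRING_SPLIT(@yourstring, '|', 1)
-- ORDER BY ordinal;
```

Without the ordinal column the output order is not guaranteed, which matters here because the position of each ID encodes the chain of management.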

Upvotes: 2

M.Ali

Reputation: 69524

I use this version of the Split function.

CREATE FUNCTION [dbo].[Split]
(
  @delimited nvarchar(max),
  @delimiter nvarchar(100)
) RETURNS @t TABLE
(
-- The id column is optional and can be commented out; it is not needed to split the string
  id int identity(1,1), -- I use this column to number the split parts
  val nvarchar(max)
)
AS
BEGIN
  declare @xml xml
  set @xml = N'<root><r>' + replace(@delimited,@delimiter,'</r><r>') + '</r></root>'

  insert into @t(val)
  select
    r.value('.','nvarchar(max)') as item -- nvarchar to match the val column and keep Unicode data
  from @xml.nodes('//root/r') as records(r)

  RETURN
END

Your query would look something like this:

SELECT * 
FROM TableName t 
  CROSS APPLY [dbo].[Split](t.EmpIDs, '|')

Upvotes: 1
