Legend

Reputation: 116940

Splitting a very large string with a custom delimiter?

I am trying to determine the word frequency in a column that is a VARCHAR(3000). I am not sure this is the best data type, but I did not create the table. In any case, up to this point I have been using the following function (taken from here) to split strings:

CREATE FUNCTION dbo.Split
(
    @RowData nvarchar(2000),
    @SplitOn nvarchar(5)
)  
RETURNS @RtnValue table 
(
    Id int identity(1,1),
    Data nvarchar(100)
) 
AS  
BEGIN 
    Declare @Cnt int
    Set @Cnt = 1

    While (Charindex(@SplitOn,@RowData)>0)
    Begin
        Insert Into @RtnValue (data)
        Select 
            Data = ltrim(rtrim(Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)))

        Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+1,len(@RowData))
        Set @Cnt = @Cnt + 1
    End

    Insert Into @RtnValue (data)
    Select Data = ltrim(rtrim(@RowData))

    Return
END

Usage was as follows:

SELECT s FROM dbo.Split(' ', @description)

It has been working very nicely but now I am getting an error:

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

Does anyone have suggestions on what is a good way of achieving this?

Upvotes: 1

Views: 2384

Answers (3)

Conrad Frix

Reputation: 52675

This function (taken from here) uses the XML .nodes() method and avoids both loops and recursive CTEs:

CREATE FUNCTION dbo.Split(@data NVARCHAR(MAX), @delimiter NVARCHAR(5))
RETURNS @t TABLE (data NVARCHAR(max))
AS
BEGIN

    DECLARE @textXML XML;
    SELECT    @textXML = CAST('<d>' + REPLACE(@data, @delimiter, '</d><d>') + '</d>' AS XML);

    INSERT INTO @t(data)
    SELECT  T.split.value('.', 'nvarchar(max)') AS data
    FROM    @textXML.nodes('/d') T(split)

    RETURN
END
GO
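
For the word-frequency case in the question, usage might look like the following. This is only a minimal sketch: it assumes the function above has been created, and that the text contains no XML-special characters such as & or <, which would make the CAST to XML fail unless they are escaped first. The variable @description and its sample value are made up for illustration.

```sql
-- Illustrative input; @description stands in for one value of the VARCHAR(3000) column.
DECLARE @description NVARCHAR(MAX) = N'the quick brown fox jumps over the lazy dog the end';

-- Split on a single space and count how often each word occurs.
SELECT   s.data   AS word,
         COUNT(*) AS frequency
FROM     dbo.Split(@description, N' ') AS s
WHERE    s.data <> N''          -- consecutive spaces produce empty pieces; skip them
GROUP BY s.data
ORDER BY frequency DESC, word;
```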

Upvotes: 2

RC_Cleland

Reputation: 2304

The following code works well in my situation. It uses the REPLACE function and SQL Server 2008's ability to insert multiple rows with a single INSERT statement. The only drawback of this method, if it really is a drawback, is that it is limited to 1000 splits, because a single VALUES list accepts at most 1000 rows.

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[StringSegmenter]') AND type in (N'U'))
    DROP TABLE [dbo].[StringSegmenter]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[StringSegmenter]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[StringSegmenter](
    [ss_id] [int] IDENTITY(1,1) NOT NULL,
    [ss_segment] [varchar](max) NOT NULL,
 CONSTRAINT [PK_StringSegmenter] PRIMARY KEY CLUSTERED 
(
    [ss_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
END
GO

SET ANSI_PADDING OFF
GO

truncate table [dbo].[StringSegmenter]
declare @String varchar(max)
declare @Splitter varchar(10)
set @String = '1,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666'
set @Splitter = ','
set @String = 'Insert [dbo].[StringSegmenter] Values (''' + replace(@string, @splitter,'''),(''') + ''')'
select @String
execute (@String)

Select * from [dbo].[StringSegmenter] order by ss_id
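
One caveat with building the INSERT as a string: a single quote inside the input would terminate the generated literal early. Below is a hedged sketch of one way to guard against that, by doubling embedded quotes before the delimiters are replaced. The sample value and the @Sql variable are made up for illustration, and the table is assumed to exist as created above.

```sql
DECLARE @String   varchar(max) = 'it''s,a,test';   -- made-up input containing a single quote
DECLARE @Splitter varchar(10)  = ',';
DECLARE @Sql      varchar(max);

-- Double any embedded single quotes first so they survive inside the generated literals.
SET @String = REPLACE(@String, '''', '''''');

-- Then turn each delimiter into '),(' to build a multi-row VALUES list.
SET @Sql = 'INSERT [dbo].[StringSegmenter] VALUES ('''
         + REPLACE(@String, @Splitter, '''),(''')
         + ''')';

EXECUTE (@Sql);
```

This still assumes the delimiter itself does not contain a single quote.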

Upvotes: 0

Legend

Reputation: 116940

Never mind. In case someone else faces the same problem, the following function (taken from here) works perfectly on large strings:

CREATE FUNCTION dbo.SplitLarge(@String varchar(8000), @Delimiter char(1))     
returns @temptable TABLE (items varchar(8000))     
as     
begin     
    declare @idx int     
    declare @slice varchar(8000)     

    select @idx = 1
    if len(@String)<1 or @String is null  return

    while @idx!= 0     
    begin     
        set @idx = charindex(@Delimiter,@String)     
        if @idx!=0     
            set @slice = left(@String,@idx - 1)     
        else     
            set @slice = @String     

        if(len(@slice)>0)
            insert into @temptable(Items) values(@slice)     

        set @String = right(@String,len(@String) - @idx)     
        if len(@String) = 0 break     
    end 
return     
end
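
To get the word frequency across a whole column, the function can be applied per row with CROSS APPLY. A rough sketch, assuming a hypothetical table dbo.MyTable with a description column (substitute your own names):

```sql
-- dbo.MyTable and its description column are hypothetical names.
SELECT   w.items  AS word,
         COUNT(*) AS frequency
FROM     dbo.MyTable AS t
-- LEN ignores trailing spaces, so trim them to keep the function's
-- LEN/RIGHT arithmetic consistent when splitting on a space.
CROSS APPLY dbo.SplitLarge(RTRIM(t.description), ' ') AS w
GROUP BY w.items
ORDER BY frequency DESC;
```

Rows where description is NULL contribute nothing, since the function returns no rows for NULL input.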

Upvotes: 2
