db_noob
db_noob

Reputation: 135

Error when converting data to XML in SQL Server

I need to retrieve the content in position 10 of a comma separated string in a View table.

Row 1    N,l,S,T,A,,<all>,,N,A,N,N,N,Y,Y,,Y,Y,Y,,AA,SA,Enterprise, 
Row 2    M,,A,S,AS,SS,AS,N,N,N,N,Y,Y,Y,ENTERPRISE,S,,A
Row 3    L,,A,D,S,A,A,AA,Y,Y,Y,YNN,N,N,N,N,A,AA,AD,D,D

Div1 is the name of my column, Div2 is the name of the result column. I use the following code:

SELECT TOP (2000) 
    [Id],
    CONVERT(XML,'<x>' + REPLACE(REPLACE(REPLACE(Div1, '>', ''), '<', ''), ',', '</x <x>') + '</x>').value('/x[10]', 'VARCHAR(MAX)') [Div2], 
    Div1
FROM 
    [dbo].[database]

I use character type VARCHAR(MAX) because that is the type for Div1 in my database. The code works if I run less than 20000 rows. But the data set I use has more than 100,000 rows. If I run the whole data it stops and the following error occurs:

Msg 9421, Level 16, State 1, Line 1.
XML parsing: line 1, character 218, illegal name character

Is there a way to work this around?

Upvotes: 0

Views: 530

Answers (2)

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22157

XML has CDATA[] section to treat content as-is without parsing. There is no need for multiple REPLACE() function calls. Check it out.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE  (ID INT IDENTITY(1,1) PRIMARY KEY, Div1 VARCHAR(MAX));
INSERT INTO @tbl (Div1)
VALUES
('N,l,S,T,A,,<all>,,N,A,N,N,N,Y,Y,,Y,Y,Y,,AA,SA,Enterprise')
, ('M,,A,S,AS,SS,AS,N,N,N,N,Y,Y,Y,ENTERPRISE,S,,A')
, ('L,,A,D,S,A,A,AA,Y,Y,Y,YNN,N,N,N,N,A,AA,AD,D,D');
-- DDL and sample data population, end

SELECT [Id],
    CAST('<x><![CDATA[' + REPLACE(Div1, ',', ']]></x><x><![CDATA[') + ']]></x>' AS XML).value('(/x/text())[10]', 'VARCHAR(MAX)') [Div2], 
    Div1
FROM @tbl;

Upvotes: 1

PeterHe
PeterHe

Reputation: 2766

You can create a function to split string like below:

CREATE FUNCTION dbo.split_delimited_string
(
    @list varchar(max),
    @delimiter varchar(5)
)  
RETURNS @items TABLE 
(
    pos_id int identity(1,1),
    item varchar(255)
) 
AS  
BEGIN 
    DECLARE @pos int, @delimiter_len tinyint;
    SET @pos = CHARINDEX(@delimiter,@list);
    SET @delimiter_len=LEN(@delimiter);
    WHILE (@pos>0)
    BEGIN
        INSERT INTO @items (item)
        SELECT LEFT(@list,@pos - 1)

        SET @list = RIGHT(@list,LEN(@list) - @pos - @delimiter_len + 1);
        SET @pos = CHARINDEX(@delimiter,@list);
    END 
    IF @list<>N''
    BEGIN
        INSERT INTO @items (item)
        SELECT @list;
    END 
    RETURN;
END

The following query will return the content in the 10th position:

SELECT
    t.[Id],
    l.item AS Div2
    t.Div1
FROM [dbo].[database] t
CROSS APPLY dbo.split_delimited_string(t.Div1,',') l
WHERE l.pos_id = 10;

Upvotes: 0

Related Questions