David
David

Reputation: 1061

t-sql: convert varchar to xml

I have some valid and invalid xml values stored in a varchar column.

I'd like to cast the valid xml values to the actual xml data type and invalid to nulls.

What's a good way to do this ?

Something like:

SELECT 
    CASE WHEN dbo.isValidXML(xml_data) THEN CAST(xml_data as XML)
    ELSE null 
END

Upvotes: 13

Views: 45205

Answers (3)

Devart
Devart

Reputation: 121902

Try to use sp_xml_preparedocument -

SET NOCOUNT ON;

DECLARE @XML NVARCHAR(MAX)
SELECT @XML = '<t>test</'

DECLARE @hDoc INT

BEGIN TRY

     EXEC sys.sp_xml_preparedocument @hDoc OUTPUT, @XML 
     SELECT '"' + @XML + '" is valid'
     EXEC sys.sp_xml_removedocument @hDoc

END TRY
BEGIN CATCH
     SELECT '"' + @XML + '" is invalid'     
END CATCH

SELECT @XML = '<t>test</t>'

BEGIN TRY

     EXEC sys.sp_xml_preparedocument @hDoc OUTPUT, @XML 
     SELECT '"' + @XML + '" is valid'
     EXEC sys.sp_xml_removedocument @hDoc

END TRY
BEGIN CATCH
     SELECT '"' + @XML + '" is invalid'     
END CATCH

Output -

-------------------------
"<t>test</" is invalid

-------------------------
"<t>test</t>" is valid

Upvotes: 3

Question3CPO
Question3CPO

Reputation: 1202

In response solely to this:

Yes, but the beef of my question is how do I check is XML is valid in the first place?

Is your XML invalid in the sense that you'll get a break like the second line below this:

SELECT CAST('<xml>Yep this is xml</xml>' AS XML)
SELECT CAST('<xml>Nope, not xml</x' AS XML)

One solution I see is a row-by-row approach, where you try and CAST a row as XML, and if it casts as XML successfully insert the valid row into a table with valid XML values, if it won't cast correctly, the value isn't inserted. See this thread for examples.

Upvotes: 9

Christian Phillips
Christian Phillips

Reputation: 18749

If you need to keep the XML whether valid or invalid, you could have a flag on the table like IsValid (bit). The loading process could then do a one off check on this data to check if the XML is valid and set the flag based on the result. This way you can get the valid XML by using the query:

SELECT 
    CASE WHEN IsValid = 1 THEN CAST(xml_data as XML)
    ELSE null 
END

If you don't need to keep invalid XML, then just reject it during the loading, setting the field to NULL, so the query would look like:

SELECT 
    CASE WHEN xml_data IS NOT NULL THEN CAST(xml_data as XML)
    ELSE null 
END

Both ways will give better performance that passing through a SQL function for each query.

Upvotes: 0

Related Questions