Reputation: 1061
I have some valid and invalid xml values stored in a varchar column.
I'd like to cast the valid xml values to the actual xml data type and invalid to nulls.
What's a good way to do this ?
Something like:
SELECT
CASE WHEN dbo.isValidXML(xml_data) THEN CAST(xml_data as XML)
ELSE null
END
Upvotes: 13
Views: 45205
Reputation: 121902
Try to use sp_xml_preparedocument
-
SET NOCOUNT ON;
DECLARE @XML NVARCHAR(MAX)
SELECT @XML = '<t>test</'
DECLARE @hDoc INT
BEGIN TRY
EXEC sys.sp_xml_preparedocument @hDoc OUTPUT, @XML
SELECT '"' + @XML + '" is valid'
EXEC sys.sp_xml_removedocument @hDoc
END TRY
BEGIN CATCH
SELECT '"' + @XML + '" is invalid'
END CATCH
SELECT @XML = '<t>test</t>'
BEGIN TRY
EXEC sys.sp_xml_preparedocument @hDoc OUTPUT, @XML
SELECT '"' + @XML + '" is valid'
EXEC sys.sp_xml_removedocument @hDoc
END TRY
BEGIN CATCH
SELECT '"' + @XML + '" is invalid'
END CATCH
Output -
-------------------------
"<t>test</" is invalid
-------------------------
"<t>test</t>" is valid
Upvotes: 3
Reputation: 1202
In response solely to this:
Yes, but the beef of my question is how do I check is XML is valid in the first place?
Is your XML invalid in the sense that you'll get a break like the second line below this:
SELECT CAST('<xml>Yep this is xml</xml>' AS XML)
SELECT CAST('<xml>Nope, not xml</x' AS XML)
One solution I see is a row-by-row approach, where you try and CAST
a row as XML
, and if it casts as XML
successfully insert the valid row into a table with valid XML
values, if it won't cast correctly, the value isn't inserted. See this thread for examples.
Upvotes: 9
Reputation: 18749
If you need to keep the XML whether valid or invalid, you could have a flag on the table like IsValid (bit)
. The loading process could then do a one off check on this data to check if the XML is valid and set the flag based on the result. This way you can get the valid XML by using the query:
SELECT
CASE WHEN IsValid = 1 THEN CAST(xml_data as XML)
ELSE null
END
If you don't need to keep invalid XML, then just reject it during the loading, setting the field to NULL
, so the query would look like:
SELECT
CASE WHEN xml_data IS NOT NULL THEN CAST(xml_data as XML)
ELSE null
END
Both ways will give better performance that passing through a SQL function
for each query.
Upvotes: 0