Reputation: 5039
So I got a varbinary(max)
column in SQL Server 2005 and it's full of XML. A few records somewhere have truncated XML so they're invalid.
This means if I run a
SELECT CAST(myVarbinaryColumn as XML) ...
it blows chunks.
How can I filter out/skip invalid xml ?
When I've done similar with a varchar that supposedly has dates I could use ISDATE(blah) = 1
. So an equivalent ISVALIDXML()
would be nice.
Please don't comment about "Why isn't the column XML datatype anyway.." This happened in the past and I don't have a time machine.
Upvotes: 7
Views: 793
Reputation: 135938
I think your best bet would be to write a custom CLR function, perhaps using XmlDocument.Load. In the CLR you could trap the error on a failed load and return an appropriate result.
EDIT: The code below would also work although it's not as elegant as a UDF. Unfortunately, we can't use TRY/CATCH in a UDF.
create procedure dbo.usp_IsValidXML(@XMLCandidate varbinary(max), @Return bit output)
as
begin
declare @x xml
begin try
set @x = cast(@XMLCandidate as xml)
set @Return = 1
end try
begin catch
set @Return = 0
end catch
end
go
declare @test1 varbinary(max)
declare @test2 varbinary(max)
set @test1 = cast('<data>asdf</data>' as varbinary(max))
set @test2 = cast('<data>asdf</da' as varbinary(max))
declare @IsValid bit
exec dbo.usp_IsValidXML @test1, @IsValid output
select @IsValid
exec dbo.usp_IsValidXML @test2, @IsValid output
select @IsValid
drop procedure dbo.usp_IsValidXML
Upvotes: 1
Reputation: 755491
I wish I had that IsValidXML()
function, too..... unfortunately, I don't think there's anything like that.....
Just a thought: is there anything you could check for to filter out the invalid XML??
E.g. could you filter out all those strings that don't end in .....</data>
or something like that?? (seeing that you say your invalid XML is typically truncated XML, so I would think the closing tag - </data>
or whatever - would be missing in those cases).
Upvotes: 1