Justin Rusbatch
Justin Rusbatch

Reputation: 4052

Validating individual XML elements in SQL Server 2008R2

I'm writing a stored procedure to process XML data uploaded by the user:

<People>
    <Person Id="1" FirstName="..." LastName="..." />
    <Person Id="2" FirstName="..." LastName="..." />
    <Person Id="3" FirstName="..." LastName="..." />
    <Person Id="4" FirstName="..." LastName="..." />
    <Person Id="5" FirstName="..." LastName="..." />
</People>

I would like to use a schema to make sure that the entities are valid, but I don't want the entire process to fail just because of one invalid entity. Instead, I would like to log all invalid entities to a table and process the valid entities as normal.

Is there a recommended way to do this?

Upvotes: 0

Views: 654

Answers (1)

GSerg
GSerg

Reputation: 78210

A pure SQL approach would be:

  1. Create a schema collection that defines <Person>:

    CREATE XML SCHEMA COLLECTION [dbo].[testtest] AS
    N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="Person">
          <xs:complexType>
            <xs:attribute name="Id" type="xs:int" use="required"/>
            <xs:attribute name="FirstName" type="xs:string" use="required"/>
            <xs:attribute name="LastName" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:schema>
     '
    

    (one-time operation)

  2. Have an XML query that selects each <Person> node from <People> as a separate row.

  3. Declare a cursor on that query and select each row into an untyped xml variable. After the select, try to assign to a typed xml variable from within a try-catch block.

Resulting code would look like:

declare @source xml = N'
    <People>
        <Person Id="1" FirstName="..." LastName="..." />
        <Person Id="2" FirstName="..." LastName="..." />
        <Person Id="f" FirstName="..." LastName="..." />
        <Person Id="4" FirstName="..." LastName="..." />
        <Person Id="5" FirstName="..." LastName="..." />
    </People>';

declare foo cursor
local
forward_only
read_only
for
  select t.p.query('.')
  from @source.nodes('People/Person') as t(p)
;

declare @x xml (dbo.testtest);
declare @x_raw xml;

open foo;

fetch next from foo into @x_raw;

while @@fetch_status = 0
begin
  begin try
    set @x = @x_raw;
    print cast(@x_raw as nvarchar(max)) + ': OK';
  end try
  begin catch
    print cast(@x_raw as nvarchar(max)) + ': FAILED';
  end catch;

  fetch next from foo into @x_raw;
end;

close foo;
deallocate foo;

Result:

<Person Id="1" FirstName="..." LastName="..."/>: OK
<Person Id="2" FirstName="..." LastName="..."/>: OK
<Person Id="f" FirstName="..." LastName="..."/>: FAILED
<Person Id="4" FirstName="..." LastName="..."/>: OK
<Person Id="5" FirstName="..." LastName="..."/>: OK


A simpler option is to create a CLR stored procedure that would parse XML in a .NET language.

Upvotes: 1

Related Questions