mrq

Reputation: 13

Get all XML nodes of given XSD type

I would like to get all XML nodes of a given XSD type.

For example (see code snippets below)

Is there a Java library that provides this functionality?

Or are there any ideas on how to solve this manually? The XSD can be very complex, with imports of other schemas, etc. I was thinking about generating all possible XPaths to nodes of the given type by traversing the XSD schema (there will be no recursion), then applying them to the XML file and checking whether any nodes are found.
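To make the manual idea concrete, here is a minimal sketch using only the JDK. The class TypePathSketch and its methods are hypothetical names, not an existing library. It assumes a single schema document with no target namespace, no imports, no element references (ref=), no type extension and no recursion, so it is far from covering a complex real-world XSD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: derives XPaths for a named complex type from a simple XSD.
final class TypePathSketch {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    static Document parse(String text) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(text)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Direct xs:<local> children of parent (used for global declarations and inline types).
    static List<Element> directChildren(Element parent, String local) {
        List<Element> out = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && XS.equals(c.getNamespaceURI())
                    && local.equals(c.getLocalName())) {
                out.add((Element) c);
            }
        }
        return out;
    }

    // xs:element declarations inside a type definition, without descending into them.
    static void localElements(Element node, List<Element> out) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (!(c instanceof Element)) continue;
            if (XS.equals(c.getNamespaceURI()) && "element".equals(c.getLocalName())) {
                out.add((Element) c);
            } else {
                localElements((Element) c, out);
            }
        }
    }

    // Walks one element declaration; assumes the schema is not recursive.
    static void walk(Element decl, String prefix, String target,
                     Map<String, Element> types, List<String> out) {
        String path = prefix + "/" + decl.getAttribute("name");
        String type = decl.getAttribute("type");
        if (type.equals(target)) out.add(path);
        List<Element> inline = directChildren(decl, "complexType");
        Element def = !inline.isEmpty() ? inline.get(0) : types.get(type);
        if (def == null) return; // built-in or unknown type: nothing to descend into
        List<Element> kids = new ArrayList<>();
        localElements(def, kids);
        for (Element kid : kids) walk(kid, path, target, types, out);
    }

    // All XPaths that lead to elements declared with the given named type.
    static List<String> pathsForType(String xsd, String typeName) {
        Element schema = parse(xsd).getDocumentElement();
        Map<String, Element> types = new HashMap<>();
        for (Element t : directChildren(schema, "complexType")) {
            types.put(t.getAttribute("name"), t);
        }
        List<String> out = new ArrayList<>();
        for (Element e : directChildren(schema, "element")) walk(e, "", typeName, types, out);
        return out;
    }

    // Applies the generated XPaths to the instance document.
    static List<Node> nodesOfType(String xsd, String xml, String typeName) {
        Document doc = parse(xml);
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<Node> hits = new ArrayList<>();
        try {
            for (String p : pathsForType(xsd, typeName)) {
                NodeList found = (NodeList) xpath.evaluate(p, doc, XPathConstants.NODESET);
                for (int i = 0; i < found.getLength(); i++) hits.add(found.item(i));
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return hits;
    }
}
```

Calling TypePathSketch.nodesOfType(xsd, xml, "ItemType") with the schema and document text as strings would return the matching DOM nodes. Supporting imports, namespaces, groups or derived types would need considerably more work than this sketch shows.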

XSD example

<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>


  <xs:complexType name="ListA">
    <xs:sequence>
      <xs:element name="ItemA" type="ItemType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="ListB">
    <xs:sequence>
      <xs:element name="ItemB" type="ItemType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType> 

  <xs:complexType name="AnotherList">
    <xs:sequence>
      <xs:element name="ItemA" type="CustomItemType" maxOccurs="unbounded"/>
      <xs:element name="ItemB" type="CustomItemType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType> 

  <xs:complexType name="ItemType">
    <xs:sequence>
      <xs:element name="ID"  type="xs:string" />
      <xs:element name="Value" type="xs:string" />      
    </xs:sequence> 
  </xs:complexType> 

  <xs:complexType name="CustomItemType">
    <xs:sequence>
      <xs:element name="ID"  type="xs:string" />
      <xs:element name="Value" type="xs:string" />      
    </xs:sequence> 
  </xs:complexType>   

  <xs:element name="MyLists">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="MyListA" type="ListA" />  
        <xs:element name="MyListB" type="ListB" />
        <xs:element name="MyListC" type="AnotherList" />
      </xs:sequence>
    </xs:complexType>  
  </xs:element>  
</xs:schema>

XML example

<MyLists>
  <MyListA>
    <ItemA>
      <ID>1</ID>
      <Value>A1</Value>
    </ItemA>
    <ItemA>
      <ID>2</ID>
      <Value>A2</Value>
    </ItemA>
  </MyListA>
  <MyListB>
    <ItemB>
      <ID>1</ID>
      <Value>B1</Value>
    </ItemB>
    <ItemB>
      <ID>2</ID>
      <Value>B2</Value>
    </ItemB>
  </MyListB>
  <MyListC>
    <ItemA>
      <ID>1</ID>
      <Value>A1</Value>
    </ItemA>
    <ItemB>
      <ID>2</ID>
      <Value>B1</Value>
    </ItemB>
  </MyListC>
</MyLists>

Upvotes: 0

Views: 1413

Answers (1)

Martin Honnen

Reputation: 167516

You can solve this with schema-aware XPath 2.0 (or later) or schema-aware XQuery 1.0 (or later), using a test such as //element(*, YourGlobalTypeName) (https://www.w3.org/TR/xpath20/#prod-xpath-ElementTest). With your sample, //element(*, ListA) returns one element and //element(*, ItemType) returns four elements. In the Java world, schema-aware XPath 2.0/3.0/3.1 and XQuery 1.0/3.0/3.1 are supported by Saxon 9 EE. There are also other XQuery implementations such as eXist-db and BaseX, but I am not sure whether they support schema-aware XQuery.
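If a schema-aware XPath processor is not available, a rougher alternative is to let the JDK's own validator report each element's type while it validates. Note that this is a different technique from the element(*, Type) test described above. The sketch below uses javax.xml.validation.ValidatorHandler and its TypeInfoProvider; the class name TypeInfoSketch and the method elementsOfType are hypothetical. It compares only the local type name, ignores the type's namespace, and does not treat derived types as matches (which element(*, Type) would).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.TypeInfoProvider;
import javax.xml.validation.ValidatorHandler;
import org.w3c.dom.TypeInfo;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists the names of elements whose declared XSD type matches typeName.
final class TypeInfoSketch {
    static List<String> elementsOfType(String xsd, String xml, String typeName) {
        List<String> hits = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            ValidatorHandler validator = schema.newValidatorHandler();
            TypeInfoProvider types = validator.getTypeInfoProvider();

            // The validator forwards SAX events here; type info is only
            // available while a startElement/endElement callback is running.
            validator.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    TypeInfo info = types.getElementTypeInfo();
                    // Assumption: matching on the local type name is enough (no namespaces).
                    if (info != null && typeName.equals(info.getTypeName())) {
                        hits.add(local);
                    }
                }
            });

            SAXParserFactory parsers = SAXParserFactory.newInstance();
            parsers.setNamespaceAware(true); // required for schema validation
            XMLReader reader = parsers.newSAXParser().getXMLReader();
            reader.setContentHandler(validator);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("validation or parsing failed", e);
        }
        return hits;
    }
}
```

Because the schema is compiled by SchemaFactory, imports and includes are resolved by the validator rather than by hand, provided the schema is loaded from a location where they can be found.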

Upvotes: 2
