Reputation: 788
I have an XML document which is valid against an XML schema. The XML schema has group elements (xs:group). These groups are composed of other defined elements. How can I write an XPath expression which will give me all the members of a specified group?
Any ideas?
@Steve:
Assume that my XML schema defines four elements (elem1, elem2, elem3, elem4). In addition, two groups are defined as follows:
group1: (elem1 | elem2 | elem3)
group2: (elem1 | elem4)
I hope you know some regular expressions. If not, 'group2: (elem1 | elem4)' simply means that group2 consists of EITHER an elem1 OR an elem4.
My question is: if I have an XML document like this:
<elem1/>
<elem2/>
<elem3/>
<elem4/>
<elem2/>
<elem1/>
<elem3/>
How can I list the elements in that document which belong to, say, group1?
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:element ref="elem0"/>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:group ref="A1"/>
<xs:group ref="A2"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="elem0" type="xs:string"/>
<xs:group name="A1">
<xs:choice>
<xs:element ref="elem10"/>
<xs:element ref="elem11"/>
</xs:choice>
</xs:group>
<xs:element name="elem10" type="xs:string"/>
<xs:element name="elem11" type="xs:string"/>
<xs:group name="A2">
<xs:choice>
<xs:element ref="elem20"/>
<xs:element ref="elem21"/>
<xs:element ref="elem22"/>
<xs:element ref="elem23"/>
</xs:choice>
</xs:group>
<xs:group name="CE">
<xs:choice>
<xs:element ref="elem30"/>
<xs:element ref="elem31"/>
<xs:element ref="elem32"/>
</xs:choice>
</xs:group>
<xs:group name="E">
<xs:choice>
<xs:element ref="elem30"/>
<xs:element ref="elem40"/>
</xs:choice>
</xs:group>
<xs:element name="elem20">
<xs:complexType>
<xs:sequence>
<xs:group minOccurs="2" maxOccurs="unbounded" ref="CE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="elem21">
<xs:complexType>
<xs:sequence>
<xs:group minOccurs="2" maxOccurs="2" ref="CE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="elem22">
<xs:complexType>
<xs:sequence>
<xs:element ref="elem40"/>
<xs:group ref="CE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="elem23">
<xs:complexType>
<xs:sequence>
<xs:element ref="elem40"/>
<xs:element ref="elem40"/>
</xs:sequence>
<!-- <xs:attribute name="prop" use="required" type="xs:NMTOKEN"/> -->
</xs:complexType>
</xs:element>
<xs:element name="elem31">
<xs:complexType>
<xs:sequence>
<xs:group minOccurs="0" maxOccurs="unbounded" ref="CE"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="elem32">
<xs:complexType>
<xs:sequence>
<xs:group ref="CE"/>
</xs:sequence>
<!-- <xs:attribute name="prop" use="required"/> -->
</xs:complexType>
</xs:element>
<xs:element name="elem30">
<xs:complexType>
<xs:attribute name="name" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="elem40">
<xs:complexType>
<xs:attribute name="name" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
Upvotes: 1
Views: 4549
Reputation: 31652
OK... there are a few things I think we need to clarify in your example. They may seem like small points, but they aren't, and if your schema follows the rules, constructing the XPath expression is straightforward. I'll first show how to build basic XPath expressions that take groups into account for valid schemas, and then explain the problem I have with your example.
Let's take it in steps.
First, let's assume that you have a schema which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:group ref="group1"/>
<xs:group ref="group2"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:group name="group1">
<xs:sequence>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
<xs:element name="elem3"/>
</xs:sequence>
</xs:group>
<xs:group name="group2">
<xs:sequence>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
</xs:sequence>
</xs:group>
</xs:schema>
In this case, the important thing to notice is that we have a sequence of group1 followed by group2, both of which are sequences of elements.
With a sequence (and no minOccurs="0" attribute on any of the group elements, which would be invalid anyway, as I'll explain later), selecting the required elements is trivial.
To select all elements of group1 we might simply use the following XPath (the parenthesized union inside a path step requires XPath 2.0 or later):
/root/(elem1[1]|elem2[1]|elem3)
This works because we know the resulting XML will always be:
<root>
<elem1 />
<elem2 />
<elem3 />
<elem1 />
<elem2 />
</root>
So, that's fine. We can always select the first elem1, the first elem2, and the elem3.
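As a quick check of that selection, here is a minimal Python sketch using only the standard library. It is an illustration, not part of the original example: `xml.etree.ElementTree` does not accept the parenthesized union, so the sketch looks up each name separately, and the variable names (`SEQUENCE_DOC`, `seq_group1`, and so on) are made up for this purpose.

```python
import xml.etree.ElementTree as ET

# The instance document that the sequence-based schema always produces.
SEQUENCE_DOC = "<root><elem1/><elem2/><elem3/><elem1/><elem2/></root>"

seq_root = ET.fromstring(SEQUENCE_DOC)
seq_children = list(seq_root)

# find() returns the first matching child, which mirrors
# elem1[1], elem2[1] and elem3 in the XPath above.
seq_group1 = [seq_root.find(name) for name in ("elem1", "elem2", "elem3")]

# In this schema, whatever follows the group1 members belongs to group2.
seq_group2 = seq_children[len(seq_group1):]

print([e.tag for e in seq_group1])
print([e.tag for e in seq_group2])
```

This only works because the schema fixes the order and count of the children; it is not a general way to recover group membership.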
Let's assume that, instead of sequences, those groups contained choices, with the schema looking like this:
(This is more along the lines of the schema you put in your example, where "group2 consists of EITHER an elem1 OR an elem4.")
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:group ref="group1"/>
<xs:group ref="group2"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:group name="group1">
<xs:choice>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
<xs:element name="elem3"/>
</xs:choice>
</xs:group>
<xs:group name="group2">
<xs:choice>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
</xs:choice>
</xs:group>
</xs:schema>
In this case, the XPath is still trivial to construct, because we know there will only be two elements: the first will belong to group1 and the second to group2, like so:
<root>
<elem2 />
<elem1 />
</root>
So the group1 XPath is even simpler:
/root/*[1]
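The same positional idea can be sketched in Python with the standard library. This is only an illustration under the assumption that the document matches the choice-based schema above; plain child indexing stands in for `/root/*[1]` and `/root/*[2]`, which avoids relying on the limited XPath subset in `xml.etree.ElementTree`. The names `CHOICE_DOC`, `group1_member`, and `group2_member` are hypothetical.

```python
import xml.etree.ElementTree as ET

# One valid instance of the choice-based schema: exactly two children.
CHOICE_DOC = "<root><elem2/><elem1/></root>"

choice_root = ET.fromstring(CHOICE_DOC)

# Position alone decides the group: first child is group1, second is group2.
group1_member = choice_root[0]
group2_member = choice_root[1]

print(group1_member.tag, group2_member.tag)
```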
Here is where it might get confusing - and where, I believe, the source of your confusion comes in.
In your example, you basically have suggested the following schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="root">
<xs:complexType>
<xs:sequence>
<xs:group ref="group1" maxOccurs="unbounded"/>
<xs:group ref="group2" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:group name="group1">
<xs:choice>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
<xs:element name="elem3"/>
</xs:choice>
</xs:group>
<xs:group name="group2">
<xs:choice>
<xs:element name="elem1"/>
<xs:element name="elem2"/>
</xs:choice>
</xs:group>
</xs:schema>
This schema is invalid. (Notice the addition of the maxOccurs="unbounded" attribute on the groups. This is similar to your example, where you show multiple elements from one group occurring.)
Why? Well, because it creates a potential ambiguity in the resulting XML.
For example, how should we parse the following XML instance:
<root>
<elem2 />
<elem1 />
<elem1 />
<elem2 />
</root>
Is that:
group1, group1, group1, group2
group1, group1, group2, group2
group1, group2, group2, group2
We just don't know.
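To make the ambiguity concrete, here is a small Python sketch that enumerates every way the four elements could be divided between the two groups. It assumes the content model of the schema above (one or more group1 items followed by one or more group2 items); the function name `possible_splits` and the constants are invented for this illustration.

```python
# Element names each group may contain, taken from the schema above.
GROUP1 = {"elem1", "elem2", "elem3"}
GROUP2 = {"elem1", "elem2"}


def possible_splits(names):
    """Return each k where names[:k] fits group1 and names[k:] fits group2.

    Both parts must be non-empty, because each group reference has the
    default minOccurs of 1.
    """
    return [
        k
        for k in range(1, len(names))
        if set(names[:k]) <= GROUP1 and set(names[k:]) <= GROUP2
    ]


instance = ["elem2", "elem1", "elem1", "elem2"]
for k in possible_splits(instance):
    print(["group1"] * k + ["group2"] * (len(instance) - k))
```

More than one split comes back for this instance, so no expression can say which elements "belong" to group1.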
But the designers of XML Schemas thought about that and made a rule for it:
http://en.wikipedia.org/wiki/Unique_Particle_Attribution
And your hypothetical schema violates that rule.
Now, XML Schema 1.1 does make some improvements in this area. However, there are still situations where you can easily create similar ambiguities.
In your example, if no elem3 or elem4 elements are present in the XML, it's impossible to tell where group1 ends and group2 begins.
Now, if all you want to do is select elements with a particular name, you can do that easily:
/root/(elementName1|elementName2|elementName3)
would select all elements under root with the names elementName1, elementName2, or elementName3.
So, in your example, something like (elem1|elem2|elem3) would be just fine.
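For comparison, here is a minimal Python sketch of the same by-name selection using only the standard library. It assumes the element list from the question is wrapped in a `<root>` element (the question shows no wrapper, so that is a guess), and `select_by_name` is a hypothetical helper, not an existing API.

```python
import xml.etree.ElementTree as ET

# The question's element list, wrapped in an assumed <root> so that it parses.
QUESTION_DOC = (
    "<root><elem1/><elem2/><elem3/><elem4/>"
    "<elem2/><elem1/><elem3/></root>"
)


def select_by_name(parent, names):
    """Return the direct children of parent whose tag is in names, in document order."""
    wanted = set(names)
    return [child for child in parent if child.tag in wanted]


question_root = ET.fromstring(QUESTION_DOC)
matches = select_by_name(question_root, ["elem1", "elem2", "elem3"])
print([m.tag for m in matches])
```

Note that this filters purely by element name; it says nothing about which group a given element was validated against.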
But that's not what you asked. What you asked about was selecting by group, and the example you provided made it impossible to give you a real by-group answer.
If you have a real, valid schema, and you need help constructing the XPath, please paste that schema, and I'll be happy to help.
Upvotes: 4