ashok

Reputation: 1268

Using XML namespace with XmlSlurper in Groovy - how to query path correctly?

I have the following sample XML:

<root>

<table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table xmlns:f="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

</root>

def slurper = new XmlSlurper().parseText(someXMLText)
def hNs = new groovy.xml.Namespace(
                    "http://www.w3.org/TR/html4/", 'h')
def fNs = new groovy.xml.Namespace(
                    "https://www.w3schools.com/furniture", 'f')
println slurper.root[hNs.table].tr.td //not giving any response

There are two table tags, each declaring a different namespace. How can I fetch the Apples value under the first table using GPath with a namespace?

Upvotes: 1

Views: 822

Answers (1)

Szymon Stepniak

Reputation: 42184

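The XML document does not use its namespaces the way your query expects. A declaration like xmlns:h="http://www.w3.org/TR/html4/" only binds the prefix h to that namespace URI; it does not put the declaring element into the namespace. An element belongs to the namespace only when its name carries the prefix, so in your document both table elements (and their children) are in no namespace, and a namespace-qualified query cannot match them. To make any use of the prefix, you need to assign it to at least the table tag: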

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</h:table>
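
To see why the query from the question prints nothing, here is a minimal sketch run against the question's document (trimmed slightly). It assumes Groovy 3 or newer for the import; the names questionXml and doc are only illustrative. It relies on two facts: parseText already returns the root element, so doc.root looks for a child named root that does not exist, and both table elements sit in no namespace because the declared prefixes are never used.

```groovy
// Assumption: Groovy 3+. On Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

// The question's document: prefixes h and f are declared but never used on any element.
def questionXml = '''<root>
<table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
<table xmlns:f="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
</table>
</root>'''

def doc = new XmlSlurper().parseText(questionXml)

// parseText returns the root element itself, so there is no child called "root".
assert doc.name() == 'root'
assert doc.root.size() == 0

// Both tables are in no namespace, and an unprefixed selector matches both of them.
assert doc.table.size() == 2
assert doc.table.every { it.namespaceURI() == '' }

// Without changing the XML, the value is only reachable by position.
assert doc.table[0].tr.td[0].text() == 'Apples'
```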

Alternatively, if you want each table node (and its child nodes) to live in a default namespace, drop the prefix and declare the namespace with a plain xmlns attribute:

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

Note the slight difference: the second example declares the namespace with the xmlns attribute, not with xmlns:h as in the previous case.

When the document uses default namespaces, you can call the declareNamespace method to bind prefixes of your own choosing to those namespace URIs. A selector like h:table then refers to a table tag in the namespace bound to the h prefix in the declared namespaces map. Consider the following example:

def source = '''<root>

<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table xmlns="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

</root>'''

def root = new XmlSlurper().parseText(source).declareNamespace([
    h: "http://www.w3.org/TR/html4/", 
    f: "https://www.w3schools.com/furniture"
])

assert root."h:table".tr.td.first().text() == "Apples"
assert root."h:table".tr.td.last().text() == "Bananas"
assert root."f:table".width.toInteger() == 80

In this example, we use an XML document that defines two different default namespaces for table tags. With the declareNamespace method, we can define prefixes for those namespaces so we can use the prefix in the tag selector.
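
As a rough illustration of what the prefix buys you, the sketch below contrasts an unprefixed selector with the prefixed ones on a trimmed copy of the same document. It assumes Groovy 3 or newer for the import, and the names defaultNsXml, parsed and fruits are only illustrative.

```groovy
// Assumption: Groovy 3+. On Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

def defaultNsXml = '''<root>
<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>
<table xmlns="https://www.w3schools.com/furniture">
  <width>80</width>
</table>
</root>'''

def parsed = new XmlSlurper().parseText(defaultNsXml).declareNamespace([
    h: "http://www.w3.org/TR/html4/",
    f: "https://www.w3schools.com/furniture"
])

// An unprefixed selector ignores namespaces and matches both tables.
assert parsed.table.size() == 2

// A prefixed selector matches only the table in the namespace bound to that prefix.
assert parsed."h:table".size() == 1
assert parsed."f:table".size() == 1

// Collect every cell of the HTML table into a plain list of strings.
def fruits = parsed."h:table".tr.td.collect { it.text() }
assert fruits == ["Apples", "Bananas"]
```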

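If, for some reason, you need to declare the namespace with a prefix at the table node level, you must use that prefix on at least the table element itself. The unprefixed tr and td children then stay in no namespace, and unprefixed selectors such as tr and td still match them: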

def source = '''<root>

<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</h:table>

<f:table xmlns:f="https://www.w3schools.com/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</f:table>

</root>'''

def root = new XmlSlurper().parseText(source).declareNamespace([
    h: "http://www.w3.org/TR/html4/",
    f: "https://www.w3schools.com/furniture"
])

assert root."h:table".tr.td.first().text() == "Apples"
assert root."h:table".tr.td.last().text() == "Bananas"
assert root."f:table".width.toInteger() == 80
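
To check which elements actually end up in a namespace in the prefixed variant, you can inspect namespaceURI() on the nodes. This is a small sketch under the same assumptions as above (Groovy 3 or newer for the import; prefixedXml, prefixedRoot and hTable are illustrative names): only the prefixed table element carries the namespace, while its unprefixed children report an empty namespace URI.

```groovy
// Assumption: Groovy 3+. On Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

def prefixedXml = '''<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</h:table>
<f:table xmlns:f="https://www.w3schools.com/furniture">
  <width>80</width>
</f:table>
</root>'''

def prefixedRoot = new XmlSlurper().parseText(prefixedXml).declareNamespace([
    h: "http://www.w3.org/TR/html4/",
    f: "https://www.w3schools.com/furniture"
])

def hTable = prefixedRoot."h:table"[0]

// The prefixed element is in the namespace bound to h ...
assert hTable.namespaceURI() == "http://www.w3.org/TR/html4/"

// ... but its unprefixed children are in no namespace at all.
assert hTable.tr[0].namespaceURI() == ""
assert hTable.tr.td[0].namespaceURI() == ""
```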

Hope it helps.

Upvotes: 2
