ro ra

Reputation: 419

Groovy: parsing XML and storing attribute values to variables

I have to read an XML document and store its attribute values in variables. As I am new to Groovy, I am finding this difficult. This is what I have so far.

PS: I am doing this inside a method.

XmlParser parser = new XmlParser()
//def xmldata1 = parser.parse (new FileInputStream("c:\\temp\\create.xml"))

def xml = """<record_change>
<record_id>04707317-28e3-40cb-a227-15d2c91e08c2</record_id> 
<incident number="201507-001" status="open"/> 
</record_change>"""

def xmldata1 = parser.parseText(xml)
def n_id = xmldata1.record_id.?????
def n_number = ????? //expecting incident number
def m_status = ????? // expecting incident status

Could someone help me?

Upvotes: 1

Views: 3919

Answers (2)

Mark Fisher

Reputation: 9886

The short answer is the following:

def n_id     = xmldata1.record_id[0].text()
def n_number = xmldata1.incident[0].@number
def m_status = xmldata1.incident[0].@status

Note that this differs slightly from Jayan's answer: each NodeList is indexed to get an individual Node. In your case the index makes no difference for n_id, but you need it for n_number and m_status, otherwise they are lists rather than single values. You will also run into problems if your data ever has more than one Node in the NodeList. See the text() example later in my answer.
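As a minimal sketch of that difference, the script below runs both forms against the XML from the question. The import line is an assumption about your Groovy version: groovy.xml.XmlParser exists from Groovy 3 onward, while on older versions you can drop the import and rely on groovy.util.XmlParser being resolved by default.

```groovy
// Assumption: Groovy 3 or later. On Groovy 2.x, remove this import.
import groovy.xml.XmlParser

def xml = '''<record_change>
<record_id>04707317-28e3-40cb-a227-15d2c91e08c2</record_id>
<incident number="201507-001" status="open"/>
</record_change>'''

def xmldata1 = new XmlParser().parseText(xml)

// Without an index, @number is applied to the whole NodeList and yields a list.
def numbers = xmldata1.incident.@number
assert numbers instanceof List
assert numbers == ['201507-001']

// With an index, the same expression yields the single attribute value.
def n_number = xmldata1.incident[0].@number
assert n_number == '201507-001'
```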

For this type of investigation, your best bet is using groovysh, and then directly interrogating the objects.

Here's a session showing this in action:

groovy:000> p = new XmlParser()
===> groovy.util.XmlParser@292ea3d5

groovy:000> xml = '<record_change><record_id>04707317-28e3-40cb-a227-15d2c91e08c2</record_id><incident number="201507-001" status="open"/></record_change>'
===> <record_change><record_id>04707317-28e3-40cb-a227-15d2c91e08c2</record_id><incident number="201507-001" status="open"/></record_change>

groovy:000> d = p.parseText(xml)
===> record_change[attributes={}; value=[record_id[attributes={}; value=[04707317-28e3-40cb-a227-15d2c91e08c2]], incident[attributes={number=201507-001, status=open}; value=[]]]]

Now we have the parsed XML in a local variable d. You can already see from its toString() some of the parts it has been broken up into.

The returned d is a groovy.util.Node:

groovy:000> d.getClass()
===> class groovy.util.Node

d.record_id is a groovy.util.NodeList:

groovy:000> d.record_id.getClass()
===> class groovy.util.NodeList

A NodeList has a text() method, which "Returns the textual representation of the current node and all its child nodes." That last part, "... and all its child nodes", should ring alarm bells. There is also a text() method on Node, so if you want to be certain you only get the text of one node, index the NodeList:

groovy:000> d.record_id[0].text()
===> 04707317-28e3-40cb-a227-15d2c91e08c2

If you had something like:

<record_change>
    <record_id>1111</record_id>
    <record_id>2222</record_id>
</record_change>

Then calling text() on just record_id would give:

groovy:000> d.record_id.text()
===> 11112222

which probably isn't what you want. Always be clear about whether you're working on a Node or a NodeList. If you know there is only one node in the list, calling text() on the list is fine.
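If you do expect repeated elements, one option is to collect the text of each Node separately instead of calling text() on the whole NodeList. This is a small sketch using the two-record_id sample above; the variable name ids is just for illustration, and the import assumes Groovy 3 or later (drop it on older versions).

```groovy
// Assumption: Groovy 3 or later. On Groovy 2.x, remove this import.
import groovy.xml.XmlParser

def d = new XmlParser().parseText('''<record_change>
    <record_id>1111</record_id>
    <record_id>2222</record_id>
</record_change>''')

// text() on the NodeList concatenates the text of every matching node.
assert d.record_id.text() == '11112222'

// Calling text() on each Node keeps the values separate.
def ids = d.record_id.collect { it.text() }
assert ids == ['1111', '2222']
```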

The incident is also a NodeList, but it has no text part, just attributes, as we can see from its toString():

groovy:000> d.incident
===> [incident[attributes={number=201507-001, status=open}; value=[]]]

The text() method returns the part in value=[], which is empty here. The attributes are accessible using the @attributeName syntax. Before using that, we can call attributes() to inspect their names and values (it returns a regular Map). This is a method on Node, not NodeList, so we have to index the NodeList with d.incident[0] to get the Node. Thus we have:

groovy:000> d.incident[0].attributes()
===> [number:201507-001, status:open]

groovy:000> d.incident[0].attributes().keySet()
===> [number, status]

Directly, you can get the attributes like this:

groovy:000> d.incident.@number
===> [201507-001]

Ah! We have invoked @number on the NodeList, so it returns a List of the values (a single one in your case, but with more complex data there will be several). If you want the individual value on its own, index the appropriate element of the NodeList as before:

groovy:000> d.incident[0].@number
===> 201507-001

So be careful with indexing to get the individual Node element.
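When there may be several incident elements, a hedged alternative to a hard-coded [0] is to iterate over the NodeList or pick the Node you want by one of its attributes. The second incident below (201507-002, closed) is made-up data for illustration only, and the import assumes Groovy 3 or later.

```groovy
// Assumption: Groovy 3 or later. On Groovy 2.x, remove this import.
import groovy.xml.XmlParser

// Hypothetical input: the second incident is invented for this example.
def d = new XmlParser().parseText('''<record_change>
    <incident number="201507-001" status="open"/>
    <incident number="201507-002" status="closed"/>
</record_change>''')

// One attribute Map per incident Node.
def incidents = d.incident.collect { it.attributes() }
assert incidents*.number == ['201507-001', '201507-002']

// Select a single Node by attribute value, then read from that Node.
def openIncident = d.incident.find { it.@status == 'open' }
assert openIncident.@number == '201507-001'
```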

Finally, if you are referencing elements with hyphens in them, you can use quotes around it to get a value, e.g. d.'some-element'[0].text().
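A short sketch of the quoting trick, using an invented document: the element name some-element comes from the sentence above, while the data-id attribute and its value are hypothetical. The same quoting works for hyphenated attribute names, and Node.attribute() is an alternative that needs no special syntax. The import assumes Groovy 3 or later.

```groovy
// Assumption: Groovy 3 or later. On Groovy 2.x, remove this import.
import groovy.xml.XmlParser

// Hypothetical XML with a hyphenated element name and attribute name.
def d = new XmlParser().parseText(
    '<root><some-element data-id="42">hello</some-element></root>')

// Quote the element name so the hyphen is not parsed as a minus sign.
def node = d.'some-element'[0]
assert node.text() == 'hello'

// Hyphenated attributes: quote the whole '@name', or call attribute().
assert node.'@data-id' == '42'
assert node.attribute('data-id') == '42'
```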

Upvotes: 4

Jayan

Reputation: 18459

  • Use "@attributename" to get attribute values.
  • Use the element name to navigate to an element.

Here is your example:

XmlParser parser = new XmlParser()
//def xmldata1 = parser.parse (new FileInputStream("c:\\temp\\create.xml"))

def xml = """<record_change>
<record_id>04707317-28e3-40cb-a227-15d2c91e08c2</record_id>
<incident number="201507-001" status="open"/>
</record_change>"""

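def xmldata1 = parser.parseText(xml)
def n_id     = xmldata1.record_id.text()   // text of all record_id nodes
def n_number = xmldata1.incident.@number   // list of values, one per incident
def m_status = xmldata1.incident.@status   // list of values, one per incident

assert n_id == '04707317-28e3-40cb-a227-15d2c91e08c2'

println(n_number)   // prints [201507-001]
println(m_status)   // prints [open]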

Upvotes: 2
