Reputation: 23
I have XML with this format:
<message>
<message_type_id>1</message_type_id>
<message_type_code>code1</message_type_code>
<version/>
<created_at>date1</created_at>
<payload>
<payment>
<document_id>id1</document_id>
<account_id>id2</account_id>
</payment>
</payload>
</message>
Branch inside payload is not defined. In one XML it can have one structure, in other XML - another.
As a result I want a dynamic array like this:
message_type_id: 1
message_type_code: code1
created_at: date1
document_id: id1
account_id: id2
Remember, that keys "document_id" and "account_id" can have another structure with different levels of embedding. In other words, I need to parse only leaves of each XML tree. And I don't know how these leaves are called, so constructions like
root.payload.payment.document_id
aren't useful.
I tried to solve this task with XmlSlurper, but didn't successed. How can I solve this task?
Upvotes: 0
Views: 422
Reputation: 171054
So given the Xml (here in a String)
def xmlText = '''<message>
<message_type_id>1</message_type_id>
<message_type_code>code1</message_type_code>
<version/>
<created_at>date1</created_at>
<payload>
<payment>
<document_id>id1</document_id>
<account_id>id2</account_id>
</payment>
</payload>
</message>'''
We can parse it with XmlParser
:
def message = new XmlParser().parseText(xmlText)
And then find all the nodes that are not empty, and only contain text (these are the leaf nodes). We can then use collectEntries
to make a map from them:
def map = message.'**'.findAll {
!it.children().empty && // Ignore empty leaves
it.children().every { it instanceof String } // Leaves only contain strings
}.collectEntries {
[it.name(), it.text()]
}
This assigns the following to the variable map
:
[
message_type_id: "1",
message_type_code: "code1",
created_at: "date1",
document_id: "id1",
account_id: "id2"
]
You will have issues if nodes have the same name however (as maps can only contain a single key)
Upvotes: 1