My question is about parsing XML where string values have HTML tags inside:
def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b> <big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''
(it's an Android resource file)
When I use an XmlSlurper, the HTML tags are removed. This code:
def resources = new XmlSlurper().parseText(xmlString)
resources.string.each { string ->
    println "string name = " + string.@name + ", string value = " + string.text()
}
will produce
string name = my_test, string value = No problem here!
string name = my_text, string value = My bold and big title
Rest of the text
I could use CDATA to prevent the HTML tags from being parsed, but then these HTML tags would not be processed when the string my_text is used.
I also tried a StreamingMarkupBuilder, as explained in this SO answer: How to extract HTML Code from a XML File using groovy. But then only the HTML tags and the text between them are displayed:
<b><big>My bold and big title</big></b>
and the first string is not displayed. Thanks in advance!
Upvotes: 2
Views: 1846
def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b><big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''
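def result = []
def resources = new XmlSlurper().parseText(xmlString).string
resources.each { resource ->
    // getBody() returns a closure that replays the element's children, nested tags included;
    // bind() returns a Writable, so call toString() on an entry (or print it) to get the markup
    result << new groovy.xml.StreamingMarkupBuilder().bind { mkp.yield resource.getBody() }
}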
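To get output in the same "name / value" shape as the question, the snippet above can be extended along these lines. This is only a sketch: `innerMarkup` and `values` are names made up for illustration, and the `groovy.xml.XmlSlurper` import assumes Groovy 3 or later (on Groovy 2.x, drop that import and rely on the default `groovy.util.XmlSlurper`).

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; remove this line on Groovy 2.x

def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b><big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''

// Hypothetical helper: render an element's children (text and nested tags) as a String
def innerMarkup = { node ->
    new StreamingMarkupBuilder().bind { mkp.yield node.getBody() }.toString().trim()
}

// Map each string's name attribute to its inner markup
def values = [:]
new XmlSlurper().parseText(xmlString).string.each { node ->
    values[node.@name.toString()] = innerMarkup(node)
}

values.each { name, value ->
    println "string name = ${name}, string value = ${value}"
}
```

Both strings should show up this way, with the `<b>` and `<big>` tags kept in `my_text`. Note that `mkp.yield` escapes plain text, so characters such as `&` in a string value come back as entities.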
Upvotes: 1