Reputation: 54
I have to consume a web service (WS) that sends its XML data inside a CDATA section. The output I get is the following:
&lt;parent>
&lt;child1>
&lt;xmltag1>4 años &lt; 8 &lt;/xmltag1>
&lt;xmltag2>3 años &lt; 12 &lt;/xmltag2>
&lt;/child1>
&lt;/parent>
I have to turn this data into usable XML so I can work with it.
It should look like:
<parent>
<child1>
<xmltag1>4 años &lt; 8 </xmltag1>
<xmltag2>3 años &lt; 12 </xmltag2>
</child1>
</parent>
I have tried various Java functions like this one, but I haven't got a decent output:
StringEscapeUtils.unescapeXml(string);
There may be a way to get that result with a regex. This is what I have so far, but regex is not my strength:
string.replaceAll("<{0}>", "</{0}>");
Upvotes: 2
Views: 630
Reputation: 627082
You can use
String fixedXml = text.replaceAll("&lt;(/?\\w+(?:\\s[^>]*)?>)", "<$1");
See the regex demo. Details:
&lt; - a &lt; string
(/?\w+(?:\s[^>]*)?>) - Group 1 ($1):
  /? - an optional / char
  \w+ - one or more word chars
  (?:\s[^>]*)? - an optional sequence of a whitespace char and then any zero or more chars other than >
  > - a > char.
Upvotes: 1
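The replacement from the answer can be tried end to end with a small sketch like the one below. It assumes the service escapes only the opening `<` of each tag as `&lt;` and leaves `>` literal, which is what the pattern expects; the class name `TagUnescaper`, the method `unescapeTags`, and the sample string are made up for illustration. It also parses the result with the JDK's XML parser to check that the output is well-formed.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class TagUnescaper {
    // "&lt;" followed by something shaped like a tag: optional "/", a name,
    // optional attributes, and a literal ">" (assumed to arrive unescaped).
    private static final Pattern ESCAPED_TAG =
            Pattern.compile("&lt;(/?\\w+(?:\\s[^>]*)?>)");

    // Turns "&lt;" back into "<" only where it opens a tag; a "&lt;" inside
    // text such as "4 &lt; 8" is left escaped, because "&lt;" followed by a
    // space cannot match the tag-name part of the pattern.
    static String unescapeTags(String text) {
        return ESCAPED_TAG.matcher(text).replaceAll("<$1");
    }

    public static void main(String[] args) throws Exception {
        // \u00f1 is the letter n with tilde, written as an escape so the source stays ASCII-only.
        String input = "&lt;parent>&lt;child1>"
                + "&lt;xmltag1>4 a\u00f1os &lt; 8 &lt;/xmltag1>"
                + "&lt;/child1>&lt;/parent>";

        String fixedXml = unescapeTags(input);
        System.out.println(fixedXml);

        // Parsing succeeds only if the result is well-formed XML.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(fixedXml)));
        System.out.println(doc.getElementsByTagName("xmltag1").item(0).getTextContent());
    }
}
```

This is a heuristic rather than a real parser: text content that happens to look like a tag (for example `&lt;b c>`) would also be unescaped, and if the service escapes `>` as `&gt;` as well, the pattern would need to be adjusted.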