I have an XML document like the one below:
<content>
<p><b>Node:</b> Some information</p>
</content>
When deserializing this XML, I want to get the content inside the p tag as a string.
For example, if I have Java classes like the ones below:
@Data
class Content {
    TextInParagraph p;
}
@Data
class TextInParagraph {
    String text;
}
The value of text should be "<b>Node:</b> Some information".
Is there a way to do this using JAXB or the Jackson XML parser?
I tried deserializing the XML above with Jackson, but I get the following exception:
Expected END_ELEMENT, got event of type 1
java.io.IOException: Expected END_ELEMENT, got event of type 1
Reputation: 580
Sadly, this is not possible with jackson-dataformat-xml.
With JAXB, however, you can solve this by using a DomHandler:
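import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "content")
@XmlAccessorType(XmlAccessType.FIELD)
public class Content {
    @XmlAnyElement(InnerXmlHandler.class)
    private String p;
}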
DomHandler
import javax.xml.bind.ValidationEventHandler;
import javax.xml.bind.annotation.DomHandler;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class InnerXmlHandler implements DomHandler<String, StreamResult> {

    private static final String START_TAG = "<p>";
    private static final String END_TAG = "</p>";

    // Called while unmarshalling: JAXB writes the serialized <p> element into this result.
    public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
        // A fresh writer per call, so output does not accumulate if the handler instance is reused.
        return new StreamResult(new StringWriter());
    }

    // Strips the outer <p>...</p> and returns everything in between as a string.
    public String getElement(StreamResult rt) {
        String xml = rt.getWriter().toString();
        int beginIndex = xml.indexOf(START_TAG) + START_TAG.length();
        int endIndex = xml.lastIndexOf(END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    // Called while marshalling: wraps the string back into a <p> element.
    public Source marshal(String n, ValidationEventHandler errorHandler) {
        String xml = START_TAG + n.trim() + END_TAG;
        return new StreamSource(new StringReader(xml));
    }
}
This works with the sample you provided, and it even works with nested <p> tags such as:
<content>
<p> This is some <ul><li>list</li></ul> and <p>nested paragraph</p></p>
</content>
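To see why nesting is not a problem, here is a minimal sketch of just the extraction step from getElement, pulled out so it runs without JAXB. The class name InnerXmlSketch and the method innerXml are made up for illustration, and the input string (including the XML declaration) is only an assumption about what the StreamResult might contain. Because the start is found with indexOf and the end with lastIndexOf, only the outermost <p> pair is stripped:

```java
public class InnerXmlSketch {

    private static final String START_TAG = "<p>";
    private static final String END_TAG = "</p>";

    // Returns everything between the first "<p>" and the last "</p>".
    public static String innerXml(String serialized) {
        int beginIndex = serialized.indexOf(START_TAG) + START_TAG.length();
        int endIndex = serialized.lastIndexOf(END_TAG);
        return serialized.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        // Assumed shape of the serialized element; the declaration is skipped by indexOf.
        String serialized = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<p> This is some <ul><li>list</li></ul> and <p>nested paragraph</p></p>";
        // The inner <p>nested paragraph</p> stays intact in the result.
        System.out.println(innerXml(serialized));
    }
}
```

Note that this simple matching relies on the literal tag <p>; a <p> element carrying attributes would not be found by START_TAG.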
However, this only works when the inner HTML/XML is well-formed. The following will not work and throws an exception like The element type "ul" must be terminated by the matching end-tag "</ul>".
<content>
<p> This is some <ul>invalid xml </p>
</content>
This is because JAXB's internals still traverse all inner elements, even though a DomHandler is provided.
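If you want to detect such input before handing it to JAXB, one option is a well-formedness check up front. The sketch below uses the JDK's own DOM parser rather than JAXB, on the assumption that both reject the same malformed input, since any conforming XML parser has to. WellFormedCheck and isWellFormed are hypothetical names:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class WellFormedCheck {

    // Returns true if the JDK's DOM parser accepts the input as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Parse errors such as an unclosed <ul> end up here.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not well-formedness issues.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(
                "<content><p> This is some <ul><li>list</li></ul></p></content>")); // true
        System.out.println(isWellFormed(
                "<content><p> This is some <ul>invalid xml </p></content>"));       // false
    }
}
```

The default parser may also print the error to stderr; that does not affect the return value.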