Reputation: 415
I have a Java String containing XML. I want to read through this String and wrap all the text nodes in CDATA, but I'm not sure how to do this. The reason is that there is a text node containing an angle bracket, which causes an exception when I try to parse the String. Can anyone help me out?
<node> this < is text </node> <node2> this is < text </node2>
I would like to know if there is an easy way of reading this text as a String with XMLReader and inserting CDATA around the text.
thanks
Stefan
Upvotes: 1
Views: 15190
Reputation: 1
Try this, it worked for me.
http://www.java2s.com/Code/Java/XML/AddingaCDATASectiontoaDOMDocument.htm
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Main {
    public static void main(String[] argv) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        factory.setExpandEntityReferences(false);
        Document doc = factory.newDocumentBuilder().parse(new File("filename"));
        Element element = doc.getElementById("key1");
        // Add a CDATA section to the root element
        element = doc.getDocumentElement();
        CDATASection cdata = doc.createCDATASection("data");
        element.appendChild(cdata);
    }
}
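To adapt the linked example to a String instead of a file, something like the following might work. It is only a sketch: the class name TextToCdata and its methods are made up for illustration, and it assumes the String already parses, so a raw `<` inside text would still have to be escaped or wrapped before this step.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class TextToCdata {

    // Parse an XML String (it must already be well-formed) into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Recursively replace every plain text node under 'node' with a CDATA section.
    public static void convert(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before replacing
            if (child.getNodeType() == Node.TEXT_NODE) {
                Document doc = node.getOwnerDocument();
                node.replaceChild(doc.createCDATASection(child.getNodeValue()), child);
            } else {
                convert(child);
            }
            child = next;
        }
    }

    // Serialize the document back to a String with an identity transform.
    public static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><node>a &lt; b</node></root>");
        convert(doc.getDocumentElement());
        System.out.println(serialize(doc));
    }
}
```

Note that whitespace-only text nodes are converted too; skip them in convert() if that matters for your output.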
Upvotes: 0
Reputation: 24316
Perhaps something like this (apologies in advance for any inefficiency):
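// Assuming currentNode is an org.w3c.dom.Node; adapt to whatever node type your parser uses
if (currentNode.getNodeType() == Node.TEXT_NODE) {
    // getTextContent() retrieves the text of the node
    String toWrite = String.format("<![CDATA[%s]]>", currentNode.getTextContent());
}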
It looks like you need to massage the data into valid XML. How you do that depends heavily on your input. Essentially, you receive a big string that you need to convert into valid XML. One advantage is that you can define a schema for the third party to adhere to; agreeing on it is a conversation to have with them and outside the scope of this answer, but it is worth mentioning. Once the schema is defined, you will know which nodes are "text" nodes and need to be wrapped in CDATA blocks.
The basic idea is this:
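// needs java.util.ArrayList and java.util.List
List<String> textTags = new ArrayList<String>();
textTags.add("NODE");
// other things to add
String bigAwfulString = inputFromThirdParty();
// StringBuilder avoids re-copying the string on every loop iteration
StringBuilder validXML = new StringBuilder();
for (String currentNode : bigAwfulString.split("yourRegexHere")) {
    if (textTags.contains(currentNode)) {
        // currentNode is already a String, so wrap it directly
        validXML.append(String.format("<![CDATA[%s]]>", currentNode));
        continue;
    }
    validXML.append(currentNode);
}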
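For what it's worth, here is one way the idea above could be fleshed out into something runnable. It is a rough sketch, not a general solution: the class CdataWrapper and the TAG pattern are my own assumptions, and the pattern is only a heuristic for what counts as a tag (a `<` followed by a name), so text that happens to look like a tag will be misread. Instead of a tag whitelist it simply wraps everything found between tags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps all non-whitespace text between tags in CDATA.
public class CdataWrapper {

    // Heuristic tag pattern (an assumption): '<', optional '/', a name,
    // optional attributes, optional '/', then '>'.
    private static final Pattern TAG =
            Pattern.compile("</?[A-Za-z_][\\w:.-]*(?:\\s[^<>]*)?/?>");

    public static String wrapTextInCdata(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            appendText(out, input.substring(last, m.start())); // text before this tag
            out.append(m.group());                             // the tag itself, unchanged
            last = m.end();
        }
        appendText(out, input.substring(last)); // anything after the final tag
        return out.toString();
    }

    private static void appendText(StringBuilder out, String text) {
        if (text.trim().isEmpty()) {
            out.append(text); // keep whitespace between tags as it is
            return;
        }
        // "]]>" cannot appear inside a CDATA section, so split it across two sections
        out.append("<![CDATA[")
           .append(text.replace("]]>", "]]]]><![CDATA[>"))
           .append("]]>");
    }

    public static void main(String[] args) {
        String raw = "<node> this < is text </node><node2> this is < text </node2>";
        System.out.println(wrapTextInCdata(raw));
        // prints: <node><![CDATA[ this < is text ]]></node><node2><![CDATA[ this is < text ]]></node2>
    }
}
```

The result still needs a single root element before a parser will accept it, and if the third party can be persuaded to escape `<` as `&lt;` at the source, none of this is necessary.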
Upvotes: 2