aspiringCoder

Reputation: 415

Inserting CData XML parsing Java String

I have a Java String containing XML. I want to read through this String and wrap all the text nodes in CDATA, but I'm not sure how to do this. The reason is that there is a text node containing an angle bracket, which causes an exception when I try to parse the String. Can anyone help me out?

<node> this < is text </node> <node2> this is < text </node2>

I would like to know if there is an easy way of reading this text as a String with XMLReader and inserting CDATA around the text.

thanks

Stefan

Upvotes: 1

Views: 15190

Answers (2)

andorker

Reputation: 1

Try this, it worked for me.
http://www.java2s.com/Code/Java/XML/AddingaCDATASectiontoaDOMDocument.htm

import java.io.File;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Main {
  public static void main(String[] argv) throws Exception {

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Validation only makes sense if the document declares a DTD
    factory.setValidating(true);

    factory.setExpandEntityReferences(false);

    // "filename" is a placeholder for the path to your XML file
    Document doc = factory.newDocumentBuilder().parse(new File("filename"));

    // Add a CDATA section to the root element
    Element element = doc.getDocumentElement();
    CDATASection cdata = doc.createCDATASection("data");
    element.appendChild(cdata);

  }
}
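
Note that the code above only changes the in-memory DOM and assumes the file already parses. A minimal sketch of getting the CDATA section back out as a String, assuming you build the document yourself rather than parse the broken input (the class name CdataDomDemo and the method build are made up for illustration):

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the linked example
public class CdataDomDemo {

  // Builds <node> with the given text in a CDATA section and serializes it
  public static String build(String text) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    Element node = doc.createElement("node");
    doc.appendChild(node);
    node.appendChild(doc.createCDATASection(text));

    // An identity transform writes the DOM out as text
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter out = new StringWriter();
    transformer.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] argv) throws Exception {
    System.out.println(build(" this < is text "));
  }
}
```

This does not help with a String that fails to parse in the first place; for that the text has to be wrapped before it reaches the parser.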

Upvotes: 0

Woot4Moo

Reputation: 24316

Perhaps something like this (apologies in advance for any inefficiency):

// currentNode is assumed to be an org.w3c.dom.Node from whatever tree you are walking
if (currentNode.getNodeType() == Node.TEXT_NODE)
{
     String toWrite = String.format("<![CDATA[%s]]>", currentNode.getNodeValue());
     // or whatever retrieves text of the node
}

It looks like you need to massage the data into valid XML. How you do that depends heavily on your input. Essentially, you receive one big string that you need to convert into valid XML. One advantage is that you can define a schema for the third party to adhere to; agreeing on it means a meeting with them, which is outside the scope of this answer, but it is worth mentioning. Once the schema is defined, you will know which nodes are "text" nodes and need to be wrapped in CDATA blocks.

The basic idea is this:

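List<String> textTags = new ArrayList<String>();
textTags.add("NODE");
// other tag names to add
String bigAwfulString = inputFromThirdParty();   // placeholder for however you receive the input
StringBuilder validXML = new StringBuilder();
for (String currentNode : bigAwfulString.split("yourRegexHere"))
{
    // currentNode is a plain String here, so it is used directly;
    // what it holds depends entirely on the regex you split on
    if (textTags.contains(currentNode))
    {
        validXML.append(String.format("<![CDATA[%s]]>", currentNode));
        continue;
    }
    validXML.append(currentNode);
}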
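
A more complete sketch of the same idea, under some loud assumptions: the text tags come in proper <tag>...</tag> pairs, carry no attributes, and never nest inside themselves. The names CdataWrapper and wrapTextInCdata are made up for illustration, and a regex is standing in for real parsing here, so treat it as a starting point rather than a robust fix.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: wraps the content of known "text" tags in CDATA
public class CdataWrapper {

    public static String wrapTextInCdata(String input, List<String> textTags) {
        String result = input;
        for (String tag : textTags) {
            String name = Pattern.quote(tag);
            // Reluctant match of everything between <tag> and the next </tag>
            Pattern pattern = Pattern.compile(
                "<" + name + ">(.*?)</" + name + ">", Pattern.DOTALL);
            Matcher matcher = pattern.matcher(result);
            StringBuffer sb = new StringBuffer();
            while (matcher.find()) {
                // A literal "]]>" would close the CDATA section early, so split it
                String body = matcher.group(1).replace("]]>", "]]]]><![CDATA[>");
                String wrapped = "<" + tag + "><![CDATA[" + body + "]]></" + tag + ">";
                matcher.appendReplacement(sb, Matcher.quoteReplacement(wrapped));
            }
            matcher.appendTail(sb);
            result = sb.toString();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String raw = "<root><node> this < is text </node>"
                   + "<node2> this is < text </node2></root>";
        String valid = wrapTextInCdata(raw, Arrays.asList("node", "node2"));
        System.out.println(valid);

        // The wrapped String should now go through a standard parser
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(valid)));
        System.out.println(doc.getDocumentElement().getFirstChild().getTextContent());
    }
}
```

If the input can contain attributes or nested text tags, the pattern would need to change, which is another reason to agree on a schema with whoever produces the data.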

Upvotes: 2
