PaulG

Reputation: 7142

Blackberry: How to get raw content of Node when parsing XML

I am trying to pull an html string out of some XML returned through a SOAP Web Service call. My Node object is of the following class:

org.w3c.dom.Node

Here is a code sample of the loop I use to go through nodes:

// elements is assumed to be a NodeList (e.g. the result of getElementsByTagName)
for(int t = 0; t < elements.getLength(); t++)
{
         Element myElement = (Element)elements.item(t);

         NodeList childNodes = myElement.getChildNodes();
         int numChildren = childNodes.getLength();

         for(int counter = 0; counter < numChildren; counter++)
         {
             Node currentNode = childNodes.item(counter);
             NodeList currentNodeChildNodes = currentNode.getChildNodes();

             // item(0) returns null when currentNode has no children
             int numCurrentNodeChildren = currentNodeChildNodes.getLength();
             Node firstChild = currentNodeChildNodes.item( 0 );
         }
}

Now, some of these Nodes contain raw html, which of course makes it look like they have children. I would like to take these html Nodes and get all of their contents straight into a String. I tried currentNode.getTextContent(), and it just throws a java.lang.NullPointerException.

Is there a method I can use to just take the node and get its raw content as a String, regardless of whether or not it contains child nodes?

EDIT: Here's an example of the XML with html content:

<?xml version="1.0" encoding="utf-16"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetInfoResponse xmlns="http://www.mycompany.com/">
      <GetInfoResult>
        <infoList>
          <Info>
            <iso>US</iso>
            <country_name>United States</country_name>
            <title>This is the title</title>
            <html_string><strong>NEWS</strong><h1>This is a section header</h1><p>Here is some information</p></html_string>
            <last_update_date>2013-01-01 00:00:00</last_update_date>
          </Info>
        </infoList>
        <faultResponse>
          <faultOccurred>boolean</faultOccurred>
          <faultDescription>string</faultDescription>
        </faultResponse>
      </GetInfoResult>
    </GetInfoResponse>
  </soap:Body>
</soap:Envelope>

Upvotes: 0

Views: 319

Answers (1)

jtahlborn

Reputation: 53694

It's generally a bad idea to mix HTML and XML content. While HTML can be formatted like XML (XHTML), it quite often is not. By mixing the two, you run the risk of XML parsing failures in the future when your HTML does not happen to be valid XML. Instead, you should encode your HTML content as a valid XML element value, either by escaping the markup (&lt;, &gt;, &amp;) or by wrapping it in a CDATA section. If you do this, you can get the data in Java using a Node.getTextContent() call on the html_string element.
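To make this concrete, here is a minimal sketch, assuming the service is changed to wrap the HTML in a CDATA section. The class name HtmlNodeText and the helpers textOf and parse are hypothetical names for illustration, and the parser setup uses the standard javax.xml.parsers API, which may differ from what your BlackBerry environment provides. Since the question reports that getTextContent() throws a NullPointerException there, the sketch avoids it and concatenates the text and CDATA children by hand using only getChildNodes(), getNodeType() and getNodeValue().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any library.
public class HtmlNodeText {

    // Returns the concatenated text and CDATA content found under the node,
    // without relying on Node.getTextContent().
    public static String textOf(Node node) {
        StringBuffer sb = new StringBuffer();
        appendText(node, sb);
        return sb.toString();
    }

    private static void appendText(Node node, StringBuffer sb) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                // Escaped text and CDATA both arrive here already decoded.
                sb.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                // Descend into nested elements and collect their text too.
                appendText(child, sb);
            }
        }
    }

    // Parses an XML string into a DOM Document (assumes a JAXP parser is available).
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The HTML is wrapped in CDATA, so the XML parser never sees it as markup.
        String xml = "<Info><html_string><![CDATA[<strong>NEWS</strong>"
                + "<p>Here is some information</p>]]></html_string></Info>";
        Document doc = parse(xml);
        Node htmlNode = doc.getElementsByTagName("html_string").item(0);
        System.out.println(textOf(htmlNode));
    }
}
```

Run as is, this should print the HTML fragment exactly as it was placed inside the CDATA section. The same helper also works if the service escapes the markup as entities instead, because the parser decodes them before they reach the text nodes.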

Upvotes: 2
