Vinay

Reputation: 6879

Null value returned with XML parsing in Java using plain DOM parser

I am trying to parse a COLLADA (.dae) file in Java using the plain DOM parser. When I try to get an element's value, it returns null. I tried the answers and suggestions from other discussions, without success. The code I used is below.

for(int k1=0;k1<meshlist.getLength();k1++) {
    Element geometryItr1 = (Element)geometrylist.item(k);

    NodeList trianglelist = geometryItr1.getElementsByTagName("triangles");

    //System.out.println("Triangles length is " + trianglelist.getLength());

    for(int o=0;o<trianglelist.getLength();o++) {

        Element trichildnodes = (Element) trianglelist.item(o);
        NodeList inputs = trichildnodes.getElementsByTagName("input");
        NodeList p = trichildnodes.getElementsByTagName("p");
        Element ppp = (Element) p.item(0);
        System.out.println("Node Value " + ppp.getNodeValue());
        System.out.println(inputs.getLength() + "Input length");

        for(int in=0;in<inputs.getLength();in++) {

            Element inn = (Element) inputs.item(in);
            System.out.println(inn.getAttribute("semantic") + " " + inn.getAttribute("source") + " Attributes");

        }

        //System.out.println(p.getLength() +  " P's length" );
        //System.out.println("P's content " + ppp.getFirstChild().getNodeValue());

    }
}

The XML is very large, so I am posting only the part that I am trying to parse.

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
</mesh>

I am trying to get the value of <p>. Everything works except getting p's value. When I debug, I can see the value, but it is associated with the first child. I even tried using firstChild. I am completely lost trying to find a solution. Could someone please help me work out how to get the value of p?

When I use getTextContent instead, I get output like this:

NodeValue null
NodeValue 24 262 2 72 72 72 72 2222 8198219
NodeValue null

The output is blank for two tags.

Upvotes: 0

Views: 4364

Answers (3)

bdoughan

Reputation: 149047

I would recommend using the javax.xml.xpath APIs available in the JDK/JRE since Java SE 5 to make the processing of your XML document easier:

package forum11688757;

import java.io.File;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("src/forum11688757/input.xml"));

        XPathFactory xpf = XPathFactory.newInstance();
        XPath xpath = xpf.newXPath();
        NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
        for(int x=0; x<nodeList.getLength(); x++) {
            System.out.println(nodeList.item(x).getTextContent());
        }
    }

}

input.xml

<mesh>
  <source> </source>
  <source> </source>
  <source> </source>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
  <triangles>
    <input />
    <input />
    <input />
    <p> 24 262 2 72 72 72 72 2222 8198219  </p>
  </triangles>
</mesh>

Output

 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  
 24 262 2 72 72 72 72 2222 8198219  

UPDATE

You could also get the p elements using the following line of code. Be careful, though, since it gets all p elements in the document, not just those at the /mesh/triangles/p path:

NodeList nodeList = document.getElementsByTagName("p");

The following approach will always get you the data you are looking for, even if p elements are later added somewhere else in the document:

NodeList nodeList = (NodeList) xpath.evaluate("/mesh/triangles/p", document, XPathConstants.NODESET);
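To make the difference concrete, here is a minimal sketch that counts matches both ways. The class name TagNameVersusXPath, the helper methods, and the sample XML (including the stray <extra> element) are made up for illustration and are not part of the original file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TagNameVersusXPath {

    // Hypothetical sample: one <p> at /mesh/triangles/p plus one stray <p> under <extra>.
    public static final String SAMPLE =
        "<mesh><extra><p>not mesh data</p></extra>"
        + "<triangles><input/><p> 24 262 2 </p></triangles></mesh>";

    // Parses an XML string into a DOM Document; checked exceptions are
    // wrapped so the demo methods stay easy to call.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Counts every <p> in the document, wherever it sits.
    public static int countByTagName(String xml) {
        return parse(xml).getElementsByTagName("p").getLength();
    }

    // Counts only the <p> elements at the exact path /mesh/triangles/p.
    public static int countByXPath(String xml) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/mesh/triangles/p", parse(xml), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getElementsByTagName: " + countByTagName(SAMPLE));
        System.out.println("XPath:                " + countByXPath(SAMPLE));
    }
}
```

With this sample, the tag-name lookup also picks up the stray p, while the XPath expression only matches the one under triangles.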

Upvotes: 3

alain.janinm

Reputation: 20065

You don't have to iterate over the enclosing nodes if you don't need them. For example, this is how to print the text content of all <p> tags:

    File xmlPath = new File("test.xml");

    DocumentBuilderFactory fabrique = DocumentBuilderFactory.newInstance();
    fabrique.setCoalescing(true);
    fabrique.setIgnoringElementContentWhitespace(true);

    DocumentBuilder constructeur = fabrique.newDocumentBuilder();

    Document document = constructeur.parse(xmlPath);  
    document.setXmlVersion("1.0");
    Element racine = document.getDocumentElement();
    NodeList liste = racine.getElementsByTagName("p");

    for(int i=0; i<liste.getLength(); i++) {
        Element e = (Element)liste.item(i);  
        System.out.println(e.getFirstChild().getTextContent());
    }

You can use that as a starting point and build on it to get what you want. If you want an attribute value, just use e.getAttribute("att_name").
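As a rough illustration of the attribute case, the sketch below reads the semantic and source attributes that the question's code asks for. The class name InputAttributes and the attribute values in the sample are invented, since the posted XML shows the input elements without attributes. Note that getAttribute returns an empty string, not null, when the attribute is absent.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InputAttributes {

    // Hypothetical fragment; the attribute values are invented for the demo.
    public static final String SAMPLE =
        "<mesh><triangles>"
        + "<input semantic=\"VERTEX\" source=\"#verts\"/>"
        + "<input semantic=\"NORMAL\"/>"
        + "<p> 24 262 2 </p>"
        + "</triangles></mesh>";

    // Returns one "semantic|source" string per <input> element.
    public static List<String> describeInputs(String xml) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        NodeList inputs = document.getDocumentElement().getElementsByTagName("input");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < inputs.getLength(); i++) {
            Element e = (Element) inputs.item(i);
            // A missing attribute comes back as "" rather than null.
            result.add(e.getAttribute("semantic") + "|" + e.getAttribute("source"));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describeInputs(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```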

Upvotes: 1

parsifal

Reputation: 11

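getNodeValue() on an Element is documented as returning null. The text lives in the element's child Text node, which is why the debugger shows the value under the first child.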

Instead, you probably want to call getTextContent(). But beware that it has its own idiosyncrasies (if you call it on the root of a tree, it will concatenate the text of all elements in the tree).
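A small sketch of these behaviours, assuming a cut-down version of the XML from the question (the class name NodeValueDemo, the helper method, and the shortened sample data are illustrative only):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NodeValueDemo {

    // Hypothetical, shortened sample with two <triangles> blocks.
    public static final String SAMPLE =
        "<mesh><triangles><input/><p> 24 262 2 </p></triangles>"
        + "<triangles><input/><p> 72 72 </p></triangles></mesh>";

    // Returns four values for the first <p> in the document:
    // [0] getNodeValue() on the element itself
    // [1] getNodeValue() on its first child (the Text node)
    // [2] getTextContent() on the element
    // [3] getTextContent() on the document root
    public static String[] inspectFirstP(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        Element p = (Element) doc.getElementsByTagName("p").item(0);
        return new String[] {
            p.getNodeValue(),                          // null for any Element
            p.getFirstChild().getNodeValue(),          // text held by the child Text node
            p.getTextContent(),                        // text of this element only
            doc.getDocumentElement().getTextContent()  // text of every descendant, joined
        };
    }

    public static void main(String[] args) {
        String[] values = inspectFirstP(SAMPLE);
        System.out.println("element getNodeValue():     " + values[0]);
        System.out.println("first child getNodeValue(): " + values[1]);
        System.out.println("element getTextContent():   " + values[2]);
        System.out.println("root getTextContent():      " + values[3]);
    }
}
```

Calling getTextContent() on each p element separately avoids the concatenation that happens when it is called on the root.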

Upvotes: 1
