Florian Müller

Reputation: 7785

What's wrong with this Java XML-Parsing code?

I'm trying to parse an XML file so that I can pass in a path and get back the value of the element at that path.

The code looks like this:

import java.io.IOException;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.SAXException;

public class XMLConfigManager {
    private Element config = null;

    public XMLConfigManager(String file) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            Document domTree;
            DocumentBuilder db = dbf.newDocumentBuilder();
            domTree = db.parse(file);
            config = domTree.getDocumentElement();
        }
        catch (IllegalArgumentException iae) {
            iae.printStackTrace();
        }
        catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        }
        catch (SAXException se) {
            se.printStackTrace();
        }
        catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
    public String getStringValue(String path) {
        String[] pathArray = path.split("\\|");
        Element tempElement = config;
        NodeList tempNodeList = null;
        for (int i = 0; i < pathArray.length; i++) {
            if (i == 0) {
                if (tempElement.getNodeName().equals(pathArray[0])) {
                    System.out.println("First element is correct, do nothing here (just in next step)");
                }
                else {
                    return "**This node does not exist**";
                }
            }
            else {
                tempNodeList = tempElement.getChildNodes();
                tempElement = getChildElement(pathArray[i],tempNodeList);
            }
        }    
        return tempElement.getNodeValue();
    }
    private Element getChildElement(String identifier, NodeList nl) {
        String tempNodeName = null;
        for (int i = 0; i < nl.getLength(); i++) {
            tempNodeName = nl.item(i).getNodeName();
            if (tempNodeName.equals(identifier)) {
                Element returner = (Element)nl.item(i).getChildNodes();
                return returner;
            }
        }
        return null;
    }
}

The XML looks like this (for test purposes):

<?xml version="1.0" encoding="UTF-8"?>
<amc>
    <controller>
        <someOtherTest>bla</someOtherTest>
        <general>
            <spam>This is test return String</spam>
            <interval>1000</interval>
        </general>
    </controller>
    <agent>
        <name>test</name>
        <ifc>ifcTest</ifc>
    </agent>
</amc>

Now I can call the class like this:

XMLConfigManager xmlcm = new XMLConfigManager("myConfig.xml");
System.out.println(xmlcm.getStringValue("amc|controller|general|spam"));

Here, I'm expecting the value of the tag spam, so this would be "This is test return String". But I'm getting null.

I've tried to fix this for days now and I just can't get it. The iteration works so it gets to the tag spam, but then, just as I said, it returns null instead of the text.

Is this a bug, or am I just doing something wrong? Why? :(

Thank you very much for help!

Regards, Flo

Upvotes: 1

Views: 288

Answers (4)

Eli Acherkan

Reputation: 6411

As others mentioned before me, you seem to be reinventing the concept of XPath. You can replace your code with the following:

javax.xml.xpath.XPath xpath = javax.xml.xpath.XPathFactory.newInstance().newXPath();
String expression = "/amc/controller/general/spam";
org.xml.sax.InputSource inputSource = new org.xml.sax.InputSource("myConfig.xml");
String result = xpath.evaluate(expression, inputSource);

See also: XML Validation and XPath Evaluation in J2SE 5.0

EDIT:

An example of extracting a collection with XPath:

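// Request a node set instead of the default string result.
org.w3c.dom.NodeList result = (org.w3c.dom.NodeList) xpath.evaluate(expression, inputSource, javax.xml.xpath.XPathConstants.NODESET);
for (int i = 0; i < result.getLength(); i++) {
    System.out.println(result.item(i).getTextContent());
}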

The javax.xml.xpath.XPath interface is documented here, and there are a few more examples in the aforementioned article.

In addition, there are third-party libraries for XML manipulation, which you may find more convenient, such as dom4j (suggested by duffymo) or JDOM. Regardless of which library you use, you can leverage the quite powerful XPath language.
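
For reference, here is a minimal self-contained sketch of the two XPath calls above. The class name XPathLookup, its helper methods, and the inlined XML constant are assumptions made for illustration (the question reads from myConfig.xml instead). Only the standard javax.xml.xpath API is used.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
public class XPathLookup {
    // Sample document from the question, inlined so the sketch runs without a file.
    public static final String XML =
        "<amc><controller><general>"
        + "<spam>This is test return String</spam><interval>1000</interval>"
        + "</general></controller>"
        + "<agent><name>test</name><ifc>ifcTest</ifc></agent></amc>";

    // Returns the string value of the first matching node, or "" if nothing matches.
    public static String value(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    // Returns how many nodes the expression selects.
    public static int count(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(value(XML, "/amc/controller/general/spam"));
        System.out.println(count(XML, "/amc/agent/*"));
    }
}
```

Note that the string form of evaluate gives back an empty string, not null, when the path selects nothing, so a missing element and an empty element look the same unless you ask for a node set.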

Upvotes: 3

Dave Newton

Reputation: 160261

Because you're using getNodeValue() rather than getTextContent().

Doing this by hand is an accident waiting to happen; either use the built-in XPath solutions, or a third-party library as suggested by @duffymo. This is not a situation where re-invention adds value, IMO.
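
If you want to see the difference between the two calls for yourself, a small sketch like the following should do. The class name NodeValueDemo and the inlined one-element document are assumptions for illustration. The DOM calls themselves are the standard org.w3c.dom ones.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the original answer.
public class NodeValueDemo {
    // Parses a tiny in-memory document and returns its <spam> element.
    public static Element spam() {
        String xml = "<general><spam>This is test return String</spam></general>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return (Element) doc.getElementsByTagName("spam").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        Element spam = spam();
        // Element nodes have no node value of their own.
        System.out.println(spam.getNodeValue());
        // The text lives in a child text node.
        System.out.println(spam.getFirstChild().getNodeValue());
        // getTextContent() concatenates the text of all descendants.
        System.out.println(spam.getTextContent());
    }
}
```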

Upvotes: 2

Jon Skeet

Reputation: 1502206

You're calling Node.getNodeValue() - which is documented to return null when you call it on an element. You should call getTextContent() instead - or use a higher level API, of course.
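
If you do want to keep the hand-rolled lookup, here is a rough sketch of how the pipe-separated walk could look with getTextContent(). It is not a drop-in replacement for the class in the question: the class name PipePathLookup and its methods are made up for illustration, it parses an in-memory string rather than a file, and it returns null instead of the "**This node does not exist**" marker when a step is missing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical rewrite of the question's lookup, for illustration only.
public class PipePathLookup {
    // Returns the first direct child element with the given tag name, or null.
    public static Element childElement(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            // Skip whitespace text nodes and comments; only elements can match.
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        return null;
    }

    // Walks a '|'-separated path from the root; returns null if any step is missing.
    public static String getStringValue(Element root, String path) {
        String[] parts = path.split("\\|");
        if (!root.getNodeName().equals(parts[0])) {
            return null;
        }
        Element current = root;
        for (int i = 1; i < parts.length && current != null; i++) {
            current = childElement(current, parts[i]);
        }
        return current == null ? null : current.getTextContent();
    }

    // Parses an XML string and returns its document element.
    public static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parse("<amc><controller><general>"
            + "<spam>This is test return String</spam>"
            + "</general></controller></amc>");
        System.out.println(getStringValue(root, "amc|controller|general|spam"));
    }
}
```

Returning the matched element itself (rather than the result of getChildNodes() cast to Element) and checking for a missing step before the final call also avoids a NullPointerException on paths that do not exist.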

Upvotes: 4

duffymo

Reputation: 308938

I'd wonder why you're not using a library like dom4j and built-in XPath. You're doing a lot of work with a very low-level API (W3C DOM).

Step through with a debugger and see what children that <spam> node has. You should quickly figure out why it's null. It'll be faster than asking here.

Upvotes: 1
