dildik

Reputation: 405

Java DOM parser returns null document

I have an HTML template which I want to read in:

<html>
   <head>
      <title>TEST</title>
   </head>
   <body>
      <h1 id="hey">Hello, World!</h1>
   </body>
</html>

I want to find the tag with the id hey and then insert new content (e.g. new tags). For this purpose I use the DOM parser, but my code prints null:

public static void main(String[] args) {

    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(file);
        doc.getDocumentElement().normalize();

        System.out.println(doc.getElementById("hey")); // returns null

    } catch (Exception e) {
        e.printStackTrace();
    }

}

What am I doing wrong?

Upvotes: 2

Views: 1712

Answers (2)

Vladislav Kysliy

Reputation: 3736

I modified your example to use jsoup:

import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void main(String[] args) {
    try {
        File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
        // jsoup parses the file as HTML, where id is always an identifier
        Document doc = Jsoup.parse(file, "UTF-8");
        Element hey = doc.getElementById("hey");
        System.out.println("hey = " + hey.ownText()); // text directly inside the element
        System.out.println("hey = " + hey);           // the element's outer HTML
    } catch (Exception e) {
        e.printStackTrace();
    }
}

Upvotes: 2

Raffaele

Reputation: 20885

You are parsing your HTML template as XML with the Java XML API, which follows the XML specification strictly and offers few conveniences for the casual developer.

In XML, an attribute named id is not automatically of type ID, so getElementById() does not find it. You can either use another library (jsoup, for example), instruct the parser to treat id as an ID (via a DTD), or write custom code.
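
As a rough sketch of the "custom code" route, here are two ways to do it with only the JDK's DOM and XPath APIs. The class name IdLookupSketch and the helpers parse, findByXPath and markIdAttributes are made up for this illustration, and the template is inlined as a string instead of being read from your file. It also assumes the template is well-formed XML, as yours is.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of any library.
class IdLookupSketch {

    // Parse a well-formed XML/XHTML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Option 1: ignore ID typing and match on the attribute value with XPath.
    // Assumption: the id contains no single quote (it is spliced into the expression).
    static Element findByXPath(Document doc, String id) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Element) xpath.evaluate("//*[@id='" + id + "']", doc, XPathConstants.NODE);
    }

    // Option 2: declare every "id" attribute to be of type ID after parsing,
    // so that the standard getElementById() can see it.
    static void markIdAttributes(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if (element.hasAttribute("id")) {
                element.setIdAttribute("id", true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><head><title>TEST</title></head>"
                + "<body><h1 id=\"hey\">Hello, World!</h1></body></html>";
        Document doc = parse(xml);

        // Option 1: XPath lookup
        System.out.println(findByXPath(doc, "hey").getTextContent());

        // Option 2: mark the attributes, then use the call from the question
        markIdAttributes(doc);
        System.out.println(doc.getElementById("hey").getTextContent());
    }
}
```

The XPath variant is the smaller change if you only need one lookup. Marking the attributes costs one pass over the tree but lets the rest of your code keep calling getElementById(). Neither variant copes with HTML that is not well-formed XML; for that, a lenient parser such as jsoup is the safer choice.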

Upvotes: 4
