Mike

Reputation: 24954

Parsing HTML with XPath/XMLHttpRequest

I'm trying to download an HTML page with XMLHttpRequest and parse it (on the most recent Safari browser). Unfortunately, I can't get it to work!

var url = "http://google.com";

xmlhttp = new XMLHttpRequest();
xmlhttp.open("GET", url);

xmlhttp.onreadystatechange  = function(){
    if(xmlhttp.readyState==4){
        response = xmlhttp.responseText;
        var doc = new DOMParser().parseFromString(response, "text/xml");
        console.log(doc);
        var nodes = document.evaluate("//a/text()",doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,null);
        console.log(nodes);
        console.log(nodes.snapshotLength);
        for(var i =0; i<nodes.snapshotLength; i++){
            thisElement = nodes.snapshotItem(i);
            console.log(thisElement.nodeName);
        }
    }
};
xmlhttp.send(null);

The text is downloaded successfully (response contains the valid HTML) and is parsed into a tree correctly (doc represents a valid DOM for the page). However, nodes.snapshotLength is 0, even though the query is valid and should return results. Any ideas on what's going wrong?

Upvotes: 1

Views: 2775

Answers (2)

Mic

Reputation: 25154

If either of the following applies:

  • you are using a JS library, or
  • your browser supports the querySelectorAll method (Safari does),

you can try using CSS selectors to query the DOM instead of XPath.
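As a rough sketch of the CSS-selector route: the helper below collects the text of every link in an already-parsed document. The name linkTexts is made up for illustration, and the browser-only part assumes a DOMParser that accepts "text/html", which the question does not confirm for the Safari version involved.

```javascript
// Hypothetical helper: returns the text of every <a> element in `doc`.
// `doc` is assumed to be a Document (anything with querySelectorAll works).
function linkTexts(doc) {
  var anchors = doc.querySelectorAll("a");
  var texts = [];
  for (var i = 0; i < anchors.length; i++) {
    texts.push(anchors[i].textContent);
  }
  return texts;
}

// Browser-only usage: skipped where DOMParser is not defined.
if (typeof DOMParser !== "undefined") {
  var sample = '<p><a href="#">first</a> <a href="#">second</a></p>';
  var parsed = new DOMParser().parseFromString(sample, "text/html");
  console.log(linkTexts(parsed));
}
```

The selector "a" plays the role of the XPath "//a" from the question; textContent returns the concatenated text inside each link rather than the individual text nodes.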

Upvotes: 1

John Saunders

Reputation: 161773

HTML is not XML, and the two are not interchangeable. Unless the "HTML" is actually well-formed XHTML, an XML parser will not turn it into a usable document, so you will not be able to process the result with XPath.
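For what it's worth, the "text/xml" argument in the question is what selects the XML parser. The sketch below shows one possible alternative, assuming the browser's DOMParser accepts "text/html" (not something this thread confirms for the Safari version in question): parse with the HTML parser, then run the same XPath against the resulting document. The helper name anchorTextNodes is hypothetical.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the
// helper does not depend on the XPathResult global being present.
var ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: returns the value of every text node that is a direct
// child of an <a> element. `doc` is assumed to be a Document with evaluate().
function anchorTextNodes(doc) {
  var result = doc.evaluate("//a/text()", doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var texts = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    texts.push(result.snapshotItem(i).nodeValue);
  }
  return texts;
}

// Browser-only usage: parse as HTML rather than as XML.
if (typeof DOMParser !== "undefined") {
  var sampleHtml = '<p><a href="#">first</a><a href="#">second</a></p>';
  var htmlDoc = new DOMParser().parseFromString(sampleHtml, "text/html");
  console.log(anchorTextNodes(htmlDoc));
}
```

Note that evaluate is called on the parsed document itself rather than on the global document, as in the question.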

Upvotes: 1
