Ben G

Reputation: 26761

Perplexed by simple XPath bug

<?php
$response = 
'<style><div id="subhead"></div></style>';
//echo $response;

$doc = new DOMDocument();

$doc->loadHTML($response);  

$finder = new DomXPath($doc);

$term_select = $finder->query('//div[@id="subhead"]');

var_dump($term_select->item(0));

?>

The var_dump prints NULL, and I also get this warning on line 8:

Warning: DOMDocument::loadHTML(): Unexpected end tag : div in Entity, line: 1 on line 8

Note that this is not my HTML (I'm scraping), so changing the HTML is not an option.

Upvotes: 1

Views: 187

Answers (2)

Francois Deschenes

Reputation: 24969

The problem is that you can't have a DIV element inside a STYLE one, so when you use loadHTML(), the HTML fails to validate. If you call $doc->saveHTML(), you will quickly see that the parser treats the <div id="subhead"> as CDATA (raw text inside the STYLE element) rather than as an element.
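To see this for yourself, here is a minimal sketch that reuses the $response string from the question. It collects the libxml parse errors instead of letting them surface as PHP warnings, prints the tree the parser actually built, and counts the real <div> elements in it. The variable names $previous and $divCount are mine, and the exact error text and serialized output may vary with your libxml version, so treat it as a diagnostic aid rather than a reference result.

```php
<?php
$response = '<style><div id="subhead"></div></style>';

$doc = new DOMDocument();

// Collect libxml parse errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($response);
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Serialize the tree the HTML parser actually built.
echo $doc->saveHTML();

// Count the <div> nodes that exist as real elements in that tree.
$finder = new DOMXPath($doc);
$divCount = $finder->query('//div')->length;
echo "div elements found: ", $divCount, "\n";
```

If the count is zero, the XPath query in the question has nothing to match, which is why item(0) returns NULL.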

To solve the problem, use loadXML() instead.

$doc->loadXML($response);
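Put together with the code from the question, the change might look like the sketch below. It assumes the scraped string is well-formed XML, which is true for the sample in the question but is not guaranteed for arbitrary pages; the $loaded and $previous variables are mine.

```php
<?php
$response = '<style><div id="subhead"></div></style>';

$doc = new DOMDocument();

// loadXML() parses the string as XML, so <div> becomes a real child
// element of <style>. It returns false if the input is not well-formed.
$previous = libxml_use_internal_errors(true);
$loaded = $doc->loadXML($response);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$term_select = null;
if ($loaded) {
    $finder = new DOMXPath($doc);
    $term_select = $finder->query('//div[@id="subhead"]');

    // Should now dump a DOMElement rather than NULL.
    var_dump($term_select->item(0));
} else {
    echo "Input is not well-formed XML\n";
}
```

Keep in mind that real-world HTML is often not well-formed XML (unclosed tags, unquoted attributes, named entities), in which case loadXML() will fail and you will need a different approach for that page.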

Upvotes: 2

rid

Reputation: 63442

loadHTML() expects to find HTML in the string, but that is not valid HTML, so the string does not get loaded properly. XPath will not have that <div> element to get to. Try loadXML() instead.

Upvotes: 2
