Adas
Reputation: 309

How to scrape page element using xpath

I want to get the email address from this element using XPath:

<td>
<span id="A-1_id_1151_1997" class="">[email protected]</span>
</td>

I have tried several approaches; this is one of them:

$html = new DOMDocument();
// @ suppresses the parser warnings raised by the page's invalid HTML
@$html->loadHTMLFile('http://www.deutsches-krankenhaus-verzeichnis.de/suche/Krankenhaus/260530089-00-1.1/Alexianer-Aachen-GmbH.jsf');
$xpath = new DOMXPath($html);
$nodelist = $xpath->query('//*[@id="accordion"]/table[4]/tbody/tr[2]/td[7]');
foreach ($nodelist as $n) {
    echo $n->nodeValue . "\n";
}

If I select the element by its id, the email is displayed, but with the td path above it is not. I can't rely on the id because the page is dynamic and the id changes on every page. I think the problem is with nodeValue, but I couldn't figure it out.

Please provide any solution.

Upvotes: 0

Views: 358

Answers (1)

sideshowbarker
Reputation: 88408

Examining http://www.deutsches-krankenhaus-verzeichnis.de/suche/Krankenhaus/260530089-00-1.1/Alexianer-Aachen-GmbH.jsf it seems to me you can grab the nodes you want from that with something like the following XPath expression:

//table[*[@class="tablehead"]/td/*[text()="E-Mail"]]//tr[2]/td[7]

That is, translated into prose: “Find any table that has a child whose class attribute has the value tablehead, which in turn has a td child, which in turn has any child element whose text is ‘E-Mail’. If you find such a table, get the 7th td child of any tr inside it that is the 2nd tr child of its parent.”
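
For illustration, here is a minimal sketch of running that expression through DOMXPath. It parses an inline HTML string instead of fetching the live page, and the markup and the address info@example.org are made-up placeholders modeled on the structure the expression assumes, so the real page may well differ.

```php
<?php
// Hypothetical markup: a header row with class "tablehead" whose 7th cell
// is labelled "E-Mail", followed by one data row. Not copied from the real page.
$markup = <<<'HTML'
<html><body>
<table>
  <tr class="tablehead">
    <td><span>Name</span></td><td><span>A</span></td><td><span>B</span></td>
    <td><span>C</span></td><td><span>D</span></td><td><span>E</span></td>
    <td><span>E-Mail</span></td>
  </tr>
  <tr>
    <td>n</td><td>a</td><td>b</td><td>c</td><td>d</td><td>e</td>
    <td><span id="A-1_id_1" class="">info@example.org</span></td>
  </tr>
</table>
</body></html>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$query = '//table[*[@class="tablehead"]/td/*[text()="E-Mail"]]//tr[2]/td[7]';

// Collect the trimmed text of every matching cell.
$emails = [];
foreach ($xpath->query($query) as $cell) {
    $emails[] = trim($cell->textContent);
}
echo implode("\n", $emails), "\n";
```

Because the expression anchors on the "E-Mail" header rather than on an id, it should keep working on pages where the generated ids differ, as long as the table layout is the same.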

If you only want a td that contains a specific e-mail address, check that the text content of the entire node matches that address. If you only want the first such matching node, wrap the whole expression in parentheses and apply the [1] position predicate to it:

(//table[*[@class="tablehead"]/td/*[text()="E-Mail"]]//tr[2]/td[7][.="[email protected]"])[1]
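
To show what the parentheses change, here is a small sketch that runs the filtered expression with and without the outer [1]. The two-table markup and the address info@example.org are placeholders invented for the example, not data from the real page.

```php
<?php
// Hypothetical markup: two tables whose "E-Mail" column holds the same address.
$rows = '<tr class="tablehead">' . str_repeat('<td><b>x</b></td>', 6)
      . '<td><b>E-Mail</b></td></tr>'
      . '<tr>' . str_repeat('<td>-</td>', 6)
      . '<td><span>info@example.org</span></td></tr>';
$markup = "<html><body><table>$rows</table><table>$rows</table></body></html>";

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Placeholder address; it must not contain a double quote, because it is
// pasted into the expression as a quoted XPath string literal.
$address = 'info@example.org';
$cells = '//table[*[@class="tablehead"]/td/*[text()="E-Mail"]]//tr[2]/td[7][.="'
       . $address . '"]';

$all   = $xpath->query($cells);          // every cell whose text equals the address
$first = $xpath->query("($cells)[1]");   // only the first one in document order

echo $all->length, " ", $first->length, "\n";   // prints "2 1"
```

Without the parentheses, a trailing [1] would be evaluated separately for each td[7] step rather than against the combined result, so both cells would still match.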

Upvotes: 1
