Reputation: 73
I use SimpleHTMLDOM to grab content from another web page, but I have a problem: how do I get only the URLs of the image anchor tags? The page contains plain link anchors as well as image anchors, and I only want the href value of the anchors that wrap an image.
<a href="I DO NOT NEED THIS VALUE"></a>
<a href="I NEED THIS VALUE"><img src="xxxx"></a>
But when I query the DOM, it returns all the href URLs, including those of the plain link anchors. I only need the URLs of the image anchor tags.
I use this code:
$hrefl = $html->find('a');
$count = 1;
for ($i = 0; $i < 50; $i++) {
    echo $hrefl[$count]->href;
    $count++;
}
Upvotes: 0
Views: 1787
Reputation: 198117
You need the href attribute of every link that contains an image tag. With XPath it's quite simple:
//a/img/../@href
You wrote that you use DOM, but your code looks like it was written with Simple HTML DOM. That library is limited and no longer needed nowadays, because PHP has the DOMDocument and DOMXPath classes. I think Simple HTML DOM has no XPath support, so here is how it looks with the built-in classes:
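// $html must be the HTML source as a string here, not a Simple HTML DOM object
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// select the href attribute of every <a> that is the parent of an <img>
$hrefs = $xpath->query('//a/img/../@href');
$count = $hrefs->length; // number of matching href attributes
foreach ($hrefs as $href) {
    echo $href->nodeValue, "\n";
}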
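The expression above only matches an image that is a direct child of its anchor, and it yields the same href once per image. As a minimal sketch of a variant, assuming the images may be nested deeper (for example inside a span), a predicate on the anchor itself can be used instead. The sample markup and the function name imageLinkHrefs are hypothetical and only serve as illustration:

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body>'
      . '<a href="/plain-link">text only</a>'
      . '<a href="/image-link"><span><img src="xxxx"></span></a>'
      . '</body></html>';

// Hypothetical helper: returns the href of every <a> that contains an <img>.
function imageLinkHrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often not valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $hrefs = [];
    // a[.//img] matches anchors with an <img> at any depth;
    // each anchor is matched once, however many images it wraps.
    foreach ($xpath->query('//a[.//img]/@href') as $attr) {
        $hrefs[] = $attr->nodeValue;
    }
    return $hrefs;
}

print_r(imageLinkHrefs($html));
```

Whether the predicate form or the original parent step fits better depends on how the target page nests its images.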
Upvotes: 4
Reputation: 434
Try this:
$hrefl = $html->find('a');
foreach ($hrefl as $a) {
    // find() returns an array of the <img> elements inside this anchor
    $img = $a->find('img');
    // print the href only when the anchor contains at least one image
    if (!empty($img)) {
        echo $a->href;
    }
}
Upvotes: 3
Reputation: 1300
You are probably using the simplehtmldom library for parsing. I am not very familiar with it; I use DOMDocument for all my parsing.
A quick solution I can suggest is to check whether the anchor tag has an image inside it: if it does, get the value, otherwise skip it.
Something like this:
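<?php
$doc = new DOMDocument();
// $urlofhtmlpage is the URL (or path) of the page to parse; @ hides parser warnings
@$doc->loadHTMLFile($urlofhtmlpage);
foreach ($doc->getElementsByTagName('a') as $a) {
    // print the href once per anchor that contains at least one <img>
    if ($a->getElementsByTagName('img')->length > 0) {
        echo $a->getAttribute('href'), "\n";
    }
}
?>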
Upvotes: 4