Dilon Perera

Reputation: 73

How to get only the URLs inside image anchor tags using SimpleHTMLDOM

I use SimpleHTMLDOM to scrape content from another web page, but I have a problem: how do I get only the URLs inside image anchor tags? The page contains plain link anchors as well as image anchors, and I only want the href value of the anchors that wrap an image.

<a href="I DO NOT NEED THIS VALUE"></a>


<a href="I NEED THIS VALUE"><img src="xxxx"></a>

But when I query the DOM, it returns all the href URLs, including those of the plain link anchors. I only need the URLs of the image anchor tags.

I use this code:

$hrefl = $html->find('a');

$count = 1;

for( $i = 0; $i < 50; $i++){

              echo $hrefl[$count]->href;
              $count++;

 }

Upvotes: 0

Views: 1787

Answers (3)

hakre

Reputation: 198117

You need the href attribute of every link that contains an image tag. With XPath that's quite simple:

//a/img/../@href

You wrote that you use DOM, but your code looks like it was written with Simple HTML DOM. That library is limited and no longer necessary, because PHP has the DOMDocument and DOMXPath classes. I think Simple HTML DOM has no XPath support, so here is the same thing with DOMDocument:

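// $html is the HTML source as a string
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// select the href attribute of every <a> that has an <img> child
$hrefs = $xpath->query('//a/img/../@href');
// number of matching href attributes
$count = $hrefs->length;
foreach($hrefs as $href)
{
    echo $href->nodeValue, "\n";
}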
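
As a rough sketch of how this behaves on markup like the question's: the sample HTML, its href values, and the helper name hrefsMatching below are made up for illustration and are not from the original page. The predicate form //a[img]/@href selects the same attributes as the expression above, while //a[.//img]/@href also catches images that are wrapped in another element inside the link.

```php
<?php
// Sample markup modelled on the question; the hrefs are placeholders.
$html = <<<HTML
<html><body>
<a href="/text-link">Just text</a>
<a href="/direct-image"><img src="a.png"></a>
<a href="/nested-image"><span><img src="b.png"></span></a>
</body></html>
HTML;

// Hypothetical helper: returns the href values selected by an XPath expression.
function hrefsMatching(string $html, string $expression): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $hrefs = [];
    foreach ($xpath->query($expression) as $attr) {
        $hrefs[] = $attr->nodeValue;
    }
    return $hrefs;
}

// Anchors with an <img> as a direct child.
$direct = hrefsMatching($html, '//a[img]/@href');
// Anchors with an <img> anywhere inside them.
$nested = hrefsMatching($html, '//a[.//img]/@href');

print_r($direct);
print_r($nested);
```

Which of the two expressions fits depends on whether the target page ever wraps the image in another element inside the link.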


Upvotes: 4

Daxcode

Reputation: 434

Try this:

$hrefl = $html->find('a');

foreach ($hrefl as $a) {
  // find() without an index returns an array of matching elements
  $img = $a->find('img');
  // only print the href if the anchor contains at least one image
  if (!empty($img)) {
    echo $a->href;
  }
}

Upvotes: 3

swapnilsarwe

Reputation: 1300

You are probably using the simplehtmldom library for parsing. I am not very familiar with it; I use DOMDocument for all my parsing.

A quick solution I can suggest is to check whether the anchor tag has an image inside it: if it does, get the href value; otherwise skip it.

Something like this:

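<?php
    $doc = new DOMDocument();
    // @ suppresses the warnings libxml raises on malformed real-world HTML
    @$doc->loadHTMLFile($urlofhtmlpage);

    foreach($doc->getElementsByTagName('a') as $a){
        // print the href once if the anchor contains at least one image
        if($a->getElementsByTagName('img')->length > 0){
            echo $a->getAttribute('href'), "\n";
        }
    }
?>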

Upvotes: 4
