mobolog66

Reputation: 3

Get a link from many matching links on a page?

I am fetching the HTML of a remote page with file_get_contents(). The page contains a large number of links, which I can read through $dom.

The problem is that the link I want contains the value '/vue/', and anywhere from 1 to 1000 links on the page contain that same value. The '/vue/' part is the only static element in these links.

I only need one of these links; it doesn't matter which one.
How would I go about grabbing just one of them?

Here is the code I currently have to grab all the links:

    foreach ($dom->getElementsByTagName('a') as $node) {
        if (strpos($node->getAttribute('href'), '/vue/') !== false) {
            $Epsiodes = $node->getAttribute('href')[0];
        }
    }

But $Epsiodes comes back blank.

Upvotes: 0

Views: 70

Answers (1)

Casimir et Hippolyte

Reputation: 89547

Using XPath (and DOMDocument::loadHTMLFile instead of file_get_contents) is a more straightforward way to do that:

$dom = new DOMDocument;
$dom->loadHTMLFile($url);

$xp = new DOMXPath($dom);

$hrefNodeList = $xp->query('(//a/@href[contains(., "/vue/")])[1]');

if ($hrefNodeList->length) {
    $result = $hrefNodeList->item(0)->nodeValue;
}

XPath query details:

(     # group the whole path so that [1] applies to its combined result
//    # anywhere in the DOM tree
a     # "a" tag
/
@href # href attribute
[     # start a condition
  contains(., "/vue/")  # the current node `.` (the attribute) must contain `/vue/`
]     # close the condition
)     # close the group
[1]   # only one item (the first in document order)
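
One detail worth checking is where the `[1]` binds. In XPath 1.0, a predicate attached directly to the last step is evaluated once per context node (here, once per "a" element). A predicate on a parenthesized expression applies to the whole result. The sketch below illustrates the difference, assuming a small inline HTML string (`$html`, hypothetical) in place of the remote page:

```php
<?php
// Hypothetical inline HTML standing in for the remote page.
$html = '<html><body>'
      . '<a href="/home">Home</a>'
      . '<a href="/vue/first">First</a>'
      . '<a href="/vue/second">Second</a>'
      . '</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// Without parentheses, [1] is evaluated per <a> element,
// so every matching href attribute is returned.
$perElement = $xp->query('//a/@href[contains(., "/vue/")][1]');

// With parentheses, [1] applies to the combined result,
// so only the first match in document order is returned.
$firstOnly = $xp->query('(//a/@href[contains(., "/vue/")])[1]');

echo $perElement->length, "\n";            // 2
echo $firstOnly->length, "\n";             // 1
echo $firstOnly->item(0)->nodeValue, "\n"; // /vue/first
```

Either form works when only `item(0)` is read afterwards. The parenthesized form simply avoids building a list of every match.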

Note that DOMXPath::query always returns a node list, even when there is only one result (the list then holds a single element).
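
For comparison, the loop from the question could also be adapted to stop at the first match. This is only a sketch under assumptions: `$html` is a hypothetical inline page, and `$episode` is an illustrative variable name. Note that indexing a string with `[0]` yields its first character, not the first link, so the sketch stores the whole attribute value and then leaves the loop with `break`:

```php
<?php
// Hypothetical inline HTML standing in for the remote page.
$html = '<html><body>'
      . '<a href="/home">Home</a>'
      . '<a href="/vue/first">First</a>'
      . '<a href="/vue/second">Second</a>'
      . '</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$episode = null; // stays null when no link matches

foreach ($dom->getElementsByTagName('a') as $node) {
    $href = $node->getAttribute('href');

    if (strpos($href, '/vue/') !== false) {
        $episode = $href; // keep the whole attribute value
        break;            // stop at the first match
    }
}

echo $episode, "\n"; // /vue/first
```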

Upvotes: 1
