Reputation: 3
I am grabbing the HTML of a remote page with file_get_contents(), and this remote page contains loads of links that I can grab through $dom.
However, the problem I'm having is that the links I want contain a specific value, '/vue/', and there are anywhere from 1 to 1000 links on the page with that value in them. The /vue/ part is the only static element in the links.
I only need one of these links; it doesn't matter which one.
How would I go about grabbing just one link out of the huge number of them?
Here is the code I currently have to grab all the links:
foreach ($dom->getElementsByTagName('a') as $node) {
    if (strpos($node->getAttribute('href'), '/vue/') !== false) {
        $Epsiodes = $node->getAttribute('href')[0];
    }
}
But $Epsiodes comes back blank.
Upvotes: 0
Views: 70
Reputation: 89547
Using XPath (and DOMDocument::loadHTMLFile instead of file_get_contents) makes this more straightforward:
$dom = new DOMDocument;
$dom->loadHTMLFile($url);
$xp = new DOMXPath($dom);
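$hrefNodeList = $xp->query('(//a/@href[contains(., "/vue/")])[1]');
if ($hrefNodeList->length)
    $result = $hrefNodeList->item(0)->nodeValue;
XPath query details:
(                      # open a group so that [1] applies to the whole result
//                     # anywhere in the DOM tree
a                      # "a" tag
/
@href                  # href attribute
[                      # start a condition
contains(., "/vue/")   # the current node `.` must contain `/vue/`
]                      # close the condition
)                      # close the group
[1]                    # only one item (the first in document order)
The parentheses matter: without them, [1] is evaluated separately for each "a" element, so the query returns every matching href instead of just the first one.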
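To try the query without fetching a remote page, here is a minimal sketch that runs it against an inline HTML string. The sample markup and the helper name firstHrefContaining() are made up for illustration, and the helper assumes the search string contains no double quote.

```php
<?php
// Hypothetical helper (not from the answer itself): returns the first href
// containing $needle, or null when no link matches.
function firstHrefContaining(string $html, string $needle): ?string
{
    $dom = new DOMDocument;
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    // The parentheses make [1] apply to the whole result, not to each <a> separately.
    // Assumption: $needle contains no double quote, so the expression stays valid.
    $nodes = $xp->query('(//a/@href[contains(., "' . $needle . '")])[1]');

    return $nodes->length ? $nodes->item(0)->nodeValue : null;
}

// Made-up sample markup standing in for the remote page.
$html = '<html><body>'
      . '<a href="/about">About</a>'
      . '<a href="/vue/101">One</a>'
      . '<a href="/vue/102">Two</a>'
      . '</body></html>';

$first = firstHrefContaining($html, '/vue/');
echo $first === null ? "no match\n" : $first . "\n";
```

Returning null for "no match" keeps the caller from confusing an empty href with a missing link.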
Note that DOMXPath::query always returns a node list, even when only one node matches, which is why item(0) is needed to get at the attribute.
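For comparison, the loop from the question can probably be kept with two small changes. The trailing [0] indexes into the string returned by getAttribute(), so it keeps only the first character instead of the whole href, and without a break the loop keeps overwriting the variable until the last match. The sketch below drops the [0] and stops at the first match; the sample markup is made up, and the variable is named $episode here rather than $Epsiodes.

```php
<?php
// Made-up sample markup standing in for the remote page.
$html = '<html><body>'
      . '<a href="/about">About</a>'
      . '<a href="/vue/101">One</a>'
      . '<a href="/vue/102">Two</a>'
      . '</body></html>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // silence warnings from imperfect HTML
$dom->loadHTML($html);
libxml_clear_errors();

$episode = null; // stays null when no link matches
foreach ($dom->getElementsByTagName('a') as $node) {
    $href = $node->getAttribute('href');
    if (strpos($href, '/vue/') !== false) {
        $episode = $href; // keep the whole attribute value
        break;            // one link is enough, so stop at the first match
    }
}

echo $episode ?? 'no match', "\n";
```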
Upvotes: 1