Reputation: 9
I'm trying to parse an HTML document and store its URLs in an array with PHP. For example, if the source of the document is:
Blah blah blah <a href="http://google.com">link</a> blab
<a href="http://yahoo.com">more links</a> ababasadsf
How do I find and grab the href attribute of the links and store each as an array element?
Upvotes: 1
Views: 1921
Reputation: 95374
Using phpQuery, you can traverse the DOM and find the anchors (<a>) that have the href attribute defined:
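// Parse the HTML source into a phpQuery document.
$dom = phpQuery::newDocument($htmlSource);
// Select only the anchors that carry an href attribute.
$anchors = $dom->find('a[href]');
$urls = array();
if ($anchors) {
    foreach ($anchors as $anchor) {
        // Iteration yields raw DOM elements; pq() wraps each one in a phpQuery object.
        $anchor = pq($anchor);
        $urls[] = $anchor->attr('href');
    }
}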
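If adding phpQuery as a dependency is not an option, roughly the same result can probably be had with PHP's bundled DOM extension. This is only a minimal sketch under some assumptions: the function name extractHrefs is made up for illustration, the DOM extension is assumed to be enabled, and the sample markup from the question is assumed to be the contents of $htmlSource.

```php
<?php
// Hypothetical helper (not part of phpQuery): collects href values
// using PHP's built-in DOMDocument instead.
function extractHrefs(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip anchors without an href (e.g. <a name="...">).
        if ($anchor->hasAttribute('href')) {
            $urls[] = $anchor->getAttribute('href');
        }
    }
    return $urls;
}

// Sample markup taken from the question.
$htmlSource = 'Blah blah blah <a href="http://google.com">link</a> blab' . "\n"
    . '<a href="http://yahoo.com">more links</a> ababasadsf';

$urls = extractHrefs($htmlSource);
print_r($urls);
```

With the sample markup, $urls should end up holding the two href values in document order. The href values are returned as written in the markup, so relative links would still need to be resolved against the page's base URL.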
Upvotes: 3