Reputation: 1772
I'm having a problem matching the href part of a link using preg_match_all. Currently it captures three sections (full link, URL only, link text only), which is what I want, but the URL-only capture also includes any other attributes that follow the href attribute.
Also, how do I make the "href" match case-insensitive?
Code:
$content = '<a href="http://www.google.com" target="_blank">Google</a> is a search engine. <a href="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</a> is a search engine.';
preg_match_all('/<a href="([^<]*)">([^<]*)<\/a>/', $content, $matches);
print_r($matches);
Result:
Array
(
[0] => Array
(
[0] => <a href="http://www.google.com" target="_blank">Google</a>
[1] => <a href="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</a>
)
[1] => Array
(
[0] => http://www.google.com" target="_blank
[1] => http://www.yahoo.com" title="yahoo" target="_blank
)
[2] => Array
(
[0] => Google
[1] => Yahoo
)
)
Upvotes: 0
Views: 1980
Reputation: 21003
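Your pattern only stops at the closing "> and doesn't take any other attributes into account, so the first group swallows everything up to the end of the tag. Try:
/<a href="([^"]*)"[^>]*>([^<]*)<\/a>/i
This captures the href value up to its closing quote, skips over the rest of the attributes ([^>]* rather than [^>]+, so links with no extra attributes still match), and then captures the text right up to the next tag. The trailing i modifier makes the match case-insensitive, so HREF matches too.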
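As a rough check, here is a minimal sketch that runs a pattern of this shape over the sample string from the question. The third link (uppercase tag and attribute, no extra attributes, pointing at example.com) is a made-up edge case that is not in the original post. The pattern here uses [^>]* and the i modifier so that such links are matched as well.

```php
<?php
// Sample input: the two links from the question, plus one hypothetical
// edge case (uppercase HREF, no attributes after the URL).
$content = '<a href="http://www.google.com" target="_blank">Google</a> is a search engine. '
         . '<a href="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</a> is a search engine. '
         . '<A HREF="http://example.com">Example</A>';

// [^"]* stops the URL capture at the closing quote,
// [^>]* skips zero or more characters of any remaining attributes,
// and the trailing i modifier makes the whole pattern case-insensitive.
$pattern = '/<a href="([^"]*)"[^>]*>([^<]*)<\/a>/i';

preg_match_all($pattern, $content, $matches);

print_r($matches[1]); // captured URLs only
print_r($matches[2]); // captured link text only
```

Note that this still assumes href is the first attribute, is separated from the tag name by a single space, and is double-quoted. Markup that varies more than that is usually better handled with an HTML parser than with a regular expression.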
Upvotes: 2