Reputation: 4211
I am using PHP and I am having trouble parsing the href from an anchor tag along with its text.
Example: an anchor tag containing the text http://www.test.com,
like this: <a href="http://www.test.com" title="test">http://www.test.com</a>
I want to match all the text in the anchor tag.
Thanks in advance.
Upvotes: 1
Views: 3289
Reputation: 8382
Use DOM:
$text = '<a href="http://www.test.com" title="test">http://www.test.com</a> something else hello world';
$dom = new DOMDocument();
$dom->loadHTML($text);
foreach ($dom->getElementsByTagName('a') as $a) {
    echo $a->textContent;
}
DOM is specifically designed to parse XML and HTML. It will be more robust than any regex solution you can come up with.
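Since the question also asks about the href, here is a minimal sketch that collects both the attribute and the text for each anchor. The function name `extractLinks` is hypothetical, and the libxml error handling is an assumption about dealing with imperfect real-world markup rather than something the snippet above requires.

```php
<?php
// Hypothetical helper (not from the original answer): returns the href
// and the text content of every <a> element found in an HTML string.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of printing them,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => $a->textContent,
        ];
    }
    return $links;
}

$text = '<a href="http://www.test.com" title="test">http://www.test.com</a> something else hello world';
foreach (extractLinks($text) as $link) {
    echo $link['href'], ' => ', $link['text'], PHP_EOL;
}
```

Returning an array instead of echoing inside the loop keeps the parsing step separate from whatever you do with the links afterwards.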
Upvotes: 6
Reputation: 3585
If you have already obtained the anchor tag you can extract the href attribute via a regex easily enough:
<a [^>]*href="([^"]*)"[^>]*>
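As a rough sketch of how that might look with `preg_match`, assuming the opening tag is already isolated in a string and the attribute value is double-quoted. The capture group here uses `[^"]*` so it can span the whole URL. The variable names and the `#` delimiter are illustrative choices.

```php
<?php
// Assumed input: a single opening anchor tag, already extracted.
$tag = '<a href="http://www.test.com" title="test">';

// Capture everything between the double quotes of the href attribute.
// The "i" flag also accepts upper-case tag and attribute names.
$pattern = '#<a [^>]*href="([^"]*)"[^>]*>#i';

// preg_match returns 1 on a match; $m[1] then holds the first capture group.
$href = preg_match($pattern, $tag, $m) ? $m[1] : null;

echo $href, PHP_EOL;
```

This sketch does not handle single-quoted or unquoted attribute values.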
If you instead want to extract the contents of the tag and you know what you are doing, it isn't too hard to write a simple recursive descent parser, using cascading regexes, that will handle all but the most pathological cases. Unfortunately, PHP isn't a good language for learning this technique, so I wouldn't recommend using this project to learn it.
So if it is the contents you are after, not the attribute, then @katrielalex is right: don't parse HTML with regex. You will run into a world of hurt with nested formatting tags and other legal HTML that isn't compatible with regular expressions.
Upvotes: -1
Reputation: 6186
Assuming you wish to select the link text of an anchor link with that href, then something like this should work...
$input = '<a href="http://www.test.com" title="test">http://www.test.com</a>';
$pattern = '#<a href="http://www\.test\.com"[^>]*>(.*?)</a>#';
if (preg_match($pattern, $input, $out)) {
    echo $out[1];
}
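This is not perfect: a > can legally appear inside a quoted attribute value, which would end the [^>]* match too early, and the pattern only matches when href is the first attribute of the tag. It will still work for simple, predictable markup like the example. As several of the comments have mentioned though, you should be using a DOM.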
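For comparison, here is a hedged sketch of the same lookup done with DOM, which copes with an input where `href` is not the first attribute and another attribute contains a `>`. The function name `linkTextsForHref` and the sample input are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper: returns the text of every <a> whose href
// attribute is exactly equal to $href.
function linkTextsForHref(string $html, string $href): array
{
    $dom = new DOMDocument();

    // Keep libxml warnings out of the output, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        if ($a->getAttribute('href') === $href) {
            $texts[] = $a->textContent;
        }
    }
    return $texts;
}

// Made-up input: href comes second and the title contains a ">".
$input = '<a title="a > b" href="http://www.test.com">http://www.test.com</a>';
foreach (linkTextsForHref($input, 'http://www.test.com') as $text) {
    echo $text, PHP_EOL;
}
```

Comparing the attribute in PHP rather than embedding it in a pattern also means the URL needs no escaping.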
Upvotes: -1