zuk1

Reputation: 18369

Get text from all <a> tags in string

Since I am completely useless at regex and this has been bugging me for the past half an hour, I think I'll post this up here as it's probably quite simple.

<a href="/folder/files/hey/">hey.exe</a>
<a href="/folder/files/hey2/">hey2.dll</a>
<a href="/folder/files/pomp/">pomp.jpg</a>

In PHP I need to extract the text between the <a> tags, for example:

hey.exe
hey2.dll
pomp.jpg

Upvotes: 3

Views: 1079

Answers (4)

Luc Touraille

Reputation: 82041

Here is a very simple one:

<a.*>(.*)</a>

However, you should be careful if you have several matches on the same line, e.g.

<a href="/folder/hey">hey.exe</a><a href="/folder/hey2/">hey2.dll</a>

In this case, the correct regex would be:

<a.*?>(.*?)</a>

Note the '?' after the '*' quantifier. By default, quantifiers are greedy, which means they consume as many characters as they can (so the first regex would return only "hey2.dll" in this example). By appending a question mark, you make them ungreedy (lazy), which should better fit your needs.
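To make the difference concrete, here is a minimal sketch that runs both patterns against the two-link line above. The variable names ($html, $greedy, $lazy) are only for illustration, and the '/' delimiters are added because PHP's preg functions need them.

```php
<?php
// Two links on one line, as in the example above.
$html = '<a href="/folder/hey">hey.exe</a><a href="/folder/hey2/">hey2.dll</a>';

// Greedy: '<a.*>' runs to the last '>' that still lets the rest of the pattern match,
// so the whole line becomes a single match and only the last link text is captured.
preg_match_all('/<a.*>(.*)<\/a>/', $html, $greedy);

// Lazy: '.*?' stops as early as possible, so each link is matched separately.
preg_match_all('/<a.*?>(.*?)<\/a>/', $html, $lazy);

print_r($greedy[1]); // contains only "hey2.dll"
print_r($lazy[1]);   // contains "hey.exe" and "hey2.dll"
```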

Upvotes: 2

robmerica

Reputation: 6103

Avoid using '.*' even if you make it ungreedy, until you have some more practice with RegEx. I think a good solution for you would be:

'/<a[^>]+>([^<]+)<\/a>/i'

Note the '/' delimiters: you must use the preg suite of regex functions in PHP, and those functions expect the pattern to be wrapped in delimiters. It would look like this:

preg_match_all($pattern, $string, $matches);
// matches get stored in '$matches' variable as an array
// matches in between the <a></a> tags will be in $matches[1]
print_r($matches);
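Put together with the sample links from the question, that might look like the sketch below. The $string contents, the $count check and the $names variable are assumptions added for illustration; only the pattern and the preg_match_all call come from the lines above.

```php
<?php
// Sample input copied from the question.
$string = '<a href="/folder/files/hey/">hey.exe</a>
<a href="/folder/files/hey2/">hey2.dll</a>
<a href="/folder/files/pomp/">pomp.jpg</a>';

// [^>]+ cannot run past the end of the opening tag, and [^<]+ cannot run
// past the start of the closing tag, so no lazy quantifier is needed.
$pattern = '/<a[^>]+>([^<]+)<\/a>/i';

// preg_match_all returns the number of full matches, or false on error.
$count = preg_match_all($pattern, $string, $matches);
if ($count === false) {
    die('Regex error');
}

// Index 0 holds the full matches, index 1 holds the first capture group.
$names = $matches[1];
print_r($names); // contains "hey.exe", "hey2.dll", "pomp.jpg"
```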

Upvotes: 6

Douglas Leeder

Reputation: 53310

<a href="[^"]*">([^<]*)</a>
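As written, the pattern has no delimiters, so it cannot be passed to PHP's preg functions directly. A minimal sketch of one way to use it follows, assuming '#' as the delimiter so that the '/' in '</a>' does not need escaping. The $html sample and variable names are illustrative only.

```php
<?php
// One of the links from the question.
$html = '<a href="/folder/files/hey/">hey.exe</a>';

// '#' is used as the delimiter, so '/' inside the pattern stays unescaped.
// This pattern only matches anchors whose sole attribute is a double-quoted href.
$pattern = '#<a href="[^"]*">([^<]*)</a>#';

preg_match_all($pattern, $html, $matches);
print_r($matches[1]); // contains "hey.exe"
```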

Upvotes: 2

Chad Birch

Reputation: 74548

This appears to work:

$pattern = '/<a.*?>(.*?)<\/a>/';

Upvotes: 2
