Reputation: 2962
I am attempting to use preg_match_all to extract a repeated pattern out of an html string.
The problem seems to be that my pattern has a defined beginning and end, but a wildcard portion in between. So the preg_match_all
ends up only getting the biggest match, but not the individual matches.
My ultimate goal is to isolate each <a ...>some text</a> out of an html string, and to wrap each one like so: <font ...><a ...>some text</a></font>.
But first, I simply want to isolate each of them:
$lvs_regex = "/<a.+<\/a>/" ;
$lvs_test = "click <a href='...'>AAA</a> now, <a href='...'>BBB</a> later, <a href='...'>CCC</a> tomorrow" ;
preg_match_all( $lvs_regex , $lvs_test , $matches ) ;
for($i = 0 ; $i < count( $matches ) ; $i++ )
{ print $matches[ $i ][0] . "<br/>" ;
}
The return that I want:
[0] => <a href='...'>AAA</a>
[1] => <a href='...'>BBB</a>
[2] => <a href='...'>CCC</a>
But I only get one match:
[0] => <a href='...'>AAA</a> now, <a href='...'>BBB</a> later, <a href='...'>CCC</a>
Reputation: 2236
$lvs_regex = "/<a.+<\/a>/U" ;
$lvs_test = "click <a href='...'>AAA</a> now, <a href='...'>BBB</a> later, <a href='...'>CCC</a> tomorrow" ;
preg_match_all( $lvs_regex , $lvs_test , $matches ) ;
if (!empty($matches[0])) { // $matches itself is never empty after preg_match_all; check the list of full matches
foreach ($matches[0] as $match) {
print $match."\n";
}
}
Result is:
<a href='...'>AAA</a>
<a href='...'>BBB</a>
<a href='...'>CCC</a>
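Use the 'ungreedy' modifier /U. It makes the quantifiers in the pattern lazy by default, so .+ stops at the first </a> instead of running on to the last one.
http://www.php.net/manual/fa/reference.pcre.pattern.modifiers.php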
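To make the difference concrete, here is a small sketch that runs the question's sample string through the greedy pattern, the /U variant, and the lazy-quantifier variant (.+?). The variable names are only illustrative and do not come from the original posts.

```php
<?php
// Same sample string as in the question.
$html = "click <a href='...'>AAA</a> now, <a href='...'>BBB</a> later, <a href='...'>CCC</a> tomorrow";

// Greedy: .+ runs on to the last </a>, so there is one long match.
$greedyCount = preg_match_all("/<a.+<\/a>/", $html, $greedy);

// /U modifier: quantifiers in the whole pattern become lazy by default.
$ungreedyCount = preg_match_all("/<a.+<\/a>/U", $html, $ungreedy);

// Lazy quantifier: only this one quantifier is made lazy.
$lazyCount = preg_match_all("/<a.+?<\/a>/", $html, $lazy);

print "$greedyCount $ungreedyCount $lazyCount\n"; // prints "1 3 3"
```

Both non-greedy forms return the same three links here. The /U modifier flips every quantifier in the pattern, while .+? affects only the one quantifier it is attached to, which tends to be easier to reason about in longer patterns.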
Reputation: 3695
Maybe something like this:
$lvs_regex = "/<a.*?<\/a>/" ;
$lvs_test = "click <a href='...'>AAA</a> now, <a href='...'>BBB</a> later, <a href='...'>CCC</a> tomorrow" ;
preg_match_all( $lvs_regex , $lvs_test , $matches);
Basically, the pattern needed is /<a.*?<\/a>/. The lazy quantifier .*? stops at the first </a> it finds, so the pattern matches every occurrence in your string separately.
Now, var_dump($matches[0])
gives
array (size=3)
0 => string '<a href='...'>AAA</a>' (length=21)
1 => string '<a href='...'>BBB</a>' (length=21)
2 => string '<a href='...'>CCC</a>' (length=21)
which is the result you want.
If you then loop over the matches with
for($i = 0 ; $i < count( $matches[0] ) ; $i++ )
{
var_dump($matches[0][ $i ] . "<br/>");
}
you see now it's matching every occurrence:
string '<a href='...'>AAA</a><br/>' (length=26)
string '<a href='...'>BBB</a><br/>' (length=26)
string '<a href='...'>CCC</a><br/>' (length=26)
-------- NEW EDIT ---------
So now you can modify your loop to wrap every matched a tag.
$result='';
for($i = 0 ; $i < count( $matches[0] ) ; $i++ )
{
$result .= "<font ...>".$matches[0][ $i ] . "</font><br/>";
}
var_dump($result);
And you get
<font ...><a href='...'>AAA</a></font><br/><font ...><a href='...'>BBB</a></font><br/><font ...><a href='...'>CCC</a></font><br/>
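If the goal is to wrap the links in place rather than collect them, the match and the loop can possibly be folded into one preg_replace call. This is only a sketch: the variable names are made up for illustration, and the "<font ...>" text is kept as the same elided placeholder used in the question. Unlike the loop above, it keeps the text surrounding the links.

```php
<?php
// Shortened version of the sample string from the question.
$html = "click <a href='...'>AAA</a> now, <a href='...'>BBB</a> later";

// $0 in the replacement stands for the whole matched text.
// "<font ...>" is the elided placeholder from the question, not real markup.
$wrapped = preg_replace('/<a.*?<\/a>/', '<font ...>$0</font>', $html);

print $wrapped . "\n";
```

The replacement string is single-quoted so that PHP passes $0 through to preg_replace untouched.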
---------- NEW EDIT ----------
As suggested by @Casimir et Hippolyte, you can avoid matching wrong or unwanted tags such as abbr
by adding a word boundary to the pattern:
$lvs_regex = "/<a\b.*?<\/a>/" ;
and optionally obtain the same result by using a foreach instead of a for loop. Ex:
foreach($matches[0] as $match) // use a separate variable so $matches is not overwritten
{
$result .= "<font ...>".$match . "</font><br/>";
}
There is also a link about foreach
internal behaviour, in case you want a deeper look at the construct.
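To see what the word boundary changes, here is a short sketch with a made-up test string that contains an abbr tag before the link. The string and variable names are assumptions for illustration only.

```php
<?php
// Hypothetical input: an <abbr> tag followed by a real link.
$html = "<abbr title='...'>HTML</abbr> and <a href='...'>AAA</a>";

// Without \b the match starts at "<a" inside "<abbr" and runs to the first </a>.
preg_match_all('/<a.*?<\/a>/', $html, $loose);

// With \b the "a" must be a complete word, so "<abbr" is skipped.
preg_match_all('/<a\b.*?<\/a>/', $html, $strict);

print $loose[0][0] . "\n";  // the whole string, starting at <abbr
print $strict[0][0] . "\n"; // <a href='...'>AAA</a>
```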