Reputation: 405
I'm struggling with what I imagine is quite a simple regex for a preg_match_all() call. I'm looking to mimic the Wikimedia style internal links system which will turn something like this [[link]] into a link.
I'm looking for a regex that will search a string for every occurrence of [[foobar]] and return "foobar" to me, where foobar can be any text.
I've tried the following:
<?php
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
$links = preg_match_all("[[*]]",$content,$matches);
print_r($matches);
?>
I'm not getting much of anything. Any help would be appreciated.
Upvotes: 1
Views: 76
Reputation: 157947
Use the following pattern: /\[\[(.*)\]\]/U
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
$links = preg_match_all("/\[\[(.*)\]\]/U",$content,$matches);
print_r($matches);
Explanation: the regex needs to start and end with a delimiter, here /. Square brackets have to be escaped in a regex, as in \[, because an unescaped [ starts a character class. The content between the brackets must be inside a capture group, (.*), so that it is returned separately from the full match. Finally, the ungreedy modifier U makes sure that only the content between the nearest pair of brackets gets captured (remove it to see the difference).
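To see what the U modifier changes, the two variants can be run side by side. This is a minimal sketch using the sample string from the question; the variable names $greedy and $ungreedy are only illustrative.

```php
<?php
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";

// Greedy: .* runs on to the LAST "]]" in the string,
// so both links collapse into a single match.
preg_match_all('/\[\[(.*)\]\]/', $content, $greedy);

// Ungreedy via the U modifier: .* stops at the NEAREST "]]".
preg_match_all('/\[\[(.*)\]\]/U', $content, $ungreedy);

print_r($greedy[1]);   // one element: "sit]] amet, consectetur adipiscing [[elit"
print_r($ungreedy[1]); // two elements: "sit" and "elit"
```

Without U, the capture group swallows everything between the first [[ and the last ]], which is usually not what you want for link parsing.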
Upvotes: 1
Reputation: 3633
You need to escape [ as \[ and then match the overall expression with the ungreedy flag U.
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
$links = preg_match_all("/\[\[(.*)]]/U",$content,$matches);
print_r($matches);
Array(
[0] => Array (
[0] => [[sit]]
[1] => [[elit]]
)
[1] => Array (
[0] => sit
[1] => elit
)
)
EDIT: user ridgerunner pointed out that using the /U modifier is considered bad practice because it inverts the greediness of every quantifier in the pattern: greedy ones become lazy, and lazy ones (written with ?) become greedy. The suggested alternative is the lazy quantifier (.*?) without the modifier, which produces the same result as the code above:
$links = preg_match_all("/\[\[(.*?)]]/",$content,$matches);
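The inversion is easy to trip over when a lazy quantifier and the U modifier end up in the same pattern. The following sketch, again on the question's sample string with illustrative variable names, shows the two cancelling each other out.

```php
<?php
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";

// Lazy quantifier, no modifier: stops at the nearest "]]".
preg_match_all('/\[\[(.*?)]]/', $content, $lazy);

// The same lazy quantifier combined with U: its meaning flips back to greedy.
preg_match_all('/\[\[(.*?)]]/U', $content, $flipped);

print_r($lazy[1]);    // "sit" and "elit"
print_r($flipped[1]); // one element: "sit]] amet, consectetur adipiscing [[elit"
```

Spelling out laziness per quantifier with ? keeps the behaviour visible at the point where it matters, instead of depending on a flag at the end of the pattern.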
Upvotes: 1
Reputation: 8293
* alone doesn't mean anything. It's a quantifier, so it needs something to quantify. In this case, a dot . would do (it means "any character", except a newline by default). Also, you can use a lazy quantifier instead of a greedy one to stop as soon as you encounter ]].
So...
$links = preg_match_all("/\[\[(.*?)]]/",$content,$matches);
Edit:
You have to escape the [ characters, as they mark the beginning of character classes.
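As a small illustration of the difference between a character class and an escaped bracket, here is a sketch with made-up input strings; the variable names are hypothetical.

```php
<?php
// Unescaped, [ opens a character class: [st] matches a single "s" or "t".
preg_match_all('/[st]/', 'sit', $class);

// Escaped, \[ matches a literal opening bracket, so \[\[ finds each "[[".
preg_match_all('/\[\[/', '[[sit]] and [[elit]]', $literal);

print_r($class[0]);   // "s" and "t"
print_r($literal[0]); // "[[" and "[["
```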
Upvotes: 4