Allerion

Reputation: 405

Struggling with php regex

I'm struggling with what I imagine is quite a simple regex for a preg_match_all() call. I'm looking to mimic the Wikimedia style internal links system which will turn something like this [[link]] into a link.

I'm looking for a regex that will search a string for every occurrence of [[foobar]] and return "foobar" to me, where foobar can be any text.

I've tried the following:

<?php
 $content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
 $links = preg_match_all("[[*]]",$content,$matches);
 print_r($matches);
?>

I'm not getting much of anything. Any help would be appreciated.

Upvotes: 1

Views: 76

Answers (4)

hek2mgl

Reputation: 157947

Use the following pattern /\[\[(.*)\]\]/U :

$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
$links = preg_match_all("/\[\[(.*)\]\]/U",$content,$matches);
print_r($matches);

Explanation: The regex needs to start and end with a delimiter, here /. Square brackets have to be escaped in a regex, e.g. \[, because an unescaped [ opens a character class. The content between the brackets goes inside a capture group, (.*). Finally, the ungreedy modifier U makes sure that only the content between the nearest pair of brackets gets captured. (Remove it to see the difference.)
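A minimal sketch of what the U modifier changes, assuming the same $content as in the question (the variable names $greedy and $ungreedy are only for illustration):

```php
<?php
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";

// Without U: .* is greedy and runs to the LAST "]]",
// so both links collapse into a single match.
preg_match_all('/\[\[(.*)\]\]/', $content, $greedy);

// With U: .* stops at the NEAREST "]]", giving one match per link.
preg_match_all('/\[\[(.*)\]\]/U', $content, $ungreedy);

print_r($greedy[1]);   // one element: "sit]] amet, consectetur adipiscing [[elit"
print_r($ungreedy[1]); // two elements: "sit" and "elit"
```

The pattern is single-quoted here so that PHP passes the backslashes through to the regex engine unchanged.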

Upvotes: 1

pp19dd

Reputation: 3633

You need to escape [ as \[ and then match the overall expression with the un-greedy flag U.

$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";
$links = preg_match_all("/\[\[(.*)]]/U",$content,$matches);
print_r($matches);

Array(
    [0] => Array (
        [0] => [[sit]]
        [1] => [[elit]]
    )
    [1] => Array (
        [0] => sit
        [1] => elit
    )
)

EDIT: user ridgerunner pointed out that using the /U modifier is considered bad practice because it inverts the greediness of every quantifier in the pattern: greedy quantifiers become lazy, and lazy ones (such as .*?) become greedy. The suggested alternative is the lazy quantifier (.*?) without the modifier, which produces the same result as the code above.

$links = preg_match_all("/\[\[(.*?)]]/",$content,$matches);
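To illustrate the inversion, here is a small sketch comparing the three combinations on the same $content (the variable names $lazy, $flagged and $inverted are made up for this example):

```php
<?php
$content = "Lorem ipsum dolor [[sit]] amet, consectetur adipiscing [[elit]].";

// Lazy quantifier, no modifier: stops at the nearest "]]".
preg_match_all('/\[\[(.*?)]]/', $content, $lazy);

// Greedy quantifier plus U: U makes .* lazy, same result as above.
preg_match_all('/\[\[(.*)]]/U', $content, $flagged);

// Lazy quantifier plus U: U flips .*? back to greedy,
// so the match runs to the last "]]".
preg_match_all('/\[\[(.*?)]]/U', $content, $inverted);

print_r($lazy[1]);     // "sit", "elit"
print_r($flagged[1]);  // "sit", "elit"
print_r($inverted[1]); // "sit]] amet, consectetur adipiscing [[elit"
```

This is why mixing U with explicit ? suffixes is easy to get wrong: the meaning of every quantifier depends on a flag at the far end of the pattern.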

Upvotes: 1

dikirill

Reputation: 1903

preg_match_all("/\[\[([^\]]*?)\]\]/i",$content,$matches);

Upvotes: 2

Loamhoof

Reputation: 8293

* alone doesn't mean anything. It's a quantifier, so it needs something to apply to. In this case, a dot . would do (it means "any character", except a newline by default). Also, you can use a lazy quantifier instead of a greedy one to stop as soon as you encounter ]].
So...

$links = preg_match_all("/\[\[(.*?)]]/",$content,$matches);

Edit:
You have to escape each [ because an unescaped [ marks the beginning of a character class.
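One caveat, sketched below with a made-up $content: by default the dot does not match a newline, so a link that spans two lines is skipped. Assuming such links should be matched at all, either the s modifier or a negated character class would handle it:

```php
<?php
// Hypothetical input: the first link contains a line break.
$content = "See [[first\nsecond]] and [[third]].";

// Plain dot: cannot cross the newline, so only the second link matches.
preg_match_all('/\[\[(.*?)]]/', $content, $plain);

// s modifier: the dot now matches newlines as well.
preg_match_all('/\[\[(.*?)]]/s', $content, $dotall);

// Negated class: "anything except ]", which includes newlines.
preg_match_all('/\[\[([^\]]*)\]\]/', $content, $negated);

print_r($plain[1]);   // "third"
print_r($dotall[1]);  // "first\nsecond", "third"
print_r($negated[1]); // "first\nsecond", "third"
```

The negated class needs no lazy quantifier, since it can never run past a closing bracket.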

Upvotes: 4
