CupOfTea696

Reputation: 1295

PHP regex backreference not working

I wrote a regex pattern which works perfectly when I test it in Regexr, but when I use it in my PHP code it doesn't always match when it should match.

The regular expression, including some examples that should and shouldn't match.

Example PHP code that should match but doesn't:

preg_match('/^([~]{3,})\s*([\w-]+)?\s*(?:\{([\w-\s]+)\})?\s*(\2[\w-]+)?\s*$/', "~~~ {class} lang", $matches);
var_dump($matches);

I believe the problem is caused by the backreference in the last capture group, (\2[\w-]+), but I can't quite figure out how to fix it.

Upvotes: 1

Views: 450

Answers (3)

CupOfTea696

Reputation: 1295

The answers below helped me figure out why it wasn't working. However, both answers would also match $str = '~~~ lang {class} lang'; which I didn't want. I fixed it by changing capturing group 2 to ([\w-]*), so that even when there is no string at that position, the group still takes part in the match and captures an empty string. This way all of the following strings match:

$str = '~~~   lang      {no-lines float left}   ';
$str = '~~~     {class}   ';
$str = '~~~ lang';
$str = '~~~ {class } lang ';
$str = '~~~';
$str = '~~~lang{class}';

But this one won't:

$str = '~~~ css {class} php';

Full solution:

$str = '~~~ {class} lang';
preg_match('/^([~]{3,})\s*([\w-]*)?\s*(?:\{([\w-\s]+)\})?\s*(\2[\w-]+)?\s*$/', $str, $matches);
var_dump($matches);
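To check the claims above in one run, here is a minimal sketch that feeds every listed string through the pattern. The helper name matchesFence is hypothetical, and the character class [\w-\s] is written as [\w\s-] here (an equivalent class) on the assumption that some PCRE versions are stricter about a hyphen placed between two class escapes.

```php
<?php
// Hypothetical helper, not part of the original post.
function matchesFence(string $str): bool
{
    // Same pattern as the full solution; only the hyphen in the
    // third group's character class has been moved to the end.
    $pattern = '/^([~]{3,})\s*([\w-]*)?\s*(?:\{([\w\s-]+)\})?\s*(\2[\w-]+)?\s*$/';
    return preg_match($pattern, $str) === 1;
}

$cases = [
    '~~~   lang      {no-lines float left}   ',
    '~~~     {class}   ',
    '~~~ lang',
    '~~~ {class } lang ',
    '~~~',
    '~~~lang{class}',
    '~~~ css {class} php',   // should be rejected
    '~~~ lang {class} lang', // should be rejected
];

foreach ($cases as $str) {
    // Print each input next to the verdict.
    echo var_export($str, true), ' => ', matchesFence($str) ? 'match' : 'no match', "\n";
}
```

The first six inputs should report a match and the last two should not.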

Upvotes: 0

hwnd

Reputation: 70732

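The problem is caused by capturing group #2, which you have made optional. When that group does not take part in the match, the backreference \2 has nothing to refer to, and in PCRE a backreference to an unset group fails instead of matching an empty string. You would need to make the backreference optional as well; otherwise it always requires the group to have matched.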

However, since all groups are optional, I would just reuse the subpattern of the second group with (?2), which re-runs the group's pattern rather than its captured text.

^(~{3,})\s*([\w-]+)?\s*(?:{([^}]+)})?\s*((?2))?\s*$

Example:

$str = '~~~ {class} lang';
preg_match('/^(~{3,})\s*([\w-]+)?\s*(?:{([^}]+)})?\s*((?2))?\s*$/', $str, $matches);
var_dump($matches);

Output

array(5) {
  [0]=> string(16) "~~~ {class} lang"
  [1]=> string(3) "~~~"
  [2]=> string(0) ""                   # Returns "" for optional groups that don't exist
  [3]=> string(5) "class"
  [4]=> string(4) "lang"
}
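The difference between the two constructs can be seen side by side in the following sketch. The variable names are illustrative only; the patterns are the ones discussed in this thread, with the braces escaped for clarity.

```php
<?php
// \2 must repeat the exact text captured by group 2;
// (?2) re-applies group 2's pattern, [\w-]+, to new text.
$withBackref    = '/^(~{3,})\s*([\w-]+)?\s*(?:\{([^}]+)\})?\s*(\2[\w-]+)?\s*$/';
$withSubroutine = '/^(~{3,})\s*([\w-]+)?\s*(?:\{([^}]+)\})?\s*((?2))?\s*$/';

$str = '~~~ {class} lang';

// Group 2 never participates, so \2 cannot match and "lang" is left over.
var_dump(preg_match($withBackref, $str));    // int(0)

// (?2) does not depend on what group 2 captured.
var_dump(preg_match($withSubroutine, $str)); // int(1)

// Side effect of (?2): the two language names need not be related.
var_dump(preg_match($withSubroutine, '~~~ css {class} php')); // int(1)
```

The last call shows the trade-off: because (?2) only reuses the pattern, a string with two different language names is accepted as well.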

Upvotes: 2

Avinash Raj

Reputation: 174716

This happens because you're referring to a group that did not take part in the match (group 2). Remove \2 from the regex:

^([~]{3,})\s*([\w-]+)?\s*(?:\{([-\w\s]+)\})?\s*([\w-]+)?\s*$

    ~~~  {class} lang
     |  |   |      |
  Group1| Group3 Group4
        |
Missing group 2
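To see the "missing" group from PHP itself, one option is the PREG_UNMATCHED_AS_NULL flag, which reports groups that did not participate as NULL instead of an empty string. This is a small sketch that assumes PHP 7.2 or later, where the flag is available; the variable names are illustrative.

```php
<?php
$pattern = '/^([~]{3,})\s*([\w-]+)?\s*(?:\{([-\w\s]+)\})?\s*([\w-]+)?\s*$/';
$str = '~~~ {class} lang';

// Default behaviour: an unset group in the middle is filled with "".
preg_match($pattern, $str, $default);

// With the flag: an unset group is reported as NULL.
preg_match($pattern, $str, $asNull, PREG_UNMATCHED_AS_NULL);

var_dump($default[2]); // string(0) ""
var_dump($asNull[2]);  // NULL
var_dump($asNull[4]);  // string(4) "lang"
```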

Upvotes: 3