SubniC

Reputation: 10317

PHP regexp strange behavior

I am developing a simple regex to parse part of a URL. It must capture part of the URL in a named group, and only a few characters are allowed (a-z, 0-9 and -). If any other character is present, the regex must fail for that string and capture nothing.

But as you can see in the screenshot, when the regex finds a % sign it stops and captures the part before it (if that part is at least two characters long). The result is the same without the word boundaries (\b).

I can't understand why % is acting like \n, with the engine capturing the preceding characters and stopping there. The % is not in the allowed list of characters, so the match should fail for that string... or not?

I've tried in the actual PHP code as well, with the very same result.


EDIT 1:

Actual PHP code:

if (preg_match('/\/fixed_url_part\/\b(?P<codename>[a-z0-9-]{2,})\b/', $url, $regs)) {
    return $regs['codename'];
}

Upvotes: 1

Views: 58

Answers (1)

Halcyon

Reputation: 57729

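You didn't tell it to match the full line, so the regex is free to match just a part of the string: it takes the allowed characters up to the %, and \b is satisfied there because % is not a word character. Add ^ and $ to anchor the match to the start and the end.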

^/fixed_url_part/\b(?P<codename>[a-z0-9\-]{2,})\b$
^-- match start of line                          ^-- match end of line
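
To see the difference, here is a minimal sketch that runs both patterns against the same input. The function names and sample URLs are made up for illustration, it assumes $url holds only the path (so that ^ can sit right before /fixed_url_part/), and it uses # as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helpers, not from the original code.

function codenameUnanchored(string $url): ?string
{
    // No anchors: preg_match succeeds as soon as any substring matches,
    // so "abc" followed by "%" is accepted (\b matches between c and %).
    if (preg_match('#/fixed_url_part/\b(?P<codename>[a-z0-9-]{2,})\b#', $url, $regs)) {
        return $regs['codename'];
    }
    return null;
}

function codenameAnchored(string $url): ?string
{
    // ^ and $ force the pattern to cover the whole string, so any
    // character outside [a-z0-9-] makes the match fail.
    if (preg_match('#^/fixed_url_part/\b(?P<codename>[a-z0-9-]{2,})\b$#', $url, $regs)) {
        return $regs['codename'];
    }
    return null;
}

var_dump(codenameUnanchored('/fixed_url_part/abc%20def')); // string(3) "abc"
var_dump(codenameAnchored('/fixed_url_part/abc%20def'));   // NULL
var_dump(codenameAnchored('/fixed_url_part/abc-123'));     // string(7) "abc-123"
```

One caveat: by default $ also matches just before a trailing newline. If that matters for your input, the D modifier (or \z in place of $) should make the end anchor strict.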

Upvotes: 3
