PHP regexp strange behavior

Question

I was developing a simple regex to parse part of a URL. The regex must capture part of the URL in a named group, and only a few characters are allowed (a-z, 0-9 and -). If any other character is present, the regex must fail for the given string and nothing should be captured.

But, as you can see in the screenshot, when the regex finds a % sign it stops and captures the part before it (if that part is at least two characters long). The result is the same without the word boundaries (\b).

I can't understand why % behaves like this: the engine captures the preceding characters and stops there. % is not in the allowed list of characters, so the match should fail for that string... or not?

I've tried in the actual PHP code as well, with the very same result.

EDIT 1:

Actual PHP code:

if (preg_match('#/fixed_url_part/\b(?P<codename>[a-z0-9-]{2,})\b#', $url, $regs)) {
    return $regs['codename'];
}

Halcyon · Accepted Answer

You didn't tell it to match the full line. Add $ to have it match the end.

^/fixed_url_part/\b(?P<codename>[a-z0-9\-]{2,})\b$
^-- match start of line                          ^-- match end of line
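
To see the difference, here is a minimal sketch comparing the unanchored and anchored patterns. It assumes # as the delimiter (so the slashes in the path need no escaping) and a made-up sample URL; extractCodename is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: returns the codename, or null when the URL
// contains anything outside a-z, 0-9 and "-" after the fixed part.
function extractCodename(string $url): ?string
{
    // Anchored pattern from the answer, with '#' as the delimiter (an assumption).
    $pattern = '#^/fixed_url_part/\b(?P<codename>[a-z0-9\-]{2,})\b$#';
    if (preg_match($pattern, $url, $regs) === 1) {
        return $regs['codename'];
    }
    return null;
}

// Unanchored pattern, as in the question: preg_match is satisfied by a
// match anywhere in the subject, so it stops in front of the "%".
$unanchored = '#/fixed_url_part/\b(?P<codename>[a-z0-9-]{2,})\b#';
preg_match($unanchored, '/fixed_url_part/abc%def', $m);
var_dump($m['codename']);                             // string(3) "abc"

// Anchored pattern: the whole string has to fit, so the "%" makes it fail.
var_dump(extractCodename('/fixed_url_part/abc%def')); // NULL
var_dump(extractCodename('/fixed_url_part/abc-def')); // string(7) "abc-def"
```

Note that in PCRE a plain $ also matches just before a trailing newline at the end of the subject. If that matters for your input, the D modifier or \z instead of $ should make the end anchor strict.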
