marekful

Reputation: 15351

PHP 5.6 regex unexpected behaviour

I have come across some strange behaviour in PHP 5.6 (not tested with other versions):

var_dump(preg_match('#\b(39||90)\b#', '42')); // int(1)
var_dump(preg_match('#\b(39||90)\b#', '')); // int(0)

https://regex101.com says the pattern \b(39||90)\b is invalid but PHP preg_match does not return FALSE as it should if the pattern is invalid.

As you can see, 42 produces a match and the empty string does not. I'd expect the other way round, as || should stand for the empty string.

What's happening here?

Upvotes: 1

Views: 227

Answers (1)

anubhava

Reputation: 784998

This regex:

\b(39||90)\b

will return a successful match if any one of its alternatives matches. An empty alternative is valid in PCRE and matches the empty string, so the possibilities are:

  1. Complete word 39
  2. Complete word 90
  3. An empty match at any word boundary in the input (because of the empty alternative in ||)

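However, an empty string contains no word boundary. \b is asserted true between a word character \w and a non-word character \W, and also at the start or end of the input when the adjacent character is a word character. That is why '42' matches: the empty alternative succeeds at offset 0, where both \b assertions hold.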
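
To see where a word boundary exists in a given input, one can list the offsets at which \b on its own matches. This is only a minimal sketch: the helper name boundaryOffsets is made up for illustration and is not part of PHP; preg_match_all and PREG_OFFSET_CAPTURE are the real pieces.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns every offset in
// $subject at which the zero-width assertion \b succeeds.
function boundaryOffsets($subject)
{
    // PREG_OFFSET_CAPTURE turns each match into array(text, offset).
    preg_match_all('/\b/', $subject, $matches, PREG_OFFSET_CAPTURE);

    $offsets = array();
    foreach ($matches[0] as $match) {
        $offsets[] = $match[1];
    }
    return $offsets;
}

var_dump(boundaryOffsets('42')); // offsets 0 and 2: start and end of the word
var_dump(boundaryOffsets(''));   // empty array: no word character at all
var_dump(boundaryOffsets('#@')); // empty array: only non-word characters
```

Any input for which this list is non-empty will make the pattern from the question return 1.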

E.g. see these results:

// no word character, hence no match
var_dump(preg_match('#\b(39||90)\b#', '#@')); // int(0)

// a word character, hence a match
var_dump(preg_match('#\b(39||90)\b#', 'a')); // int(1)

// no word character, hence no match
var_dump(preg_match('#\b(39||90)\b#', "\t\n")); // int(0)
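
To check which alternative actually matched, the matched text and its offset can be captured as well. The sketch below uses a made-up helper, describeMatch, purely for illustration; only preg_match and PREG_OFFSET_CAPTURE are real PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): reports what preg_match consumed.
function describeMatch($pattern, $subject)
{
    $result = preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE);
    if ($result === false) {
        return 'error';      // pattern failed to compile or execute
    }
    if ($result === 0) {
        return 'no match';
    }
    // $m[0] holds array(matched text, offset) for the whole pattern.
    return sprintf('matched "%s" at offset %d', $m[0][0], $m[0][1]);
}

$pattern = '#\b(39||90)\b#';
echo describeMatch($pattern, '42'), "\n"; // matched "" at offset 0
echo describeMatch($pattern, '39'), "\n"; // matched "39" at offset 0
echo describeMatch($pattern, ''), "\n";   // no match
```

Note that for an input such as 'x 39' the empty alternative already succeeds at offset 0, so the 39 is never reached. A result of int(1) from this pattern therefore says only that the input contains a word character somewhere, not that 39 or 90 was found. If the empty branch was unintentional, \b(39|90)\b is one way to avoid this.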

Upvotes: 4
