John

Reputation: 7846

preg_match - invalid range - very strange bug

The code is this:

$k = preg_replace('/[^a-zA-Z 0-9ßöäüÖÄÜ\"\._-\p{L}]/u', '', $k);  

(Yes I know it's redundant)
The error message:

Warning: preg_replace(): Compilation failed: invalid range in character class at offset 33

Now look at this line, it works fine:

$k= preg_replace('/[^a-zA-Z 0-9ßöäüÖÄÜ\"\.-_-\p{L}]/u', '', $k);

So adding or removing a single "-" in the regex decides whether it compiles at all.
Both patterns work once \p{L} is removed.

Is this a bug in PHP (5.6.30), or did I miss something essential? (It's 7 am here and I need sleep :)

Upvotes: 0

Views: 108

Answers (1)

Casimir et Hippolyte

Reputation: 89639

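In a character class, the hyphen (-) defines a range between two single characters. _-\p{L} can't be a range, because \p{L} is a whole class of characters rather than a single end point, hence the "invalid range" error in your first pattern.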

To write a literal hyphen in a character class, you have several options in PHP:

  • escape it with a backslash
  • put it at the start of the class or after the negation character ^
  • put it at the end of the class
  • put it after a range or a shorthand character class.
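As a minimal sketch of the first three options, the snippet below strips everything except a-z, "_" and "-" from a made-up input string. The variable names and the sample input are only illustrative; escaping the hyphen is probably the option least dependent on the PCRE version in use.

```php
<?php
// Three placements where the hyphen is read as a literal character.
$patterns = [
    'escaped' => '/[^a-z\-_]/',  // escaped with a backslash
    'first'   => '/[^-a-z_]/',   // directly after the negation character ^
    'last'    => '/[^a-z_-]/',   // at the end of the class
];

$input = 'foo-bar_baz!?';  // hypothetical sample input
$results = [];

foreach ($patterns as $label => $pattern) {
    // Each pattern removes every character that is not a-z, "_" or "-".
    $results[$label] = preg_replace($pattern, '', $input);
    echo $label, ': ', $results[$label], PHP_EOL;
}
```

All three variants should keep the hyphen and drop only the "!" and "?".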

This last one isn't well known and is the cause of your strange result. In the second pattern, you are in this situation:

   .-_    -     \p{L}
#  ^^^    ^---- the hyphen is after
#  '''--------- a range
# and in this case it is seen as a literal character

So, to answer your question, it isn't a bug.
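If the goal is to keep letters, digits, spaces, double quotes, dots, underscores and hyphens, one possible rewrite is sketched below. It assumes that this is what the original pattern was meant to allow, and that the script file and the input are UTF-8; the sample string is made up. Note also that in your second pattern .-_ is itself a range (from "." to "_" in ASCII order), so it lets through characters such as "<", ">" and "@" in addition to the ones you listed.

```php
<?php
// \p{L} already covers a-z, A-Z and the umlauts, so they can be dropped.
// The hyphen goes last, where it cannot be mistaken for a range operator.
$pattern = '/[^\p{L}0-9 "._-]/u';

$k = 'Größe_1-2.txt "ok" <b>';  // hypothetical sample input
$clean = preg_replace($pattern, '', $k);

echo $clean, PHP_EOL;  // only the "<" and ">" are removed
```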

Upvotes: 2
