marekful

Reputation: 15351

Regular expression - PCRE (PHP) - word boundary (\b) and accent characters

Why does \b match a word boundary next to the letter é in the following example?

Pattern: /\b(cum)\b/i

Text: écumé

This matches 'cum', which is not desired.

Is it possible to overcome this?

Upvotes: 5

Views: 1152

Answers (2)

stema

Reputation: 92996

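It will work when you add the u modifier to your regex. Without it, PCRE reads the subject byte by byte, and the bytes of a UTF-8 é are not word characters, so \b matches between é and c. With u, the pattern and subject are treated as UTF-8 and é counts as a word character: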

/\b(cum)\b/iu
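The difference can be checked with a short script. This is only a minimal sketch: it assumes the source file is saved as UTF-8 and that the default "C" locale is in effect, and the variable names are made up for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so "écumé" is a UTF-8 byte string.
$text = 'écumé';

// Without /u the subject is read as bytes. The bytes of "é" are not
// word characters, so \b matches on both sides of "cum".
$withoutU = preg_match('/\b(cum)\b/i', $text);

// With /u the subject is read as UTF-8 and "é" is a word character,
// so there is no boundary around "cum" inside "écumé".
$withU = preg_match('/\b(cum)\b/iu', $text);

// A standalone "cum" should still be found with /u.
$standalone = preg_match('/\b(cum)\b/iu', 'magna cum laude');

var_dump($withoutU, $withU, $standalone);
```

preg_match returns 1 on a match and 0 on no match, so the first and third values should differ from the second.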

Upvotes: 13

Toto

Reputation: 91428

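To deal with Unicode, replace each \b with a lookaround that tests for a non-letter (\PL) or the start/end of the string. Keep the u modifier so that \PL is tested against whole UTF-8 characters rather than single bytes:

/(?<=^|\PL)(cum)(?=\PL|$)/iu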
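As a rough sketch, the lookaround pattern could be wrapped like this. The helper name matchesWholeWord is hypothetical, the u modifier is included on the assumption that the input is UTF-8, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: true if $word occurs in $text with no letter
// directly before or after it.
function matchesWholeWord(string $word, string $text): bool
{
    // preg_quote escapes regex metacharacters in $word, including the "/" delimiter.
    $pattern = '/(?<=^|\PL)' . preg_quote($word, '/') . '(?=\PL|$)/iu';
    return preg_match($pattern, $text) === 1;
}

$inside     = matchesWholeWord('cum', 'écumé');           // "cum" surrounded by letters
$standalone = matchesWholeWord('cum', 'magna cum laude'); // "cum" surrounded by spaces
$whole      = matchesWholeWord('cum', 'cum');             // "cum" is the entire string

var_dump($inside, $standalone, $whole);
```

Note that this is not an exact substitute for \b: \PL treats digits and underscores as boundaries, so 'cum2' would match here, whereas \b with the u modifier treats them as word characters.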

Upvotes: 0
