user2783755

Reputation: 588

UTF-8 in preg_match

sorry for my English.

I'm trying to use preg_match with UTF-8 in PHP.

preg_match("/\bjaunā\b Iel.*/iu", "Jaunā Iela");

This call returns 0. But

preg_match("/\bjauna\b Iel.*/iu", "Jauna Iela");

works fine. Why?
Thanks.

Upvotes: 0

Views: 109

Answers (1)

Bryan Elliott

Reputation: 4095

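Word boundaries don't work correctly with non-ASCII characters such as ā when the engine does not treat them as word characters: \b is defined in terms of \w, which by default covers only ASCII letters, digits, and the underscore. In the text Jaunā Iela the word boundaries are then: \bJaun\bā \bIela\b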
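To see where \b actually matches, one option is to mark every boundary with a visible character. The snippet below is only a sketch: it assumes PHP 7 or later (for the "\u{101}" escape, which writes ā as a single precomposed code point), the default C locale, and a PCRE build with Unicode property support. The variable names are made up for illustration.

```php
<?php
// "Jaunā Iela" with ā written as the precomposed code point U+0101,
// so the result does not depend on this file's encoding.
$text = "Jaun\u{101} Iela";

// Byte mode (no u modifier): the two bytes of ā are not ASCII word
// characters, so a boundary appears between "n" and "ā" and none
// appears between "ā" and the space.
$byteMode = preg_replace('/\b/', '|', $text);

// UTF-8 mode (u modifier): on PHP builds where u also turns on
// Unicode character properties, ā counts as a letter for \w and \b.
$utfMode = preg_replace('/\b/u', '|', $text);

echo $byteMode, "\n"; // |Jaun|ā |Iela|
echo $utfMode, "\n";  // |Jaunā| |Iela|
```

If the second line of output looks like the first on your system, the u modifier is probably not enabling Unicode properties there, or the subject string is not valid precomposed UTF-8.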

So instead of using word boundaries, try a look-behind assertion for whitespace (or the beginning of the string) and a look-ahead assertion for whitespace, like so:

The regex:

(?<=^|\s)Jaunā(?=\s) Iel.*

PHP:

preg_match("/(?<=^|\s)Jaunā(?=\s) Iel.*/i", "Jaunā Iela");

Working regex example:

http://regex101.com/r/tV6yR9
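The look-around above expects whitespace after the word, so it would not match a word at the very end of the string or one followed by punctuation. A possible variation, sketched here as an assumption and not as part of the original answer, uses negative look-arounds on Unicode letters and digits instead. The helper name containsWholeWord is hypothetical, and the code assumes PHP 7 or later with UTF-8 input.

```php
<?php
// Hypothetical helper: true if $word occurs in $text without a letter,
// digit, or underscore directly before or after it.
function containsWholeWord(string $word, string $text): bool
{
    // Escape regex metacharacters in the word; "/" is the delimiter.
    $quoted = preg_quote($word, '/');

    // Negative look-behind and look-ahead stand in for \b, using
    // Unicode properties so that ā counts as a letter (needs the u modifier).
    $pattern = '/(?<![\p{L}\p{N}_])' . $quoted . '(?![\p{L}\p{N}_])/iu';

    return preg_match($pattern, $text) === 1;
}

var_dump(containsWholeWord("Jaun\u{101}", "Jaun\u{101} Iela")); // bool(true)
var_dump(containsWholeWord("Jaun", "Jaun\u{101} Iela"));        // bool(false)
```

Because the assertions are negative, the start and end of the string need no special case: there is simply no adjacent character to reject.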

Upvotes: 1
