dooffas

Reputation: 484

PHP Regular Expression will not match accented characters

I am trying to build a PHP-compatible regular expression that allows accented characters, for example 'ü'. From what I understand, the \p{L} Unicode property escape should do this. What I have so far:

/^[a-z0-9\p{L}][a-z0-9_\p{L}]*/i

This should match a string that starts with a-z, 0-9 or an accented character, followed by any number of a-z, 0-9, underscore or accented characters. The entire expression is case-insensitive.

However, in testing, validation fails whenever a character such as 'ü' appears anywhere in the string. I have made sure the value being passed is encoded as UTF-8 by using:

utf8_encode($value)

However it still fails. Any suggestions?

Thanks in advance

-------------------------Edit-------------------------

After testing on another server, the original pattern also works.

/^[a-z0-9\p{L}][a-z0-9_\p{L}]*/i

The issue appears to be with the server setup. I will post the solution when found.
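One way to narrow down a server-specific difference like this is to check whether the PCRE library behind PHP handles UTF-8 mode and Unicode properties at all. The snippet below is only a rough diagnostic sketch. The helper name pcre_supports_unicode is made up for illustration, and the check assumes that a build without this support makes preg_match() fail on a /u pattern containing \p{L}.

```php
<?php
// Hypothetical helper (not from the original post): reports whether this
// PHP build's PCRE library accepts UTF-8 mode and Unicode property escapes.
function pcre_supports_unicode(): bool
{
    // "\xC3\xBC" is 'ü' written as raw UTF-8 bytes, so the check does not
    // depend on how this source file is saved.
    // The @ hides the warning preg_match() raises if the pattern cannot compile.
    return @preg_match('/^\p{L}$/u', "\xC3\xBC") === 1;
}

echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;
echo 'Unicode support: ', pcre_supports_unicode() ? 'yes' : 'no', PHP_EOL;
```

Running this on both servers and comparing the output should show whether the difference is in the PCRE build or somewhere else, such as the encoding of the incoming value.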

Upvotes: 0

Views: 340

Answers (1)

Expedito

Reputation: 7805

I think this might work for you:

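$pattern = '/^[0-9a-zá-úàü][0-9_a-zá-úàü]*$/iu';

I ran the following code to test the pattern:

$str = "patinação";
// The u modifier makes PCRE read the pattern and subject as UTF-8, so each
// accented letter is one character; the file itself must be saved as UTF-8.
$pattern = '/^[0-9a-zá-úàü][0-9_a-zá-úàü]*$/iu';
if (preg_match($pattern, $str, $matches)) {
    echo $matches[0]; // output: patinação
}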
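For comparison, here is a minimal sketch of the \p{L} approach from the question with the u modifier added, which is what tells PCRE to treat both the pattern and the subject as UTF-8. The function name is_valid_identifier and the \z end anchor are my own assumptions and not part of the original pattern. Test strings are written as byte escapes so the result does not depend on the file encoding.

```php
<?php
// Hypothetical helper, not part of the original question or answer.
function is_valid_identifier(string $value): bool
{
    // u: pattern and subject are UTF-8, so \p{L} matches any Unicode letter.
    // \p{L} already covers a-z in both cases, so the i modifier is not needed.
    // \z (an assumption) anchors at the very end of the string.
    return preg_match('/^[0-9\p{L}][0-9_\p{L}]*\z/u', $value) === 1;
}

var_dump(is_valid_identifier("patina\xC3\xA7\xC3\xA3o")); // "patinação" in UTF-8: bool(true)
var_dump(is_valid_identifier("\xFC"));  // 'ü' as one ISO-8859-1 byte, invalid UTF-8: bool(false)
var_dump(is_valid_identifier("_abc"));  // leading underscore: bool(false)
```

Note that with the u modifier preg_match() returns false for a subject that is not valid UTF-8, so a value in the wrong encoding is rejected outright. Keep in mind that utf8_encode() converts from ISO-8859-1 to UTF-8, so calling it on a string that is already UTF-8 encodes it twice.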

Upvotes: 1
