Reputation: 4951

Difference between these two regular expressions?

What's the difference between these two regular expressions (used with PHP's preg_match())?

/^[0-9\x{06F0}-\x{06F9}]{1,}$/u

/^[0-9\x{06F0}-\x{06F9}\x]{1,}$/u

What's the meaning of the last \x in the second pattern?

Upvotes: 0

Answers (5)

dpk2442

Reputation: 701

As far as I can tell, the trailing \x in the second pattern is actually an invalid escape. Do both expressions work?

Upvotes: 0

user557597

Reputation:

This is weird. PHP's notation for a Unicode character is \x{}. In Perl, it is the same.

But PHP has the /u modifier for regexes. I assume that means Unicode (it makes PCRE treat the pattern and subject as UTF-8). Perl had no such modifier until 5.14.

In a Perl regex, \x## is parsed, where ## is two hex digits denoting an ASCII character. If it's \x or \x#, you get a warning about an illegal hex digit being ignored (because it expects exactly two digits), and only the valid hex digits in the sequence are used. If there are no digits at all, as in a bare \x, it uses the \0 (NUL) character.

However, any \x{} notation is OK, and \x{0} is equivalent to \x{}. \x{0}-\x{ff} is treated as ASCII, and \x{100} and above is treated as Unicode.

So \x is a valid hex/Unicode escape sequence, but by itself it's assumed to be hex and is incomplete, and it's probably not something that should be left to the parser's default behavior.
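To see whether the two patterns really behave differently in PHP, a quick check along these lines might help. It is only a sketch: match_both() is a hypothetical helper name, the test subjects are my own, and it assumes PHP 7.1+ for the "\u{...}" string escapes and array destructuring. The PCRE documentation says that zero to two hex digits are read after \x, so a bare \x should stand for the NUL character there as well. If that holds, the only subject on which the two patterns disagree should be "\0".

```php
<?php
// Hypothetical helper: runs both patterns from the question against one subject
// and returns the two preg_match() results as [first, second].
function match_both(string $subject): array
{
    // Single quotes keep the backslashes intact so PCRE sees the escapes itself.
    $first  = '/^[0-9\x{06F0}-\x{06F9}]{1,}$/u';
    $second = '/^[0-9\x{06F0}-\x{06F9}\x]{1,}$/u';

    return [preg_match($first, $subject), preg_match($second, $subject)];
}

// ASCII digits, Extended Arabic-Indic digits, a lone NUL byte, and plain letters.
$subjects = ["123", "\u{06F1}\u{06F2}", "\0", "abc"];

foreach ($subjects as $subject) {
    [$a, $b] = match_both($subject);
    // bin2hex() keeps the NUL byte and the non-ASCII digits visible in the output.
    printf("%-12s first=%d second=%d\n", bin2hex($subject), $a, $b);
}
```

If the second column differs from the first only for the 00 row, the extra \x is just adding NUL to the character class, which is almost certainly not what the pattern's author intended.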

Upvotes: 0