user3748973

Reputation: 459

C# regex for English and non-English characters

I use the regex ^[a-zA-Z''-'\s]{1,40}$ as a name validator, as recommended by MSDN.

Now I want to add non-English characters to it.

How can I do this?

Upvotes: 1

Views: 2020

Answers (2)

Rahul Tripathi

Reputation: 172448

You can try this:

^[\p{L}'\s-]{1,40}$

Note that \p{L} is a Unicode category class: it matches any character in the Letter category, in any script.
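A minimal sketch of how the pattern above could be wired into a validator. The class name NameValidator and the method name IsValidName are hypothetical, chosen only for illustration; the pattern itself is the one from this answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class NameValidator
{
    // Letters from any script, apostrophe, whitespace, and hyphen; 1 to 40 characters.
    private static readonly Regex NamePattern = new Regex(@"^[\p{L}'\s-]{1,40}$");

    public static bool IsValidName(string input)
    {
        // Treat null as invalid instead of letting Regex.IsMatch throw.
        return input != null && NamePattern.IsMatch(input);
    }
}
```

With this sketch, names such as "O'Brien", "Jean-Luc", and "Иван" should pass, while "John3" and the empty string should fail.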

Upvotes: 1

Wiktor Stribiżew

Reputation: 626870

To support letters from any script, including accented letters that are encoded with combining characters, you need both the \p{L} (all letters) and \p{M} (all combining marks, such as diacritics) Unicode category classes:

^[\p{L}\p{M}\s'-]{1,40}$

Note that \p{L} already includes [a-zA-Z], along with all other lowercase and uppercase letters.
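To illustrate why \p{M} matters, here is a small sketch comparing the two character classes on a name written with a combining accent. The class and member names (CombiningMarkDemo and so on) are made up for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class CombiningMarkDemo
{
    private static readonly Regex LettersOnly = new Regex(@"^[\p{L}\s'-]{1,40}$");
    private static readonly Regex LettersAndMarks = new Regex(@"^[\p{L}\p{M}\s'-]{1,40}$");

    // "José" with a single precomposed character U+00E9.
    public const string Precomposed = "Jos\u00E9";

    // "José" as plain "e" followed by U+0301 COMBINING ACUTE ACCENT.
    public const string Decomposed = "Jose\u0301";

    public static bool MatchesLettersOnly(string s) => LettersOnly.IsMatch(s);

    public static bool MatchesLettersAndMarks(string s) => LettersAndMarks.IsMatch(s);
}
```

The precomposed form passes both patterns, but the decomposed form only passes the one that includes \p{M}, because U+0301 is a mark rather than a letter. Keep in mind that each combining mark also counts toward the {1,40} limit.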

Or, since \s also matches newlines (I doubt you really need to match newline characters), you can use \p{Zs}, the Unicode space separator category (various kinds of spaces), instead:

^[\p{L}\p{M}\p{Zs}'-]{1,40}$
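A quick sketch of the difference between the \s and \p{Zs} variants, using a hypothetical SpaceClassDemo class:

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class SpaceClassDemo
{
    private static readonly Regex WithWhitespace = new Regex(@"^[\p{L}\p{M}\s'-]{1,40}$");
    private static readonly Regex WithSpaceSeparators = new Regex(@"^[\p{L}\p{M}\p{Zs}'-]{1,40}$");

    // \s accepts spaces, tabs, and line breaks inside the name.
    public static bool AcceptsWithWhitespace(string s) => WithWhitespace.IsMatch(s);

    // \p{Zs} accepts only space separators such as U+0020 and U+00A0.
    public static bool AcceptsWithSpaceSeparators(string s) => WithSpaceSeparators.IsMatch(s);
}
```

Both variants accept "Anna Maria", but only the \s variant accepts "Anna\nMaria" or "Anna\tMaria". One caveat: without RegexOptions.Multiline, $ in .NET also matches just before a single trailing newline, so if that matters you may want \z instead of $.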

Placing the hyphen at the end is just best practice, although it would be handled as a literal hyphen in your regex, too.

Upvotes: 2
