Reputation: 1085
I already wrote the following regex, which matches letters from all international scripts (Latin, Asian, ...):
'Düsseldorf, Köln, Москва, 北京市, إسرائيل !@#$'.match(/[\p{L}-]+/ug)
But I would like it to reject strings that contain special characters such as !?})%.
Upvotes: 2
Views: 1849
Reputation: 33
There's a book called "JavaScript: The Good Parts" that provides some good examples of this. In short, you can do something like:
/^[a-zA-Z0-9 \u00C0-\u1FFF\u2800-\uFFFD]+$/
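As a quick check of what this range-based pattern accepts, here is a minimal sketch. The sample strings and the name `rangePattern` are illustrative assumptions, not part of the original answer. Note that the broad ranges also cover some non-letter code points, for example the multiplication sign U+00D7, so the pattern is looser than "letters only".

```javascript
// The range-based pattern from the answer above (works without the `u` flag).
const rangePattern = /^[a-zA-Z0-9 \u00C0-\u1FFF\u2800-\uFFFD]+$/;

// Hypothetical sample inputs, chosen for illustration only.
const samples = [
  'Düsseldorf',      // Latin letters plus U+00FC: accepted
  'Москва',          // Cyrillic, inside \u00C0-\u1FFF: accepted
  '北京市',           // CJK, inside \u2800-\uFFFD: accepted
  'St. Petersburg',  // '.' is not in the class: rejected
  'Köln!',           // '!' is not in the class: rejected
  '3 \u00D7 4',      // U+00D7 is a symbol, yet falls inside \u00C0-\u1FFF: accepted
];

for (const s of samples) {
  console.log(JSON.stringify(s), rangePattern.test(s));
}
```

If punctuation such as '.' or ',' should be allowed, it has to be added to the character class explicitly.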
Upvotes: 1
Reputation: 896
Sadly, JavaScript regular expressions have long had poorer Unicode support than those of other programming languages. Unicode property escapes such as \p{L} were only standardized in ES2018 and require the u flag.
In engines without them, there is no other option (that I know of) than to add ranges yourself, which should look like:
new RegExp(/^[ \-.a-zšđčćžÀ-ÖØ-öø-ÿ]+$/i).test('St. Petersburg')
From your examples, it looks like you are looking for full UTF-16 coverage, so you will have to add some ranges yourself. You can use https://www.fileformat.info/info/charset/UTF-16/list.htm as a reference. It includes a description of each character, which helps identify which are letters and which are not.
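One way to keep such a list of ranges manageable is to build the pattern from a table of code point pairs. The following is only a sketch under stated assumptions: `buildRangePattern` and `letterRanges` are hypothetical names, the ranges listed are a small illustrative subset (basic Latin, Latin-1 letters, the Cyrillic block), and the `u` flag is assumed to be available so that `\u{...}` escapes can be used.

```javascript
// Hypothetical table of [first, last] code points that should be allowed.
// Only a small illustrative subset; extend it from a character reference.
const letterRanges = [
  [0x0041, 0x005A], // A-Z
  [0x0061, 0x007A], // a-z
  [0x00C0, 0x00D6], // À-Ö
  [0x00D8, 0x00F6], // Ø-ö
  [0x00F8, 0x00FF], // ø-ÿ
  [0x0400, 0x04FF], // Cyrillic block
];

// Builds an anchored pattern from the ranges plus a few extra single characters.
function buildRangePattern(ranges, extraChars = ' .-') {
  const esc = (cp) => `\\u{${cp.toString(16)}}`;
  const body = ranges.map(([lo, hi]) => `${esc(lo)}-${esc(hi)}`).join('');
  // Escape the extras as code points so '-' cannot be read as a range operator.
  const extras = [...extraChars].map((ch) => esc(ch.codePointAt(0))).join('');
  return new RegExp(`^[${body}${extras}]+$`, 'u');
}

const namePattern = buildRangePattern(letterRanges);
console.log(namePattern.test('St. Petersburg')); // true
console.log(namePattern.test('Москва'));         // true
console.log(namePattern.test('北京市'));          // false: no CJK range in the table
```

Scripts missing from the table are rejected, so the table has to grow with every script you want to support.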
Upvotes: 1
Reputation: 18611
Matching string containing only letters, numbers, dashes, dots, commas and whitespace:
console.log(
/^[\p{L},.0-9\s-]+$/u.test('Düsseldorf, Köln, Москва, 北京市, إسرائيل !@#$')
)
console.log(
/^[\p{L},.0-9\s-]+$/u.test('Düsseldorf, Köln, Москва, 北京市, إسرائيل')
)
Results: false and true.
EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  [\p{L},.0-9\s-]+         any character of: letter, ',', '.',
                           '0' to '9', whitespace (\n, \r, \t, \f,
                           " ", and other Unicode whitespace), '-'
                           (1 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
  $                        the end of the string (without the m
                           flag, JavaScript's $ does not match
                           before a trailing \n)
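The difference from the regex in the question is the anchoring: an unanchored pattern used with match() and the g flag extracts the allowed runs and silently skips everything else, while an anchored pattern used with test() rejects the whole string. A minimal sketch of both, where the function names `isValidPlaceList` and `extractNames` are hypothetical:

```javascript
// Anchored: the whole string must consist of allowed characters.
const allowedOnly = /^[\p{L},.0-9\s-]+$/u;
// Unanchored, global: finds every run of letters and dashes (the question's regex).
const nameTokens = /[\p{L}-]+/gu;

// Validation: true only if no disallowed character appears anywhere.
function isValidPlaceList(text) {
  return allowedOnly.test(text);
}

// Extraction: returns the letter runs and ignores everything else.
function extractNames(text) {
  return text.match(nameTokens) || [];
}

const input = 'Düsseldorf, Köln, Москва !@#$';
console.log(isValidPlaceList(input)); // false: '!@#$' is not allowed
console.log(extractNames(input));     // the three names, special characters skipped
```

Which of the two fits depends on whether the input should be rejected or cleaned up.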
Upvotes: 9