JavaScript regex pattern for any visible unicode letter characters

Question

I'm working on a JavaScript application that requires me to identify the set of "any visible Unicode letter characters, digits (0-9), spaces, underscores, and periods". The suggested regex pattern is ^[0-9\p{L} _\.]+$, but that doesn't seem to work in JavaScript. The part that is giving me trouble is "any visible Unicode letter characters" because that includes non-English characters. Is there some JavaScript regex pattern that can identify the Unicode letter character set?

Wiktor Stribiżew · Accepted Answer

Use the XRegExp library to parse your current regular expression. Since the pattern is passed as a string literal, the backslash must be doubled:

var pattern = new XRegExp("^[0-9\\p{L} _.]+$");
var s = "123 Московская Street.";
if (XRegExp.test(s, pattern)) {
    console.log("Valid");
}

Note that ^[0-9\p{L} _\.]+$ matches

^ - start of string
[0-9\p{L} _\.]+ - one or more chars that are:
- 0-9 - ASCII digits
- \p{L} - letters
- " " - a space
- _ - an underscore
- . - a dot (inside a character class, . matches a literal dot, no need to escape)
$ - end of string.
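The same character class can also be written without a library in engines that support ES2018 Unicode property escapes, where the u flag enables \p{L}. The following is a minimal sketch under that assumption; isValidName is a hypothetical helper name used only for illustration, not something from the question or from XRegExp.

```javascript
// Assumes an ES2018+ engine: the `u` flag enables \p{...} property escapes.
const namePattern = /^[0-9\p{L} _.]+$/u;

// Hypothetical helper for this sketch: true if the whole string consists of
// ASCII digits, Unicode letters, spaces, underscores, and dots.
function isValidName(s) {
  return namePattern.test(s);
}

console.log(isValidName("123 Московская Street.")); // true
console.log(isValidName("日本語_1"));                // true
console.log(isValidName("hello!"));                 // false ("!" is not allowed)
console.log(isValidName(""));                       // false (+ needs at least one char)
```

Without the u flag, \p in a regex literal is treated as a plain p, which is why the original pattern appears not to work.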

If you want to also include the following conditions:

Names must be at least 3 characters long and no more than 16 characters long.
No player name can include the word "Riot" in it.

You may extend the pattern to the following:

var pattern = new XRegExp("^(?!.*\\bRiot\\b)[0-9\\p{L} _\\.]{3,16}$");
                            ^^^^^^^^^^^^^^^^                ^^^^^^

where + (1 or more occurrences) is replaced with the {3,16} limiting quantifier (3 to 16 occurrences), and the (?!.*\bRiot\b) negative lookahead fails the match if Riot appears as a whole word (due to the \b word boundaries) anywhere inside the string (or line, since . matches any char but line break chars).
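To see how the length limit and the lookahead behave on concrete inputs, here is a hedged sketch using a native regex literal with the u flag instead of XRegExp (an assumption: ES2018+ engine). The name playerNamePattern and the sample inputs are made up for illustration. Note that \b in JavaScript is based on ASCII word characters, so an underscore next to Riot counts as part of the word.

```javascript
// Assumes ES2018+; same rules as the extended pattern, written as a literal.
const playerNamePattern = /^(?!.*\bRiot\b)[0-9\p{L} _.]{3,16}$/u;

// Hypothetical sample inputs for illustration only.
const samples = [
  "Riot Games",     // contains the whole word "Riot"
  "Rioter123",      // "Riot" is only part of a longer word
  "ab",             // shorter than 3 characters
  "a".repeat(17),   // longer than 16 characters
  "Иван_2024",      // Unicode letters, underscore, digits
];

for (const name of samples) {
  console.log(JSON.stringify(name), playerNamePattern.test(name));
}
// "Riot Games" false
// "Rioter123" true
// "ab" false
// "aaaaaaaaaaaaaaaaa" false
// "Иван_2024" true
```

The pattern is case-sensitive as written, so "riot" in lowercase is not rejected; adding the i flag would be one way to change that if the rule requires it.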
