Zeus Carl

Reputation: 149

Block script characters using regex

I want to block every character that could be used to form a script, such as #$%^&*<>~\[]{}@.,?|/

I cannot use /^[a-zA-Z]([\w -]*[a-zA-Z])?$/i.test(value) because my application has Spanish language support, which includes letters like ę, Æ and so on.

How can I achieve this with a regex? Can anyone help me here? I am new to regex.

I want to block the special characters listed above, that is, characters that could potentially form a script, in order to restrict user input.

Upvotes: 2

Views: 624

Answers (1)

Wiktor Stribiżew

Reputation: 626851

The /^[a-zA-Z]([\w -]*[a-zA-Z])?$/i regex only matches ASCII characters.

If you plan to make it work with the Spanish language, you need to make it Unicode-aware.
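To see the limitation concretely, here is a minimal sketch that runs the ASCII-only pattern against a few sample names. The sample strings and the asciiOnly / asciiResults names are illustrative assumptions, not part of the question.

```javascript
// The ASCII-only pattern from the question. Without the u flag,
// [a-zA-Z] and \w cover only ASCII letters, digits and underscore.
const asciiOnly = /^[a-zA-Z]([\w -]*[a-zA-Z])?$/i;

// Hypothetical sample inputs, chosen only for illustration.
const asciiResults = {
  plain: asciiOnly.test("Maria Lopez"),     // ASCII letters and a space
  accented: asciiOnly.test("María López"),  // í and ó are outside \w and [a-zA-Z]
  leading: asciiOnly.test("Ñandú"),         // Ñ is outside [a-zA-Z]
};

console.log(asciiResults);
```

Only the plain ASCII name passes; both names containing accented letters are rejected.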

Bearing in mind that a Unicode-aware \w can be represented with [\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}] (see "What's the correct regex range for javascript's regexes to match all the non word characters in any script?") and that the Unicode letter pattern is \p{L}, the direct Unicode equivalent of your pattern is

/^\p{L}(?:[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}\s-]*\p{L})?$/iu.test(value)

I also replaced the regular space with \s to match any kind of Unicode whitespace.

Details

  • ^ - start of string
  • \p{L} - any Unicode letter
  • (?:[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}\s-]*\p{L})? - an optional sequence of zero or more Unicode word chars (letters, diacritics, numbers, connector punctuation (like _), join control chars), whitespace chars or hyphens, followed by a single Unicode letter
  • $ - end of string.
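As a quick check of the pattern described above, the following sketch wraps it in a small validator. The isValidInput helper name and the sample strings are assumptions made for illustration; the regex itself is the one given in this answer. Note that \p{...} property escapes require the u flag.

```javascript
// Unicode-aware pattern from this answer. The u flag enables \p{...}.
const unicodeName = /^\p{L}(?:[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\p{Join_Control}\s-]*\p{L})?$/iu;

// isValidInput is a hypothetical helper name, not an existing API.
function isValidInput(value) {
  return unicodeName.test(value);
}

// Hypothetical inputs that should be accepted: letters from any script,
// with whitespace or hyphens allowed between the first and last letter.
const accepted = ["María López", "Ñandú", "José-Luis", "ę", "Æ"];

// Hypothetical inputs that should be rejected: they contain characters
// from the blocked list, are empty, or do not end with a letter.
const rejected = ["<script>", "a@b", "José!", "", "abc1"];

for (const value of accepted) {
  console.log(JSON.stringify(value), isValidInput(value));
}
for (const value of rejected) {
  console.log(JSON.stringify(value), isValidInput(value));
}
```

Because the pattern is an allow-list anchored with ^ and $, any character not named in the class, including all of #$%^&*<>~\[]{}@.,?|/, makes the whole input fail.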

Upvotes: 1
