Miguel Moura

Reputation: 39364

Regex to include latin characters

I have a regex that ensures a password contains uppercase and lowercase characters and is limited to a few symbols:

^(?:(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z])(?=.*[$@$!%*?&,;.:-_])[A-Za-z\d$@$!%*?&,;.:-_]+)?$

NOTE: It allows an empty password; I check for that another way.

However, it does not allow Latin characters such as ç, á, õ, etc.
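
The behaviour can be reproduced with a short check. This is only a sketch: it uses Python's re module as a stand-in (an assumption, since the question targets .NET), and the helper name accepts is hypothetical. For these ASCII and Latin inputs the .NET engine should behave the same way.

```python
import re

# Pattern copied verbatim from the question.
PATTERN = re.compile(
    r"^(?:(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z])(?=.*[$@$!%*?&,;.:-_])"
    r"[A-Za-z\d$@$!%*?&,;.:-_]+)?$"
)


def accepts(password: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return PATTERN.fullmatch(password) is not None


print(accepts("Abc!"))  # True: ASCII letters plus an allowed symbol
print(accepts("Açb!"))  # False: ç is outside A-Za-z
print(accepts(""))      # True: the outer group is optional
print(accepts("Ab"))    # True: no symbol, yet the symbol lookahead passes
print(accepts("Ab-c"))  # False: a literal hyphen is not in the class
```

The last two results come from `:-_` being parsed as a range from `:` to `_`. That range covers the uppercase letters, so the symbol lookahead is satisfied by any uppercase letter, and it does not include a literal hyphen. Escaping the hyphen or moving it to the end of the class makes it literal.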

How can I allow this type of character?

UPDATE

I am trying to create a regex for password validation that stays in sync with Microsoft's password options, e.g.:

RequireDigit (Default = true) 
  Requires a number between 0-9 in the password.

RequireNonAlphanumeric (Default = true)     
  Requires a non-alphanumeric character in the password.

RequireUppercase (Default = true)   
  Requires an upper case character in the password.

RequireLowercase (Default = true)   
  Requires a lower case character in the password.

RequiredUniqueChars (Default = 1)   
  Requires the number of distinct characters in the password

Microsoft documentation: https://learn.microsoft.com/en-us/aspnet/core/security/authentication/identity-configuration?tabs=aspnetcore2x

So I would like a regex with one block per option, so that I can add or remove a rule and build any combination.

Does this make sense?

Upvotes: 4

Views: 1904

Answers (1)

ctwheels

Reputation: 22817

Overview

It's generally bad practice to restrict passwords, so if that's the intention, please don't use the following regex. In any case, I understand some people like to at least ensure some character sets exist (uppercase, lowercase, number, symbol, etc.) and that there are special cases when things like this are needed. The regex below ensures at least one lowercase, uppercase, number and symbol (in any language/script) exists in a string of at least 8 characters.

As the comments below the question suggest, restricting passwords to a specific set of characters or a specific format is asking for trouble. As @maccettura points out, an attacker can filter a dictionary attack and discard every dictionary entry that doesn't match your password format. Given a class such as [A-Za-z\d$@$!%*?&,;.:-_], the attacker can simply drop any candidate containing characters outside that list. That list also contains only 75 characters. How many 8-character strings can be built from 75 characters? Without repeating a character, that's 680,240,886,192,000 permutations; with repeats allowed, as in real passwords, it is 75^8 = 1,001,129,150,390,625 (fewer if we remove the ones that don't satisfy the required format). How long will it take your CPU to crack the password?
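
The size of that search space is easy to reproduce. The sketch below is in Python (an assumption; any language with big integers works) and takes the 75-character alphabet and the length of 8 from the paragraph above. It prints the count of ordered selections without repeated characters and, for comparison, the count when characters may repeat.

```python
import math

ALPHABET_SIZE = 75  # size of the character list, as quoted in the answer
LENGTH = 8          # password length used in the answer's example

# Ordered selections with no repeated character: 75 * 74 * ... * 68
no_repeats = math.perm(ALPHABET_SIZE, LENGTH)

# Ordered selections where a character may appear more than once: 75 ** 8
with_repeats = ALPHABET_SIZE ** LENGTH

print(f"{no_repeats:,}")    # 680,240,886,192,000
print(f"{with_repeats:,}")  # 1,001,129,150,390,625
```

Both figures are on the order of 10^15, so the point about a restricted alphabet holds whichever count is used.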

Code

^(?=\P{Ll}*\p{Ll})(?=\P{Lu}*\p{Lu})(?=\P{N}*\p{N})(?=.*[^\p{L}\p{N}\p{C}]).{8,}$

Explanation

\p{x} matches a character in the Unicode general category or named block specified by x; \P{x} matches any character that is not in it

  • ^ Assert position at the start of the line
  • (?=\P{Ll}*\p{Ll}) Ensures at least one lowercase letter in any script exists
  • (?=\P{Lu}*\p{Lu}) Ensures at least one uppercase letter in any script exists
  • (?=\P{N}*\p{N}) Ensures at least one number character in any script exists
  • (?=.*[^\p{L}\p{N}\p{C}]) Ensures at least one character other than a letter, number or control character exists
  • .{8,} Ensures the password is at least 8 characters in length (and isn't limited on the upper bound)
  • $ Assert position at the end of the line
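
Because each requirement is its own lookahead, the pattern can be assembled from independent blocks, which is what the question's update asks for. The following is a minimal sketch, not a definitive implementation. It is written in Python, whose built-in re module does not support \p{...}, so each block is paired with an equivalent check based on unicodedata.category (a swapped-in technique). The Rule class, the function names and the mapping of blocks to Identity option names are all hypothetical, and RequireDigit is mapped loosely: the lookahead accepts any Unicode number, not only 0-9.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


def has_category(password: str, prefixes: Tuple[str, ...]) -> bool:
    """True if any character's general category starts with one of the prefixes."""
    return any(unicodedata.category(ch).startswith(prefixes) for ch in password)


def has_symbol(password: str) -> bool:
    """True if any character is not a letter (L*), number (N*) or 'other' (C*)."""
    return any(
        not unicodedata.category(ch).startswith(("L", "N", "C")) for ch in password
    )


@dataclass(frozen=True)
class Rule:
    name: str                     # hypothetical label, modelled on the Identity options
    lookahead: str                # regex block taken from the answer
    check: Callable[[str], bool]  # intended Python equivalent of that block


RULES = [
    Rule("RequireLowercase", r"(?=\P{Ll}*\p{Ll})", lambda p: has_category(p, ("Ll",))),
    Rule("RequireUppercase", r"(?=\P{Lu}*\p{Lu})", lambda p: has_category(p, ("Lu",))),
    Rule("RequireDigit", r"(?=\P{N}*\p{N})", lambda p: has_category(p, ("N",))),
    Rule("RequireNonAlphanumeric", r"(?=.*[^\p{L}\p{N}\p{C}])", has_symbol),
]


def build_pattern(enabled: Iterable[str], min_length: int = 8) -> str:
    """Join the lookahead blocks of the enabled rules into one pattern string."""
    wanted = set(enabled)
    blocks = "".join(rule.lookahead for rule in RULES if rule.name in wanted)
    return "^" + blocks + ".{" + str(min_length) + ",}$"


def failed_rules(password: str, enabled: Iterable[str], min_length: int = 8) -> List[str]:
    """Return the names of the enabled rules the password does not satisfy."""
    wanted = set(enabled)
    failures = [r.name for r in RULES if r.name in wanted and not r.check(password)]
    if len(password) < min_length:  # length counted in code points here
        failures.append("MinLength")
    return failures


ALL_RULES = [rule.name for rule in RULES]

print(build_pattern(ALL_RULES))
print(failed_rules("Çãõpass1!", ALL_RULES))  # []
print(failed_rules("password", ALL_RULES))
# ['RequireUppercase', 'RequireDigit', 'RequireNonAlphanumeric']
```

Dropping a name from the enabled list removes both its lookahead from the generated pattern and its check from the validation, so any combination of rules can be produced from the same table.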

Upvotes: 3
