htmn

Reputation: 1675

Is there a way to simplify regex matching characters?

I'm trying to build an ECMAScript (JavaScript) flavor regex to test the strength of my password based on these criteria:

    Characters Used          Password Strength Length
    ABC  abc  123  #$&       WEAK  ...
 1   x                       1-5   ...
 2        x                  1-5
 3             x             1-7
 4                  x        1-5
 5   x    x                  1-4
 6   x         x             1-4
 7   x              x        1-4
 8        x    x             1-4
 9        x         x        1-4
10             x    x        1-4
11   x    x    x             1-4
12   x    x         x        1-3
13   x         x    x        1-4
14        x    x    x        1-4
15   x    x    x    x        1-3

So passwords like 2, ABCD, 0123456, abCd, aA#, etc. should be marked as weak. Passwords that exceed the length limit for their character combination, such as 012345678 or aA1#, should not.

This is my very long regex at the moment (basically glued together from groups according to the table above):

/^(([A-Za-z&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,3})|([a-z0-9&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,4})|([A-Z0-9&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,4})|([a-zA-Z0-9]{1,4})|([a-z]{1,5})|([A-Z]{1,5})|([0-9]{1,7})|([&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,5}))$/

Matches rows (above table): 12

/([A-Za-z&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,3})/

Matches rows: 14, 9

/([a-z0-9&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,4})/

Matches rows: 13, 10, 7

/([A-Z0-9&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,4})/

Matches rows: 11, 8, 6, 5

/([a-zA-Z0-9]{1,4})/

Matches rows: 2

/([a-z]{1,5})/

Matches rows: 1

/([A-Z]{1,5})/

Matches rows: 3

/([0-9]{1,7})/

Matches rows: 4

/([&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,5})/

Is there a way to reuse the special characters that I specified inside the brackets, [&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.], so I don't have to write all of them in every group?

Upvotes: 1

Views: 432

Answers (2)

Sree Kumar

Reputation: 2245

I am not sure which regex engine you are using. However, if it is Perl or Ruby, you can use subroutines to achieve this to a good extent. A subroutine call re-applies the pattern of an earlier group at another point in the regex.

You may be aware of backreferences, but subroutines are different. A backreference matches the exact text that the group captured, not its pattern. A subroutine call runs the group's pattern again, so it can match different text.

Let us take an example.

  • Test string: abcd-defg
  • Regex in Ruby: /(?<dd>[a-z]+)-\g<dd>/. The match will succeed.
  • The \g<dd> part is the subroutine call in Ruby to the group named dd. (\g<group_name> is Ruby regex style. For the engine you are using, it may be different.)
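
For comparison, ECMAScript has named backreferences (\k<name>) but, as far as I know, no subroutine-call syntax like \g<name>. Here is a minimal JavaScript sketch of the difference. The `word` fragment and the ^...$ anchors are illustrative assumptions, not part of the Ruby example above:

```javascript
// Backreference: \k<dd> must match the exact text that (?<dd>...) captured.
const backref = /^(?<dd>[a-z]+)-\k<dd>$/;
console.log(backref.test("abcd-abcd")); // true
console.log(backref.test("abcd-defg")); // false

// Rough stand-in for a subroutine call: repeat the pattern source itself.
const word = String.raw`[a-z]+`; // hypothetical reusable fragment
const reused = new RegExp(`^${word}-${word}$`);
console.log(reused.test("abcd-defg")); // true
```

The second regex behaves like the Ruby subroutine example because the pattern, not the captured text, appears twice.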

In your case, I think you can name each group when it first appears in the regex and then call it as a subroutine afterwards. E.g., let us call

  • [A-Z] as A
  • [a-z] as a
  • [0-9] as n
  • the special character set as s (Not sure if I got this pattern right.)

Then, the pattern for 12 /([A-Za-z&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.]{1,3})/ becomes

/((\g<A>|\g<a>|\g<s>){1,3})/

That is, A OR a OR s repeated 1-3 times.

And the pattern for 11, 8, 6, 5 /([a-zA-Z0-9]{1,4})/ becomes

/((\g<a>|\g<A>|\g<n>){1,4})/

That is, a OR A OR n repeated 1-4 times.
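
In JavaScript, the same naming idea can be approximated by keeping each character set as a string fragment and assembling the classes with the RegExp constructor. This is a rough sketch, assuming the special-character set from the question is what s should contain. `anyOf` is a hypothetical helper, not a built-in, and the anchors are added for testing whole strings:

```javascript
// Character-class bodies (without the surrounding brackets), named as above.
const A = "A-Z";
const a = "a-z";
const n = "0-9";
const s = String.raw`&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.`; // copied from the question

// Hypothetical helper: one class built from several bodies, repeated min..max times.
const anyOf = (bodies, min, max) =>
    new RegExp(`^[${bodies.join("")}]{${min},${max}}$`);

const row12 = anyOf([A, a, s], 1, 3);        // A OR a OR s, 1-3 times
const rows11_8_6_5 = anyOf([a, A, n], 1, 4); // a OR A OR n, 1-4 times

console.log(row12.test("aA#"));         // true
console.log(rows11_8_6_5.test("abCd")); // true
```

Merging the bodies into one character class is equivalent to the alternation of single-character groups here, because each named set matches exactly one character.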

Upvotes: 1

T.J. Crowder

Reputation: 1075039

Is there a way to reuse the special characters that I specified inside []...so I don't have to write all of them inside every group ?

Not with a regular expression literal, no.

You can do it with the RegExp constructor, though. You can mitigate the fact it wants a string by using String.raw so you don't have to worry about escaping backslashes:

const chars = String.raw`[the chars]`;
const rex = new RegExp(String.raw`^...${chars}...${chars}...$`);
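
Applied to the regex from the question, that might look like the sketch below. The names `special`, `alternatives`, and `weak` are illustrative, and the character set is copied from the question as-is. The per-row capturing groups are swapped for a single non-capturing group, on the assumption that only test() is needed:

```javascript
// Body of the special-character class, written once.
// String.raw keeps the backslashes exactly as typed.
const special = String.raw`&*@\^}\]\\):,$=!><–{[(%+#;\/~_?.`;

// One entry per group of the original regex.
const alternatives = [
    `[A-Za-z${special}]{1,3}`,
    `[a-z0-9${special}]{1,4}`,
    `[A-Z0-9${special}]{1,4}`,
    `[a-zA-Z0-9]{1,4}`,
    `[a-z]{1,5}`,
    `[A-Z]{1,5}`,
    `[0-9]{1,7}`,
    `[${special}]{1,5}`,
];

const weak = new RegExp(`^(?:${alternatives.join("|")})$`);

console.log(weak.test("aA#"));       // true  (weak)
console.log(weak.test("0123456"));   // true  (weak)
console.log(weak.test("aA1#"));      // false (not weak)
console.log(weak.test("012345678")); // false (not weak)
```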

You could take it further by creating a specific tag function for that, like this (this is an example from Chapter 10 of my new book; see my profile for details):

const createRegex = (template, ...values) => {
    // Build the source from the raw text segments and values
    const source = String.raw(template, ...values);
    // Check it's in /expr/flags form
    const match = /^\/(.+)\/([a-z]*)$/.exec(source);
    if (!match) {
        throw new Error("Invalid regular expression");
    }
    // Get the expression and flags, create
    const [, expr, flags = ""] = match;
    return new RegExp(expr, flags);
};

Then:

const chars = String.raw`[the chars]`;
const rex = createRegex`/^...${chars}...${chars}...$/`;

Upvotes: 1
