Gary

Reputation: 1017

Is there a way to test if my regex is vulnerable to catastrophic backtracking?

There are many questions on this topic, but I'm not sure if my regex is vulnerable or not. The following regex is what I use for email validation:

/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/.test(email)

Because I'm using a * in a few places, I suspect it could be.

I'd like to be able to test any number of regexes in my code for this problem.

I'm using Node.js, so this could shut down my server entirely given the single-threaded nature of the event loop.

Upvotes: 9

Views: 13777

Answers (2)

wp78de

Reputation: 18980

Good question. Yes, given the right input, it's vulnerable, and a runaway regex can block the entire Node process, making the service unavailable.

The basic example of a regex prone to catastrophic backtracking looks like

^(\w+)*$

a pattern which can be found multiple times in the given regex.
When a regex nests quantifiers like this and the input contains a long sequence that ultimately cannot be matched, the JS regex engine has to backtrack through every way of splitting that sequence, which burns CPU. The number of attempts grows exponentially with the input length, so a long enough input can keep the engine busy practically forever. (You can play with this on regex101 as well, using your regex, by adjusting the timeout value in the settings.)
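To see the effect from Node itself, here is a minimal sketch that times the regex from the question against inputs that can never match. The helper name `timeFailingMatch` and the chosen input lengths are assumptions made for illustration; actual timings depend on the machine and Node version, so none are quoted here.

```javascript
// The regex from the question, unchanged.
const emailRegex = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

// Hypothetical helper: time one test() call against `n` word characters
// followed by "!". The input has no "@", so the match must fail, but only
// after the engine has tried the many ways of splitting the run of "a"s.
function timeFailingMatch(n) {
  const input = 'a'.repeat(n) + '!';
  const start = process.hrtime.bigint();
  const matched = emailRegex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { n, matched, elapsedMs };
}

// Keep the lengths small: each extra character roughly doubles the work.
for (const n of [10, 14, 18, 22]) {
  const { matched, elapsedMs } = timeFailingMatch(n);
  console.log(`n=${n} matched=${matched} took ${elapsedMs.toFixed(2)} ms`);
}
```

If the elapsed time climbs steeply as `n` grows, the regex is backtracking heavily on that input shape.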

In general,

  • avoid monstrosities,
  • use HTML5 input validation whenever possible (in the front-end),
  • use established validation libraries for common input, e.g. validator.js,
  • try to detect potentially catastrophic exponential-time regular expressions ahead of time using tools like safe-regex & vuln-regex-detector (those offer pretty much what you had in mind),
  • and know your stuff 1, 2, 3 (I think the third link explains the issue best).
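As a rough idea of what ahead-of-time detection can look like, here is a crude sketch that flags a pattern when a repeated group itself contains an unbounded quantifier (the `(\w+)*` shape). It is not the implementation or API of safe-regex or vuln-regex-detector; `hasNestedQuantifier` and `isUnboundedQuantifierAt` are hypothetical helpers, and the check will miss other problem shapes (for example overlapping alternations) and can flag patterns that are harmless in practice.

```javascript
// True if the character at index i starts an unbounded quantifier: *, + or {n,}
function isUnboundedQuantifierAt(source, i) {
  const ch = source[i];
  return ch === '*' || ch === '+' ||
    (ch === '{' && /^\{\d+,\}/.test(source.slice(i)));
}

// Hypothetical heuristic: scan the regex source and report whether any group
// that contains an unbounded quantifier is itself repeated without a bound.
function hasNestedQuantifier(source) {
  const groups = [];   // one flag per open group: does it contain *, + or {n,}?
  let inClass = false; // currently inside a [...] character class
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') { i++; continue; }                       // skip escaped char
    if (inClass) { if (ch === ']') inClass = false; continue; }
    if (ch === '[') { inClass = true; continue; }
    if (ch === '(') { groups.push(false); continue; }
    if (ch === ')') {
      const innerUnbounded = groups.pop() === true;
      if (innerUnbounded && isUnboundedQuantifierAt(source, i + 1)) return true;
      // An unbounded quantifier inside a nested group also counts for the parent.
      if (innerUnbounded && groups.length > 0) groups[groups.length - 1] = true;
      continue;
    }
    if (isUnboundedQuantifierAt(source, i) && groups.length > 0) {
      groups[groups.length - 1] = true;
    }
  }
  return false;
}

const questionRegex = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
console.log(hasNestedQuantifier(questionRegex.source));
console.log(hasNestedQuantifier(/^\w+@\w+\.\w{2,3}$/.source));
```

A check like this could run over every regex literal in a codebase as part of a lint or test step; for real use, an established tool is the safer choice.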

More drastic approaches to mitigating catastrophic backtracking in Node.js are to wrap your regex work in a child process or vm context and set a meaningful timeout. (In a perfect world, JavaScript's RegExp constructor would have a timeout parameter; maybe someday.)

  1. The approach of using a child process is described here on SO.

  2. The VM context (sandboxing) approach is described here.
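The vm context approach can be sketched roughly as follows, using Node's built-in `vm` module and its `timeout` option. The helper name `testWithTimeout` and the 100 ms budget are assumptions for illustration, not taken from the linked posts.

```javascript
const vm = require('vm');

// Hypothetical helper: run regex.test(input) inside a vm context and give up
// once `timeoutMs` has elapsed.
function testWithTimeout(regex, input, timeoutMs) {
  const context = vm.createContext({ regex, input, matched: null });
  try {
    vm.runInContext('matched = regex.test(input);', context, { timeout: timeoutMs });
    return { timedOut: false, matched: context.matched };
  } catch (err) {
    // Node reports an exceeded vm timeout with this error code.
    if (err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return { timedOut: true, matched: null };
    }
    throw err;
  }
}

// A long run of word characters that cannot match forces heavy backtracking.
console.log(testWithTimeout(/^(\w+)*$/, 'a'.repeat(50) + '!', 100));
console.log(testWithTimeout(/^(\w+)*$/, 'abc', 100));
```

Note that the call is still synchronous: the event loop stays blocked for up to `timeoutMs`, so this bounds the damage rather than removing it. Moving the work to a child process avoids blocking the main process at the cost of extra overhead.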

Upvotes: 11

Gary

Reputation: 1017

const Joi = require('@hapi/joi');

// Build the schema once and reuse it for every call.
const schema = Joi.object({ email: Joi.string().email() });

// Returns true for a valid email, otherwise [false, <Joi error message>].
function isEmail(emailAsStr) {
    const result = schema.validate({ email: emailAsStr });

    if (!result.error) return true;
    return [false, result.error.details[0].message];
}

Here's another way to do it: use a library! :) I tested it against common catastrophic-backtracking cases. The answer to my original question is to use the npm library safe-regex, but I thought I'd share another example of how to solve this problem without a regex.

Upvotes: -1
