Reputation: 37
(\W*\d+)*(?=\W|$)
I'm having a problem with this regex, causing the system to crash when a certain term is searched.
I'm trying to find way to remove the catastrophic backtracking without changing its logic, but so far I've got nothing.
Term: 0000000000000000000000000000Abc
You can test it, and see the timeout, here:
Upvotes: 1
Views: 658
Reputation: 626747
You need to remove +
after \d
since otherwise, it would take regex engine too long to test all possible ways of matching a non-matching string before admitting the no match.
Use
(?:\W*\d)*(?!\w)
or
(?:\W*\d)*\b
The \W
matches any non-word char (so, no digit) and when d
is matched, \W*
will consume 0 or more non-word chars enabling a "linear" way of matching, when the subsequent subpattern does not match the same text as the preceding subpattern.
The (?!\w)
lookahead works a bit faster than an alternation group (?:\W|$)
as all it has to do it check if the next char is a word char, and if yes, the match is failed. Actually, in this situation, after \d
, (?!\w)
is equal to \b
, a word boundary, so it is the best construct to use in this position.
See the regex demo.
Upvotes: 2