Meddah Abdallah
Meddah Abdallah

Reputation: 706

Solve Catastrophic Backtracking in my regex

I am trying to detect names (French names) that can be in the following formats

CHARLES

CHARLES-ANTOINE

D'URENBURRY

CHARLES--ANTOINE

CHARLES ANTOINE

And for that I have the following regex

/^([A-Z]+(([-]{0,2}|[']|[ ])?[A-Z]+))+$/

Which works, but GitHub's code scanner shows this error

This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'AA'.

I understand the error, however, I'm not sure how to solve it.

My question is: how to solve it.

Thank you

Upvotes: 1

Views: 1343

Answers (1)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626748

You need to re-write the pattern like

^[A-Z]+(?:(?:-{1,2}|[' ])[A-Z]+)*$

See the regex demo. Now, each subsequent pattern will match different string, at different locations inside the string.

Details:

  • ^ - start of string
  • [A-Z]+ - one or more uppercase ASCII letters
  • (?:(?:-{1,2}|[' ])[A-Z]+)* - zero or more repetions of:
    • (?:-{1,2}|[' ]) - a non-capturing group matching one or two hyphens, a space or apostrophe
    • [A-Z]+ - one or more uppercase ASCII letters
  • $ - end of string.

Upvotes: 1

Related Questions