Katana

Reputation: 421

Regex to detect repetition

I need a regex to detect different forms of repetition, where the entire word consists of the same character or substring repeated. The word must also be at least 7 characters long (the length of the whole word, not of the repeated sequence).

Example: the following terms are not allowed

abcdefabcdef
brian
2222222
john12john12

The following terms are allowed

hellojohn 
2122222222
abcdefabc

Upvotes: 0

Views: 290

Answers (1)

concision

Reputation: 6387

Whether this answer works depends on the regular expression engine you are using, as it relies on back-references and a negative lookahead to effectively "invert" the repeated-substring match. You can play with the regex solution here: https://regex101.com/r/DjmuaI/1/

Short answer: ^(?!(.+?)\1+$).{7,}$

Long answer:

  • Start by matching a string made of one character sequence repeated exactly twice. This captures a sequence of characters with (.+) and then requires it again through the back-reference \1.
    ^(.+)\1$
  • Allow more than one repetition by adding + to the back-reference. This now matches any string that consists entirely of one substring repeated two or more times.
    ^(.+)\1+$
  • Look for strings that are NOT such repetitions. A negative lookahead (?!regex), support for which varies between regex engines, inverts the condition. The $ stays inside the lookahead so that only a repetition spanning the whole string is rejected, not merely a repeated prefix such as the "aa" in "aabcdefg".
    ^(?!(.+?)\1+$).+$
  • However, this matches any non-repetitive string, including strings shorter than 7 characters. Requiring 7 or more characters with {7,} gives the final pattern.
    ^(?!(.+?)\1+$).{7,}$
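
To check the pattern against the examples from the question, here is a minimal Python sketch. It assumes Python's re module (which supports both back-references and negative lookaheads) and keeps $ inside the lookahead so that only whole-string repetition is rejected. The helper name is_allowed is made up for illustration.

```python
import re

# Final pattern: reject strings that are entirely one substring repeated
# (lookahead anchored with $), then require at least 7 characters.
PATTERN = re.compile(r'^(?!(.+?)\1+$).{7,}$')


def is_allowed(term: str) -> bool:
    """Return True if the term is 7+ characters and not a pure repetition."""
    return PATTERN.match(term) is not None


# Examples taken from the question.
not_allowed = ["abcdefabcdef", "brian", "2222222", "john12john12"]
allowed = ["hellojohn", "2122222222", "abcdefabc"]

for term in not_allowed + allowed:
    print(f"{term!r}: allowed={is_allowed(term)}")
```

A term such as "aabcdefg" starts with a repeated character but is not a repetition as a whole, so it should come back as allowed.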

I will note that matching some strings may be slow, since the back-reference can force the engine to backtrack through many candidate substring lengths.
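
If performance becomes a concern, one alternative is to skip the regex entirely. The sketch below is a different technique from the pattern above: it relies on the fact that a string is a whole-string repetition exactly when it reappears inside its own doubling at an offset strictly between 0 and its length. The names is_repetition and is_allowed_no_regex, and the min_len parameter, are assumptions made for illustration.

```python
def is_repetition(s: str) -> bool:
    """True if s is one substring repeated two or more times."""
    # Search s inside s + s, starting at index 1. A hit before index len(s)
    # means s lines up with a shifted copy of itself, i.e. it is periodic
    # with a period that divides its length.
    return len(s) > 0 and (s + s).find(s, 1) < len(s)


def is_allowed_no_regex(term: str, min_len: int = 7) -> bool:
    """Same rule as the regex: long enough and not a pure repetition."""
    return len(term) >= min_len and not is_repetition(term)


for term in ["abcdefabcdef", "2222222", "hellojohn", "abcdefabc"]:
    print(f"{term!r}: allowed={is_allowed_no_regex(term)}")
```

This avoids regex backtracking, at the cost of no longer being a single pattern you can drop into a validator.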

Upvotes: 2
