Regex: match two occurrences of the same unknown substring

Question

I am trying to write a regex that matches two occurrences of the same unknown substring in a string.

For example: 11=11 should be valid since 11 occurs twice with an equal sign in the middle. ashg=hgasfa is also valid since hg occurs twice. ' or 1=1 ' should be valid since 1 occurs twice.

More specifically, in my project, I'm trying to match all strings where the two sides of the equal sign contain the same substring.

I remember using variables like $1 or $2 when modifying Apache route files or even writing Sublime Text snippets. How can I implement this functionality? Can I write a regex like .* $1=$1 .*? Is this even possible to accomplish with Regex?

Avinash Raj · Accepted Answer

You need to use a backreference (like \1) to refer to the characters captured by a particular group index.

>>> import re
>>> s = 'ashg=hgasfa'
>>> re.search(r'([^=]+)=\1', s)
<_sre.SRE_Match object; span=(2, 7), match='hg=hg'>
>>> re.search(r'([^=]+)=\1', 'ashg=hgasfa').group(1)
'hg'
>>> re.search(r'([^=]+)=\1', '11=11').group(1)
'11'
>>> re.search(r'([^=]+)=\1', ' or 1=1 ').group(1)
'1'

In the pattern above, ([^=]+) captures one or more characters other than = that sit immediately before the = sign. The backreference \1 then requires the same characters to appear again immediately after the = sign. If they do, re.search returns a match object; otherwise it returns None.
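The same pattern can be wrapped in a small helper for use in a project. This is only a sketch: the names SAME_ON_BOTH_SIDES and repeated_around_equals are made up for illustration and are not part of the answer above.

```python
import re

# Hypothetical constant name, chosen for this sketch.
# ([^=]+) captures a run of non-"=" characters, and \1 requires the
# same text to appear again immediately after the "=".
SAME_ON_BOTH_SIDES = re.compile(r'([^=]+)=\1')


def repeated_around_equals(text):
    """Return the substring repeated on both sides of '=', or None."""
    match = SAME_ON_BOTH_SIDES.search(text)
    return match.group(1) if match else None


# Try the examples from the question, plus one string with no repeat.
for sample in ['11=11', 'ashg=hgasfa', ' or 1=1 ', 'abc=def']:
    print(repr(sample), '->', repeated_around_equals(sample))
```

Because \1 must follow the = directly, this only finds a repeat that touches the equal sign on both sides, as in the three examples from the question. A string such as 'abc=def' has no such repeat, so the helper returns None.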
