sh4nkc

Reputation: 358

Regex: match two occurrences of the same unknown substring

I am trying to write a regex that matches two occurrences of the same unknown substring in a string.

For example: 11=11 should be valid since 11 occurs twice with an equal sign in the middle. ashg=hgasfa is also valid since hg occurs on both sides of the equal sign. ' or 1=1 ' should be valid since 1 occurs twice.

More specifically, in my project, I'm trying to match every string in which both sides of the equal sign contain the same substring.


I remember using variables like $1 or $2 when modifying Apache route files or even writing Sublime Text snippets. How can I implement this functionality? Can I write a regex like .* $1=$1 .*? Is this even possible to accomplish with Regex?

Upvotes: 2

Views: 818

Answers (1)

Avinash Raj

Reputation: 174696

You need to use a back-reference (like \1) to refer to the characters that were captured by the group with that index.

>>> import re
>>> s = 'ashg=hgasfa'
>>> re.search(r'([^=]+)=\1', s)
<_sre.SRE_Match object; span=(2, 7), match='hg=hg'>
>>> re.search(r'([^=]+)=\1', 'ashg=hgasfa').group(1)
'hg'
>>> re.search(r'([^=]+)=\1', '11=11').group(1)
'11'
>>> re.search(r'([^=]+)=\1', ' or 1=1 ').group(1)
'1'

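([^=]+) in the above captures one or more characters other than = that appear immediately before the = symbol, and \1 then requires the exact same characters to appear immediately after it. Because the engine backtracks and retries from later starting positions, it finds hg in ashg=hgasfa even though the first attempt captures ashg. If such a substring exists, re.search returns a match object; otherwise it returns None.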
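
As a minimal sketch of how this could be used in a project, the same pattern can be wrapped in a small helper that returns the repeated substring, or None when the two sides share nothing adjacent to the = sign. The names SAME_BOTH_SIDES and repeated_around_equals are made up for illustration, and 'abc=def' is an extra sample added to show the non-matching case.

```python
import re

# Same pattern as in the answer: group 1 is one or more non-'=' characters,
# and \1 requires that exact text again right after the '='.
SAME_BOTH_SIDES = re.compile(r'([^=]+)=\1')

def repeated_around_equals(text):
    """Return the substring repeated on both sides of '=', or None."""
    match = SAME_BOTH_SIDES.search(text)
    return match.group(1) if match else None

for sample in ['11=11', 'ashg=hgasfa', ' or 1=1 ', 'abc=def']:
    print(repr(sample), '->', repeated_around_equals(sample))
```

Checking the result against None avoids the AttributeError that calling .group(1) directly would raise when re.search finds nothing.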

Upvotes: 6
