Nagasai Tenekondala

Reputation: 58

How to capture a group only if it occurs twice in a line

import re

text = """
Tumble Trouble Twwixt Two Towns!
Was the Moon soon in the Sea
Or soon in the sky?
Nobody really knows YET.
"""


How can I make the match happen only when the occurrence is found twice in a line?

I need a regular expression that highlights two 'o's that appear beside each other only if there is another occurrence of two adjacent 'o's later in the same line.

Upvotes: 2

Views: 487

Answers (1)

The fourth bird

Reputation: 163362

You can match one or more word characters followed by a backreference to them, and wrap that in an outer group.

Because the groups are nested, the outer group is group 1 and the inner group holding the word characters is group 2.

A positive lookahead can then assert that the value of group 1 occurs again later in the same line.

((\w+)\2)(?=.*?\1)

The pattern matches:

  • ( Capture group 1
    • (\w+)\2 Match 1+ word chars in capture group 2 followed by a backreference to group 2 to match the same again
  • ) Close group 1
  • (?=.*?\1) Positive lookahead to assert that the captured value of group 1 occurs again later in the same line
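
To make the group numbering concrete, here is a minimal sketch that runs the same pattern over one line from the question with `re.finditer` and reports both groups. The names `pattern`, `line` and `matches` are only illustrative and not part of the original answer.

```python
import re

# Same pattern as above: group 1 is the whole doubled text,
# group 2 is the unit that gets repeated by the backreference \2.
pattern = re.compile(r"((\w+)\2)(?=.*?\1)")

line = "Was the Moon soon in the Sea"

# Collect (group 1, group 2, start offset) for every match in the line.
matches = [(m.group(1), m.group(2), m.start()) for m in pattern.finditer(line)]

for outer, inner, start in matches:
    print(f"group 1 = {outer!r}, group 2 = {inner!r}, start = {start}")
# prints: group 1 = 'oo', group 2 = 'o', start = 9
```

Only the "oo" in "Moon" is reported, because the "oo" in "soon" has no further "oo" after it in the line.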

See a regex demo and a Python demo.

Example

# Raw string for the replacement too, so \g<1> is not treated as an invalid string escape
print(re.compile(r"((\w+)\2)(?=.*?\1)").sub(r'{\g<1>}', text.rstrip()))

Output

Tumble Trouble Twwixt Two Towns!
Was the M{oo}n soon in the Sea
Or soon in the sky?
Nobody really knows YET.
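
The match stays within one line because `.` does not match a newline by default, so the lookahead cannot see past the end of the current line. As a rough sketch of that behaviour (the variable names `two_lines`, `line_bound` and `cross_line` are made up for illustration), compare the default with `re.DOTALL`, which lets `.` cross line breaks:

```python
import re

two_lines = "Was the Moon soon in the Sea\nOr soon in the sky?"
regex = r"((\w+)\2)(?=.*?\1)"

# Default: the lookahead stops at the newline, so only the first "oo" qualifies.
line_bound = re.sub(regex, r"{\g<1>}", two_lines)

# With re.DOTALL the lookahead can reach the next line, so the second "oo"
# in the first line also matches (it sees the "oo" in the second line).
cross_line = re.sub(regex, r"{\g<1>}", two_lines, flags=re.DOTALL)

print(line_bound)
print(cross_line)
```

If the goal is "twice in the same line", the default behaviour without `re.DOTALL` is the one to use.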

Upvotes: 4
