Reputation: 159
I'm looking to match strings that contain a pair of identical word characters, where the two characters are not necessarily beside each other. For example, "ahna32g" should match because of the two 'a' characters. What I currently have is "\w*(\w)\1+\w*",
which only succeeds if the matching characters are consecutive. I'm quite new to regular expressions, so I'd really appreciate a detailed explanation.
Upvotes: 1
Views: 743
Reputation: 887
My approach captures the (earliest) repeated character in groups 1 and 2. It also takes fewer steps than the already posted answer.
\w*?(\w)(?=\w*?(\1))\w*
\w*? // 0 or more word chars, lazily matched
(\w) // a word char (as group 1)
(?= // look ahead and assert a match of:
\w*? // 0 or more word chars, lazily matched
(\1) // group 1 (as group 2)
) // end of assertion
\w* // 0 or more word chars
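To try the lookahead pattern outside a regex tester, here is a minimal sketch using Python's `re` module. The helper name `first_repeated_char` is hypothetical and only there for illustration. It assumes the input is searched for the first run of word characters that contains a repeat.

```python
import re

# Pattern from the breakdown above: lazy prefix, one captured word char,
# then a lookahead asserting that the same char occurs again later.
PATTERN = re.compile(r"\w*?(\w)(?=\w*?(\1))\w*")


def first_repeated_char(text):
    """Return the captured repeated character (group 1), or None if nothing repeats.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ("ahna32g", "abcb", "abc"):
        print(sample, "->", first_repeated_char(sample))
```

Because the prefix `\w*?` is lazy, the engine tries the leftmost candidate character first, which is why group 1 ends up holding the earliest character that has a later duplicate.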
Upvotes: 0
Reputation: 627327
You need to insert \w* between (\w) and \1 to let the regex engine match any 0+ word chars in between the repeating chars:
\w*(\w)\w*\1+\w*
       ^^^
So, the regex will match
\w* - 0+ word chars
(\w) - capture a word char into Group 1
\w* - will match 0+ word chars
\1+ - one or more occurrences of the value inside Group 1
\w* - 0+ word chars.
Upvotes: 2
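As a quick sanity check of the pattern \w*(\w)\w*\1+\w* against the pattern from the question, here is a small sketch in Python. The function name `has_repeated_char` and the constant names are assumptions made for this example, not part of either answer.

```python
import re

# Pattern from the question: only matches when the repeated chars are adjacent.
ORIGINAL = re.compile(r"\w*(\w)\1+\w*")
# Pattern with the extra \w* between the capture and the backreference.
FIXED = re.compile(r"\w*(\w)\w*\1+\w*")


def has_repeated_char(text, pattern=FIXED):
    """Return True if `pattern` finds a match anywhere in `text`.

    Hypothetical helper for illustration only.
    """
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ("ahna32g", "aahn32g", "abc123"):
        print(sample, has_repeated_char(sample, ORIGINAL), has_repeated_char(sample))
```

The example string "ahna32g" is rejected by the original pattern and accepted by the adjusted one, while a string with no repeated word character, such as "abc123", is rejected by both.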