Java Regex Identifying Non-Consecutive Pairs of Characters/Numbers

Question

I'm looking to pick out pairs of identical word characters in a string that are not necessarily beside each other. For example, "ahna32g" should match because of the pair of 'a' characters. What I currently have is "\w*(\w)\1+\w*", which succeeds only if the matching characters are consecutive. I'm quite new to regular expressions, so I'd really appreciate a detailed explanation.

Wiktor Stribiżew · Accepted Answer

You need to insert \w* between (\w) and \1 to let the regex engine match any 0+ word chars in between the repeating chars:

\w*(\w)\w*\1+\w*
       ^^^


So, the regex matches:

\w* - 0+ word chars
(\w) - a single word char, captured into Group 1
\w* - 0+ word chars between the two occurrences
\1+ - one or more occurrences of the value captured in Group 1
\w* - 0+ word chars.
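
As a minimal sketch of how the pattern above might be used from Java: the class name RepeatedWordChar and the helper methods hasRepeatedChar and repeatedChar are made up for illustration and are not part of the original answer. Note that each regex backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RepeatedWordChar {
    // Regex \w*(\w)\w*\1+\w* written as a Java string literal (backslashes doubled).
    private static final Pattern REPEATED = Pattern.compile("\\w*(\\w)\\w*\\1+\\w*");

    // True when the whole input is made of word chars and some char occurs at least twice.
    static boolean hasRepeatedChar(String input) {
        return REPEATED.matcher(input).matches();
    }

    // Returns the char captured in Group 1, or null when the input does not match.
    static String repeatedChar(String input) {
        Matcher m = REPEATED.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedChar("ahna32g")); // prints true ('a' occurs twice)
        System.out.println(hasRepeatedChar("abc123"));  // prints false (no char repeats)
        System.out.println(repeatedChar("ahna32g"));    // prints a
    }
}
```

The sketch uses matches(), which requires the pattern to cover the entire string; that is why the leading and trailing \w* are needed. When searching inside a longer text with find(), the shorter (\w)\w*\1 would be enough to locate a repeated pair.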
