Evgeny

Reputation: 2161

Java regex - overlapping matches

In the following code:

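import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        List<String> allMatches = new ArrayList<String>();
        Matcher m = Pattern.compile("\\d+\\D+\\d+").matcher("2abc3abc4abc5");
        while (m.find()) {
            allMatches.add(m.group());
        }

        String[] res = allMatches.toArray(new String[0]);
        System.out.println(Arrays.toString(res));
    }
}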

The result is:

[2abc3, 4abc5]

I'd like it to be:

[2abc3, 3abc4, 4abc5]

How can it be achieved?

Upvotes: 27

Views: 8222

Answers (3)

Fluch

Reputation: 353

HamZa's solution works perfectly in Java. To find overlapping matches of a specific pattern in a text, all you have to do is:

String regex = "\\d+\\D+\\d+";

String updatedRegex = "(?=(" + regex + ")).";

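Here regex is the pattern you are looking for. To make its matches overlap, prefix it with "(?=(" and append "))." to it, then read each result from capturing group 1 (m.group(1)) rather than from the whole match, which is only the single character consumed by the trailing dot.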
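
Put together, the wrapping might look like the sketch below. The class name OverlapDemo and the helper findOverlapping are hypothetical names chosen for illustration, not something from the answers. It assumes the wrapped pattern should be read back through group 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class OverlapDemo {

    // Wraps the given pattern in a capturing lookahead and collects
    // group 1 at every position where the lookahead succeeds.
    static List<String> findOverlapping(String regex, CharSequence input) {
        Pattern wrapped = Pattern.compile("(?=(" + regex + ")).");
        Matcher m = wrapped.matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            // m.group() would only be the one character eaten by the dot.
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findOverlapping("\\d+\\D+\\d+", "2abc3abc4abc5"));
        // prints [2abc3, 3abc4, 4abc5]
    }
}
```

If the pattern passed in has capturing groups of its own, their numbers shift up by one, since the added group becomes group 1.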

Upvotes: 3

HamZa

Reputation: 14921

Not sure if this is possible in Java, but in PCRE you could do the following:
(?=(\d+\D+\d+)).

Explanation
The technique is to use a capturing group inside a lookahead, and then "eat" one character to move forward.

  • (?= : start of positive lookahead
    • ( : start of capturing group 1
      • \d+ : match a digit one or more times
      • \D+ : match a non-digit character one or more times
      • \d+ : match a digit one or more times
    • ) : end of group 1
  • ) : end of lookahead
  • . : match any single character; this is what "moves forward".


Thanks to Casimir et Hippolyte, it turns out this works in Java as well. You just need to double the backslashes in the string literal and read the first capturing group:
(?=(\\d+\\D+\\d+)).
Tested on www.regexplanet.com.

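
To see the "eat one character" step at work in Java, a small trace like the one below may help. It is only a sketch: the class LookaheadTrace and its trace method are hypothetical names. For each match it records where the match started, what the overall match consumed, and what group 1 captured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class LookaheadTrace {

    // Returns one line per match: start index, consumed text, group 1.
    static String trace(String input) {
        Matcher m = Pattern.compile("(?=(\\d+\\D+\\d+)).").matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("at ").append(m.start())
               .append(": consumed '").append(m.group())
               .append("', captured '").append(m.group(1))
               .append("'\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(trace("2abc3abc4abc5"));
        // at 0: consumed '2', captured '2abc3'
        // at 4: consumed '3', captured '3abc4'
        // at 8: consumed '4', captured '4abc5'
    }
}
```

The overall match is always a single character, so the next search resumes just one position later, while the lookahead has already captured the full text.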

Upvotes: 16

johnchen902

Reputation: 9601

Capture the trailing \d+ in a group and make the matcher start its next scan from the start of that group.

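List<String> allMatches = new ArrayList<String>();
// Group 1 captures the trailing number so the next scan can restart there.
Matcher m = Pattern.compile("\\d+\\D+(\\d+)").matcher("2abc3abc4abc5");
if (m.find()) {
    // find(int) resets the matcher and searches again from the given index.
    do {
        allMatches.add(m.group());
    } while (m.find(m.start(1)));
}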
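
The two approaches in this thread agree on the question's input, but they can differ once the numbers have more than one digit. The sketch below compares them on "12abc34abc56", an input made up for illustration; the class ApproachComparison and its method names are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class ApproachComparison {

    // Lookahead approach: tries every character position in turn.
    static List<String> viaLookahead(String input) {
        Matcher m = Pattern.compile("(?=(\\d+\\D+\\d+)).").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    // Restart approach: resumes at the start of the trailing number.
    static List<String> viaRestart(String input) {
        Matcher m = Pattern.compile("\\d+\\D+(\\d+)").matcher(input);
        List<String> out = new ArrayList<>();
        boolean found = m.find();
        while (found) {
            out.add(m.group());
            found = m.find(m.start(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "12abc34abc56";
        System.out.println(viaLookahead(input));
        // prints [12abc34, 2abc34, 34abc56, 4abc56]
        System.out.println(viaRestart(input));
        // prints [12abc34, 34abc56]
    }
}
```

The lookahead version also reports matches that begin in the middle of a number, so which one fits depends on whether those partial-number matches are wanted.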

Upvotes: 18
