Christian Richter

Reputation: 358

Finding all possible occurrences using a regular expression

I have a String which looks like this, for example:

i installed apache2 and when i transfered the httpd.conf to the new structure

I am trying to find all matches to the regular expression i.*structure

My code looks like this:

List<String> matches = new ArrayList<>();
Pattern p = Pattern.compile("i.*structure", Pattern.MULTILINE|Pattern.DOTALL);
Matcher m = p.matcher(text);
while (m.find()) {
  matches.add(m.group());
}
System.out.println(matches);

The last line outputs the following:

[i installed apache2 and when i transfered the httpd.conf to the new structure]

What I expected was:

[i installed apache2 and when i transfered the httpd.conf to the new structure, 
 installed apache2 and when i transfered the httpd.conf to the new structure, 
 i transfered the httpd.conf to the new structure]

Can anyone explain to me what I did wrong?

Thanks & regards

Upvotes: 0

Views: 54

Answers (1)

hwnd

Reputation: 70750

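You can use a positive lookahead with a capturing group to capture the overlapping matches. Each result is then read from group 1 (m.group(1)), because the overall match of a lookahead is empty.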

Pattern p = Pattern.compile("(?s)(?=(i.*?structure))");

A lookahead does not "consume" any characters in the string.

After looking ahead, the regular expression engine is back at the same position on the string from where it started looking. From there, it can start matching again ...
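To make this concrete, here is a minimal sketch that applies the lookahead pattern to the string from the question. The class name OverlappingMatches and the helper findOverlapping are only illustrative names, not part of any library. Since the lookahead itself matches an empty string, the text is taken from capture group 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingMatches {

    // Collects overlapping matches by reading capture group 1 of a
    // zero-width lookahead. m.group() would be the empty string here.
    static List<String> findOverlapping(String text) {
        List<String> matches = new ArrayList<>();
        Pattern p = Pattern.compile("(?s)(?=(i.*?structure))");
        Matcher m = p.matcher(text);
        // After each empty match, find() resumes one character further on,
        // so every starting position gets a chance to match.
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "i installed apache2 and when i transfered "
                + "the httpd.conf to the new structure";
        // Print one match per line.
        for (String match : findOverlapping(text)) {
            System.out.println(match);
        }
    }
}
```

For the question's input this should print three lines, one for each "i" in the text: the full string, the part starting at "installed", and the part starting at "i transfered".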

Note: * is a greedy quantifier, meaning it will match as much as it can while still allowing the remainder of the regular expression to match. Use *? instead for a non-greedy match, meaning "zero or more, preferably as few as possible".
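The difference between the two quantifiers only shows up when the end word occurs more than once, which is not the case in the question's sample string. The sketch below therefore uses a made-up input with two occurrences of "structure". The class name GreedyVsLazy, the helper overlapping and the sample text are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {

    // Wraps the given expression in a capturing lookahead and returns
    // the content of group 1 for every position where it matches.
    static List<String> overlapping(String inner, String text) {
        List<String> matches = new ArrayList<>();
        Pattern p = Pattern.compile("(?s)(?=(" + inner + "))");
        Matcher m = p.matcher(text);
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical input containing "structure" twice.
        String text = "i a structure i b structure";

        // Greedy: each match runs to the LAST "structure".
        System.out.println(overlapping("i.*structure", text));
        // prints [i a structure i b structure, i b structure]

        // Lazy: each match stops at the NEAREST "structure".
        System.out.println(overlapping("i.*?structure", text));
        // prints [i a structure, i b structure]
    }
}
```

Which variant is right depends on whether each match should end at the nearest or the farthest end word.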

Ideone Demo

Upvotes: 2
