mrmuggles

Reputation: 2141

Regex to find strings contained between separators

In this text:

text text text [[st: aaa bbb ccc ddd eee fff]] text text
text text [[st: ggg hhh iii jjj kkk
lll mmm nnn]] text text text

I'm trying to extract the text that starts after [[st: and ends before ]].

My program should output:

aaa bbb ccc ddd eee fff (first match)
ggg hhh iii jjj kkk \n lll mmm nnn (second match)

But I can only get it to match from the first [[st: to the last ]], so there is just one match instead of two. Any ideas?

Here's my code:

package com.s2i.egc.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegex {

    /**
     * @param args
     */
    public static void main(String[] args) {

        String bodyText = "text text text [[st: aaa bbb ccc ddd eee fff]] text text text text [[st: ggg hhh iii jjj kkk\n lll mmm nnn]] text text text";

        String currentPattern = "\\[\\[st:.*\\]\\]";

        Pattern myPattern = Pattern.compile(currentPattern, Pattern.DOTALL);

        Matcher myMatcher = myPattern.matcher(bodyText);

        int i = 1;

        while (myMatcher.find()) {
          // "[[st:" is 5 characters long and "]]" is 2, so strip exactly those delimiters
          String match = bodyText.substring(myMatcher.start() + 5, myMatcher.end() - 2);
          System.out.println(match + " (match #" + i + ")");
          i++;
        }                           


    }

}

Upvotes: 1

Views: 2057

Answers (3)

Joel Hoffman

Reputation: 346

Just for completeness' sake: without the non-greedy star, you could match the opening [[st:, followed by any number of non-] characters, optionally followed by any number of single ] characters that are each followed by one or more non-] characters, and finally the closing ]]:

\[\[st:([^\]]*(?:\][^\]]+)*)\]\]
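In Java source the backslashes have to be doubled. The following is a minimal sketch of how this pattern might be used, not a definitive implementation: the class name UnrolledMatchDemo and the helper extract are made up for illustration, and trimming the captured text is an assumption about the desired output. Because the negated character class [^\]] already matches newlines, Pattern.DOTALL should not be needed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnrolledMatchDemo {

    // The pattern from this answer, escaped for a Java string literal.
    // Group 1 captures everything between "[[st:" and the closing "]]".
    private static final Pattern ST_BLOCK =
            Pattern.compile("\\[\\[st:([^\\]]*(?:\\][^\\]]+)*)\\]\\]");

    // Hypothetical helper: collects the trimmed content of every [[st: ...]] block.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = ST_BLOCK.matcher(text);
        while (m.find()) {
            results.add(m.group(1).trim());
        }
        return results;
    }

    public static void main(String[] args) {
        String bodyText = "text [[st: aaa bbb]] text [[st: ggg\n hhh]] text";
        int i = 1;
        for (String match : extract(bodyText)) {
            System.out.println(match + " (match #" + i + ")");
            i++;
        }
    }
}
```

Unlike the reluctant version, this form also tolerates a single ] inside the block, as long as it is followed by at least one other character before the closing ]].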

Upvotes: 1

Simon Nickerson

Reputation: 43159

The quantifier * (0 or more) is greedy by default, so .* runs through to the last ]] in the input instead of stopping at the first one.

Try changing it to a reluctant (non-greedy) quantifier:

String currentPattern = "\\[\\[st:.*?\\]\\]";
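For reference, here is a small self-contained sketch of the reluctant pattern in use. It is only one way to write it: the class name ReluctantMatchDemo and the helper extract are hypothetical, and the capture group with surrounding \s* is an addition of mine so that the content can be read with group(1) rather than computed from start() and end() offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantMatchDemo {

    // Reluctant .*? stops at the first "]]" that follows each "[[st:".
    // DOTALL lets . match the newline inside the second block.
    // The \s* on either side keeps surrounding whitespace out of group 1.
    private static final Pattern ST_BLOCK =
            Pattern.compile("\\[\\[st:\\s*(.*?)\\s*\\]\\]", Pattern.DOTALL);

    // Hypothetical helper: returns the captured content of every [[st: ...]] block.
    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = ST_BLOCK.matcher(text);
        while (m.find()) {
            results.add(m.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        String bodyText = "text text text [[st: aaa bbb ccc ddd eee fff]] text text "
                + "text text [[st: ggg hhh iii jjj kkk\n lll mmm nnn]] text text text";
        int i = 1;
        for (String match : extract(bodyText)) {
            System.out.println(match + " (match #" + i + ")");
            i++;
        }
    }
}
```

With the greedy .* the same loop would run only once, because the single match would span both blocks.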

Upvotes: 3

Dror

Reputation: 7305

You should use lazy mode for the asterisk. Instead of

.*

use .*? so that the full pattern becomes:

"\\[\\[st:.*?\\]\\]"

Upvotes: 2
