user2462377

Reputation: 11

Java regular expression issue with a capture group

public void test(){
    String source = "hello<a>goodA</a>boys can <a href=\"www.baidu.com\">goodB</a>\"\n"
                + "                + \"this can help";
    Pattern pattern = Pattern.compile("<a[\\s+.*?>|>](.*?)</a>");
    Matcher matcher = pattern.matcher(source);
    while (matcher.find()){
        System.out.println("laozhu:" + matcher.group(1));
    }
}

Output:

laozhu:goodA
laozhu:href="www.baidu.com">goodB

Why is the second match not laozhu:goodB?

Upvotes: 1

Views: 59

Answers (2)

nanz

Reputation: 11

    Pattern pattern = Pattern.compile("<a.*?>(.*?)</a>");
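For context on why the original pattern misbehaves: square brackets define a character class, so `[\\s+.*?>|>]` matches exactly one character (whitespace, or a literal `+`, `.`, `*`, `?`, `>` or `|`) rather than "attributes or `>`". In the second link that single character is the space after `<a`, so the group then captures everything up to `</a>`. Below is a minimal sketch comparing the two patterns; the class name `AnchorTextDemo` and the helper `captures` are made up for illustration and are not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorTextDemo {
    // Hypothetical helper: collects group(1) of every match of regex in input.
    static List<String> captures(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "hello<a>goodA</a>boys can <a href=\"www.baidu.com\">goodB</a>";
        // Original pattern: the character class consumes only the space after "<a".
        System.out.println(captures("<a[\\s+.*?>|>](.*?)</a>", source));
        // Pattern from this answer: ".*?" lazily consumes the attributes up to the first ">".
        System.out.println(captures("<a.*?>(.*?)</a>", source));
    }
}
```

The first call reproduces the output from the question, and the second yields goodA and goodB. Note that `<a.*?>` would also accept tags that merely start with `a`, such as `<abbr>`, which may or may not matter for your input.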

Upvotes: 1

Nikolas

Reputation: 44496

Try this regex:

<a(?: .*?)?>(\w+)<\/a>

So your Pattern should look like this:

Pattern pattern = Pattern.compile("<a(?: .*?)?>(\\w+)<\\/a>");

Group 1 then captures goodA and goodB.

For a detailed description, see Regex101.
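To illustrate what this pattern accepts and rejects, here is a small sketch. The class name `WordOnlyAnchorDemo`, the helper `captures`, and the extra sample strings are assumptions made for the demonstration. In Java the `/` needs no escaping, so the sketch writes `</a>` directly, which behaves the same as `<\\/a>`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordOnlyAnchorDemo {
    // "(?: .*?)?" optionally matches a space followed by attributes;
    // "(\\w+)" captures link text made only of word characters.
    static final Pattern ANCHOR = Pattern.compile("<a(?: .*?)?>(\\w+)</a>");

    // Hypothetical helper: collects group(1) of every match in input.
    static List<String> captures(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = ANCHOR.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The string from the question: both link texts are captured.
        System.out.println(captures("hello<a>goodA</a>boys can <a href=\"www.baidu.com\">goodB</a>"));
        // Link text containing a space is not matched, because \w+ excludes spaces.
        System.out.println(captures("<a>good boys</a>"));
        // A tag that only starts with "a" is not matched, because a space or ">" must follow "<a".
        System.out.println(captures("<abbr>x</abbr>"));
    }
}
```

If the link text can contain spaces or punctuation, replacing `(\\w+)` with `(.*?)` is one option. For arbitrary HTML, a real parser is usually more robust than a regex.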

Upvotes: 1
