Rachan R K

Reputation: 231

Java regular expression to extract content within square or round brackets

I am trying to extract the string inside square or round brackets. The input may contain either square or round brackets.

I am using the below regex.

Pattern p = Pattern.compile("\\[(.*?)\\]|\\((.*?)\\)");

The output string also includes the enclosing brackets. The code is below.

String example = "Example_(xxxxx)_AND_(yyyyy)_2019-01-28";
Pattern p = Pattern.compile("\\[(.*?)\\]|\\((.*?)\\)");
Matcher m = p.matcher(example);
while(m.find()) {
    System.out.println(m.group(1));
}

The above pattern gives this output:

(xxxxx)

(yyyyy)

Expected output is

xxxxx

yyyyy

Upvotes: 1

Views: 1212

Answers (3)

Thuong Vo

Reputation: 101

Here is a full example that splits the input on everything outside the round brackets.

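import java.util.regex.Pattern;

public class ExtractContentExample {

    // Matches everything outside the round brackets: the text up to the first "(",
    // the text between a ")" and the next "(", and the text after the last ")".
    // Square brackets are not handled by this pattern.
    private static final Pattern DELIMITERS =
            Pattern.compile("^[^(]*\\(|\\)[^()]+\\(|\\)[^)]*$");

    public static void main(String[] args) {
        String input = "(I )Comparison_(am )_AND_(so )_2019-01-28Comparison_(handsome!)";
        // Splitting on the delimiters leaves only the bracketed contents.
        for (String part : DELIMITERS.split(input)) {
            // A match at the very start yields an empty leading element; skip it.
            if (!part.isEmpty()) {
                System.out.println(part);
            }
        }
    }
}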

I hope this will help

Upvotes: 3

Pushpesh Kumar Rajwanshi

Reputation: 18357

You can write a regex that needs no alternation and has at most one group, so there is no ambiguity about which group holds the value. Better still, use positive lookarounds so that the match itself is the intended value:

(?<=[([])[^()[\]]*(?=[)\]])

Explanation:

  • (?<=[([]) - Positive lookbehind ensuring the preceding character is either ( or [
  • [^()[\]]* - Matches any run of characters other than parentheses and square brackets
  • (?=[)\]]) - Positive lookahead ensuring the next character is either ) or ]

Sample Java code:

String s = "Example_(xxxxx)_AND_(yyyyy)_2019-01-28";
Pattern p = Pattern.compile("(?<=[(\\[])[^()\\[\\]]*(?=[)\\]])");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}

Prints:

xxxxx
yyyyy

Alternatively, as mentioned above, you can use this regex without lookarounds and read the content from group 1. Because it has no alternation, there is only one group:

[([]([^()[\]]*)[)\]]

Sample Java code for the regex without lookarounds, reading the content from group(1):

String s = "Example_(xxxxx)_AND_(yyyyy)_2019-01-28";
Pattern p = Pattern.compile("[(\\[]([^()\\[\\]]*)[)\\]]");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}

Prints:

xxxxx
yyyyy
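
For comparison, the alternation pattern from the question can also be made to work. With an alternation, each branch has its own group, and only the group of the branch that matched is non-null, so group(1) alone is not enough for round brackets. The following is only a sketch of that idea; the class name AlternationGroups and the helper extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the question or any answer.
class AlternationGroups {

    // The question's pattern: group 1 captures [...] content, group 2 captures (...) content.
    static final Pattern BRACKETS = Pattern.compile("\\[(.*?)\\]|\\((.*?)\\)");

    static List<String> extract(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = BRACKETS.matcher(input);
        while (m.find()) {
            // The group belonging to the branch that did not match is null,
            // so fall back to group 2 when group 1 is null.
            out.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [xxxxx, yyyyy]
        System.out.println(extract("Example_(xxxxx)_AND_[yyyyy]_2019-01-28"));
    }
}
```

The single-group regex above avoids this null check, which is why it is simpler to use.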

Upvotes: 6

Kartik

Reputation: 7917

You can use lookahead and lookbehind:

(?<=\[).*?(?=\])|(?<=\().*?(?=\))

or you can merge the alternatives into a single lookbehind and a single lookahead (note that this shorter version also accepts mismatched pairs such as [abc)):

(?<=\[|\().*?(?=\]|\))

Explanation

(?<=\[|\() - preceded by [ or (
.*? - any number of characters, non-greedy
(?=\]|\)) - followed by ] or )
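
In Java source, each backslash in these patterns has to be doubled. The sketch below shows one way both patterns might be used; the class name LookaroundDemo, the constants PAIRED and MIXED, and the helper extract are illustrative names only. It also shows that the two patterns differ when the brackets do not match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the answer above.
class LookaroundDemo {

    // First pattern: [ pairs only with ], and ( pairs only with ).
    static final Pattern PAIRED =
            Pattern.compile("(?<=\\[).*?(?=\\])|(?<=\\().*?(?=\\))");

    // Second pattern: any opening bracket may be closed by either closing bracket.
    static final Pattern MIXED =
            Pattern.compile("(?<=\\[|\\().*?(?=\\]|\\))");

    static List<String> extract(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            // The lookarounds consume nothing, so the whole match is the content.
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Example_(xxxxx)_AND_[yyyyy]_2019-01-28";
        System.out.println(extract(PAIRED, s));        // prints [xxxxx, yyyyy]
        System.out.println(extract(MIXED, s));         // prints [xxxxx, yyyyy]
        System.out.println(extract(PAIRED, "[abc)"));  // prints []
        System.out.println(extract(MIXED, "[abc)"));   // prints [abc]
    }
}
```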

Upvotes: 2
