northpole

Reputation: 10346

How do I match text within parentheses using regex?

I have the following pattern:

(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) 

I want the final output to be:

COMPANY ASP, INC.

Currently I have the following code, and it keeps returning the original pattern (I assume because the whole group falls between the first '(' and the last ')'):

Pattern p = Pattern.compile("((.*))",Pattern.DOTALL);
Matcher matcher = p.matcher(eName);
while(matcher.find())
{
    System.out.println("found match:"+matcher.group(1));
}

I am struggling to get the results I need and would appreciate any help. I am not worried about concatenating the results after I get each group; I just need to get each group.

Upvotes: 10

Views: 27382

Answers (5)

Chetan Laddha

Reputation: 1005

Tested with Java 8:

/**
 * The pattern below returns the string inside parentheses.
 *
 * Breakdown of the regular expression \(+\s*([^\s)]+)\s*\)+
 *
 * \(+      : matches the literal character "(" one or more times
 * \s*      : matches zero or more whitespace characters
 * (        : start of the capturing group
 * [^\s)]+  : matches one or more characters that are neither whitespace nor ")"
 * )        : end of the capturing group
 * \s*      : matches zero or more whitespace characters
 * \)+      : matches the literal character ")" one or more times
 */


private static Pattern REGULAR_EXPRESSION = Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");
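
For illustration, here is a minimal sketch of how that pattern could be applied to the string from the question. The class name ParenTokenDemo and the helper extract are hypothetical names chosen for this example; only the pattern itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern is taken from the answer.
public class ParenTokenDemo {
    private static final Pattern REGULAR_EXPRESSION =
            Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");

    // Collects capturing group 1 of every match, in order of appearance.
    static List<String> extract(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = REGULAR_EXPRESSION.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
        // prints: COMPANY ASP, INC.
        System.out.println(String.join(" ", extract(eName)));
    }
}
```

Because the character class excludes whitespace, this pattern only captures a single token per pair of parentheses. An input such as "(ACME CORP)" produces no match at all, so it fits the question's data but not parenthesized text containing spaces.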

Upvotes: 1

Brent Writes Code

Reputation: 19623

If your strings are always going to look like that, you could get away with just using a couple calls to replaceAll instead. This seems to work for me:

String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
String eNameEdited = eName.replaceAll("\\).*?\\(", " ").replaceAll("\\(|\\)", "");
System.out.println(eNameEdited);

Probably not the most efficient thing in the world, but fairly simple.

Upvotes: 0

ptomli

Reputation: 11818

Your .* quantifier is 'greedy', so yes, it's grabbing everything between the first and last available parentheses. As chaos says, tersely :), .*? is a non-greedy quantifier, so it will grab as little as possible while still maintaining the match.

And you need to escape the parentheses within the regex, otherwise they become another group. That's assuming there are literal parentheses in your string. I suspect what you referred to in the initial question as your pattern is in fact your string.

Query: are "COMPANY", "ASP," and "INC." required?

If you must have values for them, then you want to use + instead of *. The + is one-or-more and the * is zero-or-more, so a * would also match the literal string "()".

eg: "\\((.+?)\\)"
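
To make the greedy versus non-greedy behaviour and the * versus + difference concrete, here is a small sketch. The class name GreedyVsLazyDemo and the helper groups are made up for this example, and the parentheses are escaped as the answer recommends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the quantifiers discussed above.
public class GreedyVsLazyDemo {
    // Returns capturing group 1 of every match of regex in input.
    static List<String> groups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";

        // Greedy: a single match running from the first "(" to the last ")".
        System.out.println(groups("\\((.*)\\)", eName));

        // Non-greedy: three matches, one per pair of parentheses.
        System.out.println(groups("\\((.*?)\\)", eName));

        // On the bare string "()", * yields one match with an empty group,
        // while + yields no match at all.
        System.out.println(groups("\\((.*?)\\)", "()").size()); // prints: 1
        System.out.println(groups("\\((.+?)\\)", "()").size()); // prints: 0
    }
}
```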

Upvotes: 6

Oliver

Reputation: 2205

Not a direct answer to your question, but I recommend you use RegxTester to get to the answer to this and any future questions quickly. It allows you to test in real time.

Upvotes: 0

chaos

Reputation: 124365

Pattern p = Pattern.compile("\\((.*?)\\)",Pattern.DOTALL);
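
As a sketch of how this pattern slots into the loop from the question, the example below collects each group and joins them with spaces. The class name ExtractParenthesized and the method joinGroups are hypothetical, and the joining step is an addition for illustration, since the question only asked for the individual groups.

```java
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from this answer.
public class ExtractParenthesized {
    private static final Pattern P =
            Pattern.compile("\\((.*?)\\)", Pattern.DOTALL);

    // Joins the contents of every "(...)" group with single spaces.
    static String joinGroups(String eName) {
        StringJoiner joined = new StringJoiner(" ");
        Matcher matcher = P.matcher(eName);
        while (matcher.find()) {
            joined.add(matcher.group(1));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
        // prints: COMPANY ASP, INC.
        System.out.println(joinGroups(eName));
    }
}
```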

Upvotes: 29
