Reputation: 10346
I have the following pattern:
(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)
I want the final output to be:
COMPANY ASP, INC.
Currently I have the following code, and it keeps returning the original pattern (I assume because the whole group falls between the first '(' and the last ')'):
Pattern p = Pattern.compile("((.*))",Pattern.DOTALL);
Matcher matcher = p.matcher(eName);
while(matcher.find())
{
System.out.println("found match:"+matcher.group(1));
}
I am struggling to get the results I need and would appreciate any help. I am not worried about concatenating the results after I get each group; I just need to get each group.
Upvotes: 10
Views: 27382
Reputation: 1005
Tested with Java 8:
/**
 * The pattern below returns the string inside parentheses.
 * Description of the regular expression: \(+\s*([^\s)]+)\s*\)+
 * \(+ : matches the literal character "(" one or more times.
 * \s* : matches zero or more whitespace characters.
 * ( : start of the capturing group.
 * [^\s)]+ : matches one or more characters that are neither whitespace nor ")".
 * ) : end of the capturing group.
 * \s* : matches zero or more whitespace characters.
 * \)+ : matches the literal character ")" one or more times.
 */
private static Pattern REGULAR_EXPRESSION = Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");
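The answer gives only the pattern, so here is a minimal sketch of how it might be applied to the string from the question. The class name `ParenTokens` and the helper `extract` are hypothetical names introduced for illustration; only the regular expression itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParenTokens {
    // Same expression as in the answer above.
    private static final Pattern REGULAR_EXPRESSION =
            Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");

    // Hypothetical helper: collects capture group 1 of every match, in order.
    static List<String> extract(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = REGULAR_EXPRESSION.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
        // Join the captured tokens with single spaces.
        System.out.println(String.join(" ", extract(eName)));
    }
}
```

Because the character class excludes whitespace, a parenthesized value that contains a space, such as "(ACME CORP)", would not be matched by this expression.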
Upvotes: 1
Reputation: 19623
If your strings are always going to look like that, you could get away with just a couple of calls to replaceAll instead. This seems to work for me:
String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
String eNameEdited = eName.replaceAll("\\).*?\\("," ").replaceAll("\\(|\\)","");
System.out.println(eNameEdited);
Probably not the most efficient thing in the world, but fairly simple.
Upvotes: 0
Reputation: 11818
Your .* quantifier is 'greedy', so yes, it's grabbing everything between the first and last available parentheses. As chaos says, tersely :), .*? is a non-greedy quantifier, so it will grab as little as possible while still producing a match.
And you need to escape the parentheses within the regex; otherwise each pair becomes another group. That's assuming there are literal parentheses in your string. I suspect that what you referred to in the initial question as your pattern is in fact your string.
Query: are "COMPANY", "ASP," and "INC." required?
If you must have values for them, then you want to use + instead of *. The + is one-or-more and the * is zero-or-more, so a * would also match the literal string "()".
e.g.: "\\((.+?)\\)"
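To make the greedy versus non-greedy difference concrete, here is a small sketch that runs both forms against the string from the question, with the literal parentheses escaped as this answer advises. The class name `GreedyVsReluctant` and the helper `groups` are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns capture group 1 of each successive match.
    static List<String> groups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";

        // Greedy: a single match that runs from the first "(" to the last ")".
        System.out.println(groups("\\((.*)\\)", eName));

        // Non-greedy with +: one match per parenthesized token.
        System.out.println(groups("\\((.+?)\\)", eName));
    }
}
```

With the greedy form, the single captured group is everything between the outermost parentheses; with the non-greedy form, the helper returns the three tokens COMPANY, ASP, and INC. (the latter two with their trailing punctuation) as separate entries.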
Upvotes: 6
Reputation: 2205
Not a direct answer to your question, but I recommend you use RegxTester to get to the answer, and to any future question, quickly. It allows you to test in real time.
Upvotes: 0