Reputation: 7617
I'm trying to work out how to split a string into groups. I don't think the split(regex)
method will suffice on its own.
I have String complexStatement = "(this && that)||(these&&those)||(me&&you)";
and I would like an array out of it with this kind of form:
"(this && that)","(these&&those)","(me&&you)"
If I had "(5+3)*(2+5)+(9)"
then I'd like to have "(5+3)","(2+5)","(9)".
(Bonus points if you can somehow keep the join information, e.g. *, +, ||.)
Is this possible for an arbitrary string input? I'm playing with a StringTokenizer but I haven't quite gotten to grips with it yet.
Upvotes: 5
Views: 3276
Reputation: 88378
If you want to capture the groups defined only by parentheses at the outermost level, you are outside of the world of regular expressions and will need to parse the input. StinePike's approach is good; another one (in messy pseudocode) is as follows:
insides = []
outsides = []
nesting_level = 0
string = ""
while not done_reading_input():
    char = get_next_char()
    if char == '(':
        if nesting_level == 0:
            # entering an outermost group: save the text seen so far
            outsides.add(string)
            string = ""
        else:
            string += char
        nesting_level += 1
    elif char == ')':
        nesting_level -= 1
        if nesting_level == 0:
            # leaving an outermost group: save its contents
            insides.add(string)
            string = ""
        else:
            string += char
    else:
        string += char
If the very first character in your input is a '(', you'll get an extra empty string in your outsides array, but you can fix that without much trouble.
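For reference, the same outermost-level scan could look roughly like this in Java. This is only a sketch: the class name OuterGroups and its parse and flush methods are made up for illustration, it assumes balanced parentheses (and throws otherwise), it keeps the parentheses in each group as in the question's expected output, and it drops empty strings, which sidesteps the leading '(' case.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; the class and method names are assumptions, not from the answer.
final class OuterGroups {
    final List<String> insides = new ArrayList<>();  // outermost groups, parentheses kept
    final List<String> outsides = new ArrayList<>(); // text between groups, e.g. "||"

    static OuterGroups parse(String input) {
        OuterGroups result = new OuterGroups();
        StringBuilder current = new StringBuilder();
        int nestingLevel = 0;
        for (char c : input.toCharArray()) {
            if (c == '(') {
                if (nestingLevel == 0) {
                    flush(current, result.outsides); // text before this group
                }
                nestingLevel++;
                current.append(c);
            } else if (c == ')') {
                if (nestingLevel == 0) {
                    throw new IllegalArgumentException("Unbalanced ')' in: " + input);
                }
                current.append(c);
                nestingLevel--;
                if (nestingLevel == 0) {
                    flush(current, result.insides); // a complete outermost group
                }
            } else {
                current.append(c);
            }
        }
        if (nestingLevel != 0) {
            throw new IllegalArgumentException("Unbalanced '(' in: " + input);
        }
        flush(current, result.outsides); // trailing text after the last group
        return result;
    }

    // Moves the buffered text into target (skipping empty strings) and clears the buffer.
    private static void flush(StringBuilder buffer, List<String> target) {
        if (buffer.length() > 0) {
            target.add(buffer.toString());
            buffer.setLength(0);
        }
    }
}
```

With the question's second input, OuterGroups.parse("(5+3)*(2+5)+(9)") should leave insides holding (5+3), (2+5), (9) and outsides holding * and +.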
If you are interested in nested parentheses then you will not be producing just two arrays as output; you will need a tree.
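To make the tree idea concrete, here is one possible shape for it in Java, using a stack of currently open groups. It is a rough sketch under assumptions: ParenTree, Node and their fields are hypothetical names, the root node stands for the whole input, and unbalanced input is rejected with an exception.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only; these names are assumptions, not part of any library.
final class ParenTree {
    static final class Node {
        final int start;   // index of the opening '(' (-1 for the root)
        String content;    // text between the parentheses, set when the group closes
        final List<Node> children = new ArrayList<>();

        Node(int start) {
            this.start = start;
        }
    }

    static Node parse(String input) {
        Node root = new Node(-1);
        root.content = input;
        Deque<Node> open = new ArrayDeque<>(); // groups that have been opened but not closed
        open.push(root);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                Node child = new Node(i);
                open.peek().children.add(child); // attach to the innermost open group
                open.push(child);
            } else if (c == ')') {
                if (open.size() == 1) {
                    throw new IllegalArgumentException("Unbalanced ')' at index " + i);
                }
                Node closed = open.pop();
                closed.content = input.substring(closed.start + 1, i);
            }
        }
        if (open.size() != 1) {
            throw new IllegalArgumentException("Unclosed '(' at index " + open.peek().start);
        }
        return root;
    }
}
```

For "((a+b)*c)+(d)" the root would get two children, the first with content (a+b)*c and a single child of its own with content a+b, the second with content d.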
Upvotes: 1
Reputation: 7116
You can use the code below:
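import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "(this && that)\",\"(these&&those)\",\"(me&&you)";
// Match an opening parenthesis, one or more non-')' characters, then a closing parenthesis
Pattern pattern = Pattern.compile("\\(([^\\)]+)\\)");
Matcher m = pattern.matcher(str);
while (m.find()) {
    // group(0) is the whole match, parentheses included
    System.out.println(m.group(0));
}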
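The pattern \\(([^\\)]+)\\) matches everything inside one pair of parentheses, which looks like what you want: m.group(0) includes the parentheses and m.group(1) does not. Note that it stops at the first ), so it will not handle nested parentheses.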
Edit:
To capture the content between ) and ( instead (the join operators), just replace the regular expression with \\)([^\\(]+)\\( and read m.group(1).
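If you want the groups and the joiners in one pass, something along these lines might work. It is a sketch only: RegexGroups and tokens are hypothetical names, it reuses the pattern from above, and so it assumes the parentheses are never nested. Anything between two matches is treated as a joiner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper; the class and method names are assumptions.
final class RegexGroups {
    // Same pattern as above: one flat (...) group with no nested parentheses.
    private static final Pattern GROUP = Pattern.compile("\\(([^\\)]+)\\)");

    // Returns groups and the text between them, in their original order.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        int last = 0; // end of the previous match
        while (m.find()) {
            if (m.start() > last) {
                out.add(input.substring(last, m.start())); // joiner such as "||" or "*"
            }
            out.add(m.group()); // the group itself, parentheses included
            last = m.end();
        }
        if (last < input.length()) {
            out.add(input.substring(last)); // trailing text after the last group
        }
        return out;
    }
}
```

For "(5+3)*(2+5)+(9)" this should give the list (5+3), *, (2+5), +, (9).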
Upvotes: 5
Reputation: 54672
I think you'd better implement the parsing yourself instead of depending on any ready-made methods.
Here is my suggestion... I am assuming the format of the input will always be like the following:
(value1+operator+value2)+operator+(value3+operator+value4)+........
(here the operator can be different, and + is just showing concatenation).
If the above assumption is true then you can do the following.
N.B. it's just pseudocode with primitive thinking.
Upvotes: 2