AncientSwordRage

Reputation: 7617

How can I split a string into groups?

I'm trying to work out how to split a string into groups. I don't think the split(regex) method will suffice on its own.

I have String complexStatement = "(this && that)||(these&&those)||(me&&you)"; and I would like an array out with this kind of form:

"(this && that)","(these&&those)","(me&&you)"

If I had "(5+3)*(2+5)+(9)" then I'd like to have "(5+3)","(2+5)","(9)".
(bonus points if you can somehow keep the join information, e.g. *,+,||)

Is this possible for an arbitrary string input? I'm playing with a StringTokenizer but I haven't quite gotten to grips with it yet.

Upvotes: 5

Views: 3276

Answers (3)

Ray Toal

Reputation: 88378

If you want to capture the groups defined only by parentheses at the outermost level, you are outside of the world of regular expressions and will need to parse the input. StinePike's approach is good; another one (in messy pseudocode) is as follows:

insides = []
outsides = []
nesting_level = 0
string = ""
while not done_reading_input():
    char = get_next_char()
    if char == '(':
        if nesting_level == 0:
            # a top-level group starts: flush the text seen since the last group
            outsides.add(string)
            string = ""
        nesting_level += 1
    string += char  # parentheses are kept, so a group reads "(5+3)"
    if char == ')':
        nesting_level -= 1
        if nesting_level == 0:
            # the top-level group just closed
            insides.add(string)
            string = ""

If the very first character in your input is a '(', you'll get an extra string in your outsides array, but you can fix that without much trouble.
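
For what it's worth, the nesting-level idea above could be sketched in Java roughly as follows. This is only a sketch under assumptions: the class name ParenSplitter and its two list fields are made up for illustration, parentheses are kept as part of each group, empty outside strings are skipped (which sidesteps the extra leading string), and unbalanced input is rejected with an exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name and layout are illustrative only.
class ParenSplitter {
    final List<String> insides = new ArrayList<>();   // top-level "(...)" groups
    final List<String> outsides = new ArrayList<>();  // non-empty text between groups

    static ParenSplitter split(String input) {
        ParenSplitter result = new ParenSplitter();
        StringBuilder current = new StringBuilder();
        int nestingLevel = 0;
        for (char c : input.toCharArray()) {
            if (c == '(') {
                if (nestingLevel == 0) {
                    // A top-level group starts: store the text collected so far
                    if (current.length() > 0) {
                        result.outsides.add(current.toString());
                    }
                    current.setLength(0);
                }
                nestingLevel++;
                current.append(c);
            } else if (c == ')') {
                if (nestingLevel == 0) {
                    throw new IllegalArgumentException("unbalanced ')'");
                }
                current.append(c);
                nestingLevel--;
                if (nestingLevel == 0) {
                    // The top-level group just closed
                    result.insides.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (nestingLevel != 0) {
            throw new IllegalArgumentException("unbalanced '('");
        }
        if (current.length() > 0) {
            result.outsides.add(current.toString()); // trailing text after the last group
        }
        return result;
    }

    public static void main(String[] args) {
        ParenSplitter r = split("(5+3)*(2+5)+(9)");
        System.out.println(r.insides);   // prints [(5+3), (2+5), (9)]
        System.out.println(r.outsides);  // prints [*, +]
    }
}
```

Because only the nesting level is tracked, an input such as "((a)+b)*(c)" yields the whole "((a)+b)" as one group; the inner "(a)" is not split out.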

If you are interested in nested parentheses then you will not be producing just two arrays as output; you will need a tree.

Upvotes: 1

Sazzadur Rahaman

Reputation: 7116

You can use the code below:

    String str = "(this && that)\",\"(these&&those)\",\"(me&&you)";
    Pattern pattern = Pattern.compile("\\(([^\\)]+)\\)");
    Matcher m = pattern.matcher(str);
    while (m.find()){
        System.out.println(m.group(0));
    }

The pattern \\(([^\\)]+)\\) matches everything between an opening parenthesis and the next closing one, which looks like what you want.

Edit:

To capture the content between ) and ( instead, replace the regular expression with \\)([^\\(]+)\\( and read m.group(1), which holds the text between the two parentheses.
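
In case it helps, here is a rough sketch that gets both the groups and the joining operators from a single pass with Matcher, using the text between one match's end() and the next match's start(). The class name RegexGroups and its method names are made up for this example, and the pattern is a slight variation ([^()] instead of [^\\)]) so that it never spans a nested opening parenthesis. It assumes flat, non-nested input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class RegexGroups {
    // "(" followed by one or more non-parenthesis characters, then ")"
    private static final Pattern GROUP = Pattern.compile("\\([^()]+\\)");

    // Returns every parenthesized group, parentheses included.
    static List<String> groups(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Returns the text found between consecutive groups, e.g. "||" or "*".
    static List<String> joiners(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        int previousEnd = -1;
        while (m.find()) {
            if (previousEnd >= 0) {
                result.add(input.substring(previousEnd, m.start()));
            }
            previousEnd = m.end();
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "(5+3)*(2+5)+(9)";
        System.out.println(groups(s));   // prints [(5+3), (2+5), (9)]
        System.out.println(joiners(s));  // prints [*, +]
    }
}
```

With nested input such as "((a)+b)" this only finds the innermost "(a)", so it is not a substitute for real parsing.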

Upvotes: 5

stinepike

Reputation: 54672

I think you had better implement the parsing yourself instead of depending on any ready-made methods.

Here is my suggestion. I am assuming the format of the input will always be like the following:

(value1+operator+value2)+operator+(value3+operator+value4)+........

(Here the operators can differ from one another, and + just shows concatenation.)

If the above assumption is true, then you can do the following.

  1. Use a stack.
  2. While reading the original string, push all the characters onto the stack.
  3. Now pop them one by one from the stack, using the following logic:
     a. If you get ), start adding characters to a string.
     b. If you get (, add it to the string; you now have one token. Add the token to the array.
     c. After getting (, skip until the next ).

N.B. This is just pseudocode based on a rough first idea.
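
The steps above might look something like this in Java. Treat it as a sketch only: the class name StackTokenizer is invented for the example, and it relies on the flat format assumed above (no nested parentheses). Since popping walks the string from the end, each token is built backwards and has to be reversed, and tokens are inserted at the front of the list to restore the original order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper; the name is illustrative only.
class StackTokenizer {
    static List<String> tokens(String input) {
        // Step 2: push every character onto the stack
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : input.toCharArray()) {
            stack.push(c);
        }

        // Step 3: pop characters, which visits the input from right to left
        LinkedList<String> result = new LinkedList<>();
        StringBuilder token = new StringBuilder();
        boolean insideGroup = false;
        while (!stack.isEmpty()) {
            char c = stack.pop();
            if (c == ')') {
                insideGroup = true;            // 3a: start collecting
            }
            if (insideGroup) {
                token.append(c);
            }
            if (c == '(' && insideGroup) {
                // 3b: token complete; it was collected backwards, so reverse it
                result.addFirst(token.reverse().toString());
                token.setLength(0);
                insideGroup = false;           // 3c: skip until the next ')'
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("(5+3)*(2+5)+(9)")); // prints [(5+3), (2+5), (9)]
    }
}
```

A plain left-to-right loop over the string would give the same tokens without the reversal; the stack is kept here only to mirror the steps as described.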

Upvotes: 2
