user5659664

How to get parentheses inside parentheses

I'm trying to keep a pair of parentheses that sits inside a string which is itself surrounded by parentheses.

The string in question is: test (blue,(hmmm) derp)

The desired output into an array is: test and (blue,(hmmm) derp).

The current output is: (blue,, (hmmm) and derp).

My current code is:

var input = Regex
  .Split(line, @"(\([^()]*\))")
  .Where(s => !string.IsNullOrEmpty(s))
  .ToList();

How can I extract the text inside the outer parentheses (keeping them) and keep the inner parentheses as part of the same string in the array?

EDIT:

To clarify my question, I want to ignore the inner parentheses and only split on the outer parentheses.

herpdediderp (orange,(hmm)) some other crap (red,hmm)

Should become:

herpdediderp, orange,(hmm), some other crap and red,hmm.

The code works for everything except the nested parentheses: (orange,(hmm)) should become orange,(hmm).

Upvotes: 2

Views: 745

Answers (4)

SamWhan

Reputation: 8332

Lots o' guessing going on here, from me and the others. You could try

[^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\)

It handles one level of nested parentheses (it could be extended, though).

Here at regexstorm.

Visual illustration at regex101.

If this piques your interest, I'll add an explanation ;)
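
For anyone who wants to try the pattern locally rather than on regexstorm, here is a minimal sketch of how it could be applied with Regex.Matches. The class and method names (OneLevelDemo, Extract) are made up for illustration, and trimming the surrounding whitespace is my assumption about what the asker wants.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OneLevelDemo
{
    // The pattern from this answer, unchanged: either a run of non-"(" text,
    // or a parenthesized group that may contain one level of nested groups.
    const string Pattern = @"[^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\)";

    // Hypothetical helper: returns every match, trimmed, skipping blanks.
    public static string[] Extract(string line) =>
        Regex.Matches(line, Pattern)
             .Cast<Match>()
             .Select(m => m.Value.Trim())
             .Where(s => s.Length > 0)
             .ToArray();

    public static void Main()
    {
        // Prints "test" and "(blue,(hmmm) derp)" on separate lines.
        foreach (var part in Extract("test (blue,(hmmm) derp)"))
            Console.WriteLine(part);
    }
}
```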

Edit:

If you need to use split, put the selection into a group, like

([^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\))

and filter out empty strings. See example here at ideone.
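
As a rough sketch of the Split variant: Regex.Split includes the text of capturing groups in its result, so wrapping the whole pattern in a group keeps the matched pieces, and the empty strings between adjacent matches are filtered out afterwards. The names SplitDemo and SplitKeepingGroups are hypothetical, and the Trim call is my addition.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Same pattern as above, wrapped in one capturing group so that
    // Regex.Split returns the matched text instead of discarding it.
    const string Pattern = @"([^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\))";

    // Hypothetical helper: split, trim, and drop the empty leftovers.
    public static string[] SplitKeepingGroups(string line) =>
        Regex.Split(line, Pattern)
             .Select(s => s.Trim())
             .Where(s => s.Length > 0)
             .ToArray();

    public static void Main()
    {
        var parts = SplitKeepingGroups(
            "herpdediderp (orange,(hmm)) some other crap (red,hmm)");
        // Prints: herpdediderp | (orange,(hmm)) | some other crap | (red,hmm)
        Console.WriteLine(string.Join(" | ", parts));
    }
}
```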

Edit 2:

Not quite sure what behaviour you want with multiple levels of parentheses, but I assume this could do it for you:

([^(]+|\([^(]*(?:\([^(]*(?:\([^(]*\)[^(]*)*\)[^(]*)*\))
                        ^^^^^^^^^^^^^^^^^^^ added

For each level of recursion you want, you "just" add another inner level. So this is for two levels of recursion ;)

See it here at ideone.
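
If the nesting depth is not known in advance, a different technique from the one in this answer is available in .NET: balancing groups, which let the regex engine count open and close parentheses. The sketch below is my own variation, not the pattern above; the group name depth and the class name BalancedDemo are arbitrary choices for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BalancedDemo
{
    // Either plain text without parentheses, or a balanced group:
    //   \( (?<depth>)   pushes a marker for every nested "("
    //   \) (?<-depth>)  pops one marker for every nested ")"
    //   (?(depth)(?!))  fails if any marker is left, i.e. unbalanced
    const string Pattern =
        @"[^()]+|\((?>[^()]+|\((?<depth>)|\)(?<-depth>))*(?(depth)(?!))\)";

    public static string[] Extract(string line) =>
        Regex.Matches(line, Pattern)
             .Cast<Match>()
             .Select(m => m.Value.Trim())
             .Where(s => s.Length > 0)
             .ToArray();

    public static void Main()
    {
        // Three levels deep; prints: a | (b,(c,(d)) e) | f
        Console.WriteLine(string.Join(" | ", Extract("a (b,(c,(d)) e) f")));
    }
}
```

The upside is that no extra inner level has to be added by hand; the downside is that the syntax is specific to the .NET regex engine.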

Upvotes: 0

NetMage

Reputation: 26917

I think if you think about the problem backwards, it becomes a bit easier: don't split on what you don't want, extract what you do want.

The only slightly tricky part is matching nested parentheses; I assume you will only go one level deep.

The first example:

var s1 = "(blue, (hmmm) derp)";
var input = Regex.Matches(s1, @"\((?:\(.+?\)|[^()]+)+\)")
    .Cast<Match>()
    .Select(m => Regex.Matches(m.Value, @"\(\w+\)|\w+")
        .Cast<Match>()
        .Select(m2 => m2.Value)
        .ToArray())
    .ToArray();
// input is string[][] { string[] { "blue", "(hmmm)", "derp" } }

The second example uses an extension method:

public static string TrimOutside(this string src, string openDelims, string closeDelims) {
    if (!String.IsNullOrEmpty(src)) {
        var openIndex = openDelims.IndexOf(src[0]);
        if (openIndex >= 0 && src.EndsWith(closeDelims.Substring(openIndex, 1)))
            src = src.Substring(1, src.Length - 2);
    }
    return src;
}

The code/patterns are different because the two examples are being handled differently:

var s2 = "herpdediderp (orange,(hmm)) some other crap (red,hmm)";
var input2 = Regex.Matches(s2, @"\w(?:\w| )+\w|\((?:[^(]+|\([^)]+\))+\)")
    .Cast<Match>()
    .Select(m => m.Value.TrimOutside("(", ")"))
    .ToArray();
// input2 is string[] { "herpdediderp", "orange,(hmm)", "some other crap", "red,hmm" }

Upvotes: 0

John Wu

Reputation: 52240

Hopefully someone will come up with a regex. Here's my code answer.

static class ExtensionMethods
{
    static public IEnumerable<string> GetStuffInsideParentheses(this IEnumerable<char> input)
    {
        int levels = 0;
        var current = new Queue<char>();
        foreach (char c in input)
        {
            if (levels == 0)
            {
                if (c == '(') levels++;
                continue;
            }
            if (c == ')')
            {
                levels--; 
                if (levels == 0)
                { 
                    yield return new string(current.ToArray()); 
                    current.Clear();
                    continue;
                }
            }
            if (c == '(')
            {
                levels++; 
            }
            current.Enqueue(c); 
        }
    }
}

Test program:

public class Program
{
    public static void Main()
    {

        var input = new []
        {
            "(blue,(hmmm) derp)", 
            "herpdediderp (orange,(hmm)) some other crap (red,hmm)"
        };

        foreach ( var s in input )
        {
            var output = s.GetStuffInsideParentheses();
            foreach ( var o in output )
            {
                Console.WriteLine(o);
            }
            Console.WriteLine();
        }
    }
}

Output:

blue,(hmmm) derp

orange,(hmm)
red,hmm

Code on DotNetFiddle
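
The method above returns only the text inside the outer parentheses. The question's second example also wants the text between the groups, so here is a possible variation on the same depth-counting idea that keeps both. This is a sketch, not part of the original answer: the names OuterSplitDemo and SplitOnOuterParentheses are hypothetical, and trimming each piece is an assumption based on the desired output in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class OuterSplitDemo
{
    // Hypothetical helper: splits on the outermost parentheses only,
    // returning the text outside them as well as the text inside them.
    public static List<string> SplitOnOuterParentheses(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        // Stores the collected text (if any) as one part and starts over.
        void Flush()
        {
            string text = current.ToString().Trim();
            if (text.Length > 0) parts.Add(text);
            current.Clear();
        }

        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;
                if (depth == 1) { Flush(); continue; } // outermost "(": drop it
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
                if (depth == 0) { Flush(); continue; } // outermost ")": drop it
            }
            current.Append(c); // everything else, including inner parentheses
        }
        Flush(); // trailing text after the last group
        return parts;
    }

    public static void Main()
    {
        var parts = SplitOnOuterParentheses(
            "herpdediderp (orange,(hmm)) some other crap (red,hmm)");
        // Prints: herpdediderp | orange,(hmm) | some other crap | red,hmm
        Console.WriteLine(string.Join(" | ", parts));
    }
}
```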

Upvotes: 1

Olivier Jacot-Descombes

Reputation: 112324

You can use the method

public string Trim(params char[] trimChars)

Like this

string trimmedLine = line.Trim('(', ')'); // Specify undesired leading and trailing chars.

// Specify separator characters for the split (here comma and space):
string[] input = trimmedLine.Split(new[]{',', ' '}, StringSplitOptions.RemoveEmptyEntries);
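
A small sketch of what these two calls produce, and of the one case to watch out for: Trim removes every leading and trailing occurrence of the given characters, not just one. The class and method names (TrimDemo, SplitTrimmed) are invented for the example.

```csharp
using System;

public static class TrimDemo
{
    // Hypothetical helper wrapping the two calls shown above.
    public static string[] SplitTrimmed(string line)
    {
        string trimmedLine = line.Trim('(', ')');
        return trimmedLine.Split(new[] { ',', ' ' },
                                 StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Prints: blue | (hmmm) | derp
        Console.WriteLine(string.Join(" | ", SplitTrimmed("(blue,(hmmm) derp)")));

        // Trim strips both closing parentheses here and prints: orange,(hmm
        Console.WriteLine("(orange,(hmm))".Trim('(', ')'));
    }
}
```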

If the line can start or end with two consecutive parentheses, Trim would remove both of them, so simply use good old if-statements to strip exactly one:

if (line.StartsWith("(")) {
    line = line.Substring(1);
}
if (line.EndsWith(")")) {
    line = line.Substring(0, line.Length - 1);
}
string[] input = line.Split(new[]{',', ' '}, StringSplitOptions.RemoveEmptyEntries);

Upvotes: 4
