Reputation:
I'm trying to keep a parenthesized group intact within a string that is itself surrounded by parentheses.
The string in question is: test (blue,(hmmm) derp)
The desired output, as an array, is: "test" and "(blue,(hmmm) derp)".
The current output is: "(blue,", "(hmmm)" and "derp)".
My current code is this:
var input = Regex
    .Split(line, @"(\([^()]*\))")
    .Where(s => !string.IsNullOrEmpty(s))
    .ToList();
How can I extract the text inside the outer parentheses (keeping them) and keep the inner parenthesized text as one string in the array?
EDIT:
To clarify my question, I want to ignore the inner parentheses and split only on the outer parentheses.
herpdediderp (orange,(hmm)) some other crap (red,hmm)
Should become:
herpdediderp
orange,(hmm)
some other crap
red,hmm
The code works for everything except the double parentheses: it fails to turn (orange,(hmm)) into orange,(hmm).
Upvotes: 2
Views: 745
Reputation: 8332
Lots o' guessing going on here, from me and the others. You could try
[^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\)
It handles one level of nested parentheses (it could be extended, though).
Visual illustration at regex101.
If this piques your interest, I'll add an explanation ;)
Edit:
If you need to use split, put the selection into a group, like
([^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\))
and filter out empty strings. See example here at ideone.
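For what it's worth, here is a minimal sketch of how the one-level pattern above could be applied to the string from the question. It uses Regex.Matches rather than Split, and the class name and the Trim call are my own assumptions, not something taken from the linked examples.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OneLevelDemo
{
    public static void Main()
    {
        // Either a run of text without '(' or a parenthesized group
        // that may contain one level of nested parentheses.
        const string pattern = @"[^(]+|\([^(]*(?:\([^(]*\)[^(]*)*\)";
        var line = "test (blue,(hmmm) derp)";

        var parts = Regex.Matches(line, pattern)
            .Cast<Match>()
            .Select(m => m.Value.Trim())   // assumption: surrounding spaces are not wanted
            .Where(s => s.Length > 0)      // drop whitespace-only pieces
            .ToList();

        foreach (var part in parts)
            Console.WriteLine(part);
        // Prints:
        // test
        // (blue,(hmmm) derp)
    }
}
```

This keeps the outer parentheses, which is what the first part of the question asks for. If they should go, they can be stripped from each match afterwards.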
Edit 2:
Not quite sure what behaviour you want with multiple levels of parentheses, but I assume this could do it for you:
([^(]+|\([^(]*(?:\([^(]*(?:\([^(]*\)[^(]*)*\)[^(]*)*\))
                        ^^^^^^^^^^^^^^^^^^^ added
For each level of recursion you want, you "just" add another inner level. So this is for two levels of recursion ;)
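If the nesting depth is not known in advance, .NET's balancing groups might be an alternative to adding one inner level per depth. The following is only a sketch under the assumption that the parentheses in the input are balanced. The group name open, the class name and the sample string are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BalancedDemo
{
    // Either text without parentheses, or one outer group with any nesting depth.
    // (?<open>\()   pushes a capture for every inner '('.
    // (?<-open>\))  pops one capture for every inner ')'.
    // (?(open)(?!)) fails the match if an inner '(' was left unclosed.
    private const string Pattern =
        @"[^()]+|\((?>[^()]+|(?<open>\()|(?<-open>\)))*(?(open)(?!))\)";

    public static void Main()
    {
        var line = "a (b,(c,(d)) e) f";   // hypothetical input, nested two levels deep

        var parts = Regex.Matches(line, Pattern)
            .Cast<Match>()
            .Select(m => m.Value.Trim())
            .Where(s => s.Length > 0)
            .ToList();

        foreach (var part in parts)
            Console.WriteLine(part);
        // Prints:
        // a
        // (b,(c,(d)) e)
        // f
    }
}
```

Balancing groups are specific to the .NET regex engine, so this pattern will not carry over to regex101's other flavours as-is.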
Upvotes: 0
Reputation: 26917
I think if you think about the problem backwards, it becomes a bit easier - don't split on what you don't want, extract what you do want.
The only slightly tricky part is matching nested parentheses; I assume you will only go one level deep.
The first example:
var s1 = "(blue, (hmmm) derp)";
var input = Regex.Matches(s1, @"\((?:\(.+?\)|[^()]+)+\)")
    .Cast<Match>()
    .Select(m => Regex.Matches(m.Value, @"\(\w+\)|\w+")
        .Cast<Match>()
        .Select(m2 => m2.Value)
        .ToArray())
    .ToArray();
// input is string[][] { string[] { "blue", "(hmmm)", "derp" } }
The second example uses an extension method:
public static string TrimOutside(this string src, string openDelims, string closeDelims) {
    if (!String.IsNullOrEmpty(src)) {
        var openIndex = openDelims.IndexOf(src[0]);
        if (openIndex >= 0 && src.EndsWith(closeDelims.Substring(openIndex, 1)))
            src = src.Substring(1, src.Length - 2);
    }
    return src;
}
The code and patterns differ because the two examples are handled differently:
var s2 = "herpdediderp (orange,(hmm)) some other crap (red,hmm)";
var input3 = Regex.Matches(s2, @"\w(?:\w| )+\w|\((?:[^(]+|\([^)]+\))+\)")
    .Cast<Match>()
    .Select(m => m.Value.TrimOutside("(", ")"))
    .ToArray();
// input3 is string[] { "herpdediderp", "orange,(hmm)", "some other crap", "red,hmm" }
Upvotes: 0
Reputation: 52240
Hopefully someone will come up with a regex. Here's my code answer.
static class ExtensionMethods
{
    static public IEnumerable<string> GetStuffInsideParentheses(this IEnumerable<char> input)
    {
        int levels = 0;
        var current = new Queue<char>();
        foreach (char c in input)
        {
            if (levels == 0)
            {
                if (c == '(') levels++;
                continue;
            }
            if (c == ')')
            {
                levels--;
                if (levels == 0)
                {
                    yield return new string(current.ToArray());
                    current.Clear();
                    continue;
                }
            }
            if (c == '(')
            {
                levels++;
            }
            current.Enqueue(c);
        }
    }
}
Test program:
public class Program
{
    public static void Main()
    {
        var input = new []
        {
            "(blue,(hmmm) derp)",
            "herpdediderp (orange,(hmm)) some other crap (red,hmm)"
        };
        foreach ( var s in input )
        {
            var output = s.GetStuffInsideParentheses();
            foreach ( var o in output )
            {
                Console.WriteLine(o);
            }
            Console.WriteLine();
        }
    }
}
Output:
blue,(hmmm) derp
orange,(hmm)
red,hmm
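The method above only returns what is inside the parentheses. Going by the question's edit, the text between the groups is wanted too, so here is a rough sketch of the same depth-counting idea that also yields the outside text. The name SplitOnOuterParentheses is hypothetical, and trimming the outside text is my assumption about the desired output.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ParenSplitter
{
    // Hypothetical helper: yields text outside parentheses (trimmed) and the
    // contents of each outermost group, with inner parentheses kept as-is.
    public static IEnumerable<string> SplitOnOuterParentheses(string input)
    {
        var current = new StringBuilder();
        int depth = 0;
        foreach (char c in input)
        {
            if (c == '(')
            {
                if (depth == 0)
                {
                    // Entering an outer group: flush any text collected outside.
                    var outside = current.ToString().Trim();
                    if (outside.Length > 0) yield return outside;
                    current.Clear();
                }
                else
                {
                    current.Append(c);   // inner '(' is part of the group
                }
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
                if (depth == 0)
                {
                    // Leaving an outer group: emit its contents.
                    yield return current.ToString();
                    current.Clear();
                }
                else
                {
                    current.Append(c);   // inner ')' is part of the group
                }
            }
            else
            {
                current.Append(c);
            }
        }
        // Trailing text after the last group (or an unclosed group's contents).
        var tail = current.ToString().Trim();
        if (tail.Length > 0) yield return tail;
    }

    public static void Main()
    {
        var line = "herpdediderp (orange,(hmm)) some other crap (red,hmm)";
        foreach (var part in SplitOnOuterParentheses(line))
            Console.WriteLine(part);
        // Prints:
        // herpdediderp
        // orange,(hmm)
        // some other crap
        // red,hmm
    }
}
```

It assumes balanced parentheses. With an unclosed group, whatever was collected so far is simply emitted at the end.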
Upvotes: 1
Reputation: 112324
You can use the method
public string Trim(params char[] trimChars)
Like this
string trimmedLine = line.Trim('(', ')'); // Specify undesired leading and trailing chars.
// Specify separator characters for the split (here comma and space):
string[] input = trimmedLine.Split(new[]{',', ' '}, StringSplitOptions.RemoveEmptyEntries);
Trim removes every leading and trailing occurrence of the given characters, so if the line can start or end with two consecutive parentheses, simply use good old if-statements:
if (line.StartsWith("(")) {
    line = line.Substring(1);
}
if (line.EndsWith(")")) {
    line = line.Substring(0, line.Length - 1);
}
string[] input = line.Split(new[]{',', ' '}, StringSplitOptions.RemoveEmptyEntries);
Upvotes: 4