Aviv Cohn

Reputation: 17173

How can I split a string on separators, and keep the separators as elements in the resulting array?

This has been asked on this site twice before. The first turned out to be a different question: the OP wanted the separators kept at the end or beginning of the resulting elements. The second received an answer built on a regex that I couldn't work out how to extend.

I'm looking for a way to split a string on separators so that the separators are included as elements of the resulting array. Please explain how I can choose custom separators for this function (most likely via a regex).

Upvotes: 1

Views: 85

Answers (2)

bytecode77

Reputation: 14820

You could still use string.Split and insert the separators back between the resulting pieces. See this extension method:

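using System;
using System.Collections.Generic;

public static class StringExtensions
{
    // Splits s on any of the separators and keeps each matched separator as its own element.
    // Assumes separators contains at least one non-empty string.
    public static string[] SplitAndKeep(this string s, string[] separators)
    {
        string[] parts = s.Split(separators, StringSplitOptions.None);
        List<string> result = new List<string>(parts.Length * 2 - 1);
        int pos = 0; // index in s where the current part begins

        for (int i = 0; i < parts.Length; i++)
        {
            result.Add(parts[i]);
            pos += parts[i].Length;
            if (i == parts.Length - 1) break;

            // Split does not report which separator it matched, so look it up at pos.
            string separator = Array.Find(separators,
                sep => !string.IsNullOrEmpty(sep) && string.CompareOrdinal(s, pos, sep, 0, sep.Length) == 0);
            result.Add(separator);
            pos += separator.Length;
        }
        return result.ToArray();
    }
}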

Upvotes: 0

Sergey Kalinichenko

Reputation: 726479

You can use a regex that matches the empty string at positions identified by a lookahead or a lookbehind.

For example, let's say that you wish to split your string on any of these characters:

'(' ' ' ',' ')' '*' '/' '+' '-'

Then you can construct an expression that lists them in a lookahead or a lookbehind clause (i.e. (?=...) or (?<=...)), and split using Regex.Split, like this:

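using System;
using System.Text.RegularExpressions;

string input = "12 / 34+(45-56)*678";
// Zero-width match just before or just after any separator character.
string pattern = "(?=[( ,)*/+-])|(?<=[( ,)*/+-])";

string[] substrings = Regex.Split(input, pattern);
foreach (string match in substrings) {
    Console.WriteLine("'{0}'", match);
}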

Running this produces the following output:

'12'
' '
'/'
' '
'34'
'+'
'('
'45'
'-'
'56'
')'
'*'
'678'

Note that every separator (space, parenthesis, or operator) appears as its own element in the result.
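To choose custom separators, the same lookahead/lookbehind pattern can be built from a list at run time. The following is only a sketch: the class name SeparatorSplitter and the method SplitKeepingSeparators are hypothetical names introduced for illustration, and it assumes the separators are literal strings that do not overlap one another. Regex.Escape is used so that characters such as '(' or '*' are matched literally.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SeparatorSplitter
{
    // Hypothetical helper (not from the answer above): splits input at the
    // boundaries of any of the given literal separators, so that each
    // separator becomes its own element of the result.
    public static string[] SplitKeepingSeparators(string input, params string[] separators)
    {
        if (separators == null || separators.Length == 0)
            throw new ArgumentException("At least one separator is required.", nameof(separators));

        // Escape each separator so regex metacharacters are treated literally.
        string alternatives = string.Join("|", separators.Select(Regex.Escape));

        // Zero-width match just before or just after any separator.
        string pattern = "(?=" + alternatives + ")|(?<=" + alternatives + ")";
        return Regex.Split(input, pattern);
    }
}
```

For instance, SeparatorSplitter.SplitKeepingSeparators("a=>b", "=>") should return "a", "=>", "b". Because the matches are zero-width, an input that begins or ends with a separator should produce an empty string at that end, which you may want to filter out.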

Upvotes: 4
