Isaac G.

Reputation: 93

C# split string but keep separators

Similar questions already exist, but all of their answers use regular expressions. Here is the code I'm currently using, which strips the separators:

string[] sentences = s.Split(new string[] { ". ", "? ", "! ", "... " }, StringSplitOptions.None);

I would like to split a block of text on sentence breaks and keep the sentence terminators. I'd like to avoid regular expressions for performance reasons. Is this possible?

Upvotes: 3

Views: 2303

Answers (1)

JaredPar

Reputation: 755537

I don't believe there is an existing function that does this. However, you can use the following extension method.

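using System.Collections.Generic;

public static class StringExtensions {
  // Splits after each separator, keeping the separator at the end of its piece.
  public static IEnumerable<string> SplitAndKeepSeparators(this string source, string[] separators) {
    int start = 0, i = 0;
    while (i < source.Length) {
      // Find the longest separator starting at i, so "... " beats ". ".
      int len = 0;
      foreach (var sep in separators) {
        if (sep.Length > len && string.CompareOrdinal(source, i, sep, 0, sep.Length) == 0) {
          len = sep.Length;
        }
      }
      if (len == 0) { i++; continue; }
      i += len;
      yield return source.Substring(start, i - start);
      start = i;
    }
    if (start < source.Length) {
      yield return source.Substring(start);
    }
  }
}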
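
For multi-character separators such as ". " or "... ", one way to keep an append-and-flush loop is to check, after each appended character, whether the buffer ends with one of the separators. The following is only a minimal sketch under a few assumptions: comparison is ordinal, empty separators are ignored, and the names SentenceSplitter, SplitKeepingSeparators, and EndsWithAny are made up for illustration. The sample text in Main is invented too.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SentenceSplitter {
  // Hypothetical helper: splits after each separator and keeps it on the piece.
  public static IEnumerable<string> SplitKeepingSeparators(this string source, string[] separators) {
    var builder = new StringBuilder();
    foreach (char cur in source) {
      builder.Append(cur);
      // Flush as soon as the buffered text ends with any separator.
      if (EndsWithAny(builder, separators)) {
        yield return builder.ToString();
        builder.Length = 0;
      }
    }
    // Whatever follows the last separator is returned as a final piece.
    if (builder.Length > 0) {
      yield return builder.ToString();
    }
  }

  // Ordinal, char-by-char check of the builder's tail; allocates no strings.
  private static bool EndsWithAny(StringBuilder builder, string[] separators) {
    foreach (string sep in separators) {
      if (sep.Length == 0 || sep.Length > builder.Length) continue;
      int offset = builder.Length - sep.Length;
      int k = 0;
      while (k < sep.Length && builder[offset + k] == sep[k]) k++;
      if (k == sep.Length) return true;
    }
    return false;
  }
}

public static class Program {
  public static void Main() {
    string text = "Hello there. How are you? Fine... Good!";
    string[] separators = { ". ", "? ", "! ", "... " };
    // Each piece is printed in brackets so the trailing space stays visible.
    foreach (string piece in text.SplitKeepingSeparators(separators)) {
      Console.WriteLine("[" + piece + "]");
    }
  }
}
```

With that sample, the pieces are "Hello there. ", "How are you? ", "Fine... " and "Good!", and concatenating them gives back the original text. Because the buffer is flushed at the earliest match, a separator that merely extends another one on the right (say "ab" and "abc") would never be reached. That doesn't matter for the list in the question, since "... " itself ends with ". ".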

Upvotes: 6
