Reputation: 93
Similar questions already exist, but all of their answers use regexes. This is the code I'm using now, which strips the separators:
string[] sentences = s.Split(new string[] { ". ", "? ", "! ", "... " }, StringSplitOptions.None);
I would like to split a block of text on sentence breaks and keep the sentence terminators. I'd like to avoid regexes for performance reasons. Is that possible?
Upvotes: 3
Views: 2303
Reputation: 755537
I don't believe there is an existing function that does this. However, you can use the following extension method.
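// Requires: using System.Collections.Generic; using System.Linq; using System.Text;
// Extension methods must be declared inside a static class.
public static IEnumerable<string> SplitAndKeepSeparators(this string source, string[] separators) {
    var builder = new StringBuilder();
    foreach (var cur in source) {
        builder.Append(cur);
        // The separators are strings, so check whether the buffer now ends with one of them.
        if (separators.Any(sep => EndsWith(builder, sep))) {
            yield return builder.ToString();
            builder.Length = 0;
        }
    }
    // Return any trailing text that has no terminator.
    if (builder.Length > 0) {
        yield return builder.ToString();
    }
}

// Returns true when the buffer's last characters match the given suffix.
private static bool EndsWith(StringBuilder builder, string suffix) {
    if (suffix.Length == 0 || suffix.Length > builder.Length) return false;
    int offset = builder.Length - suffix.Length;
    for (int i = 0; i < suffix.Length; i++) {
        if (builder[offset + i] != suffix[i]) return false;
    }
    return true;
}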
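For comparison, here is a minimal sketch of another regex-free way to reach the same goal. Instead of appending one character at a time, it uses String.IndexOf to jump to the next terminator and cuts the piece with Substring. The names SentenceSplitter and SplitKeepingTerminators are made up for this example and are not part of .NET, and I have not benchmarked it against the character-by-character version, so treat any performance difference as an open question.

```csharp
using System;
using System.Collections.Generic;

public static class SentenceSplitter
{
    // Hypothetical helper name; not part of the .NET base class library.
    public static List<string> SplitKeepingTerminators(string source, string[] separators)
    {
        var result = new List<string>();
        int start = 0;
        while (start < source.Length)
        {
            // Find the separator occurrence that ends earliest at or after 'start'.
            int bestEnd = -1;
            foreach (string sep in separators)
            {
                if (sep.Length == 0) continue; // an empty separator would never advance
                int index = source.IndexOf(sep, start, StringComparison.Ordinal);
                if (index < 0) continue;
                int end = index + sep.Length;
                if (bestEnd < 0 || end < bestEnd) bestEnd = end;
            }

            if (bestEnd < 0)
            {
                // No terminator left: keep the remaining text as the last piece.
                result.Add(source.Substring(start));
                break;
            }

            // Cut the piece so that it includes the terminator itself.
            result.Add(source.Substring(start, bestEnd - start));
            start = bestEnd;
        }
        return result;
    }

    public static void Main()
    {
        string text = "Is it possible? Yes... I think so. Great!";
        string[] separators = { ". ", "? ", "! ", "... " };
        foreach (string sentence in SplitKeepingTerminators(text, separators))
        {
            // Brackets make the kept trailing space visible.
            Console.WriteLine("[" + sentence + "]");
        }
    }
}
```

With the separators from the question, each piece keeps its terminator and the trailing space, and the last piece ("Great!") is returned as-is because "! " with a trailing space never occurs after it.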
Upvotes: 6