user3260312

Reputation: 241

How to split a string and keep the delimiters?

I know there are many questions like this one, but I hope mine is a little different. I'm making a translator and want to split a text into sentences, but when I wrote this code:

public static string[] GetSentences(string Text)
{
    if (Text.Contains(". ") || Text.Contains("? ") || Text.Contains("! "))
        return Text.Split(new string[] { ". ", "? ", "! " }, StringSplitOptions.RemoveEmptyEntries);
    else
        return new string[0];
}

It removed the ".", "?" and "!". I want to keep them. How can I do that?

NOTE: I want to split on ". " (a dot and a space), "? " (a question mark and a space), and "! " (an exclamation mark and a space).

Upvotes: 3

Views: 3859

Answers (2)

Henk Holterman

Reputation: 273314

Simple: replace them first. I'll use "|" for readability, but you may want to use something more exotic that cannot occur in the text itself.

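// This part could be made a little smarter and more flexible.
// So, just the basic idea: mark each sentence boundary with '|', then split on the marker.
Text = Text.Replace(". ", ". |").Replace("? ", "? |").Replace("! ", "! |");

// The char[] overload of Split is available on .NET Framework as well as .NET Core.
if (Text.Contains("|"))
    return Text.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);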

And I wonder about the else return new string[0]; that seems odd. Assuming that you want to return the input string when there are no delimiters, you should just remove the if/else construct.
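Putting the replace-then-split idea together with the suggestion to drop the if/else gives something like the minimal sketch below. The class name `SentenceSplitter` and the `Marker` constant are hypothetical, and the sketch assumes "|" never appears in the input. Unlike the snippet above, it drops the space after each delimiter instead of keeping it.

```csharp
using System;

public static class SentenceSplitter
{
    // Hypothetical marker character; assumes '|' never occurs in the input text.
    private const char Marker = '|';

    public static string[] GetSentences(string text)
    {
        // Tag every sentence boundary: keep the punctuation, replace the space with the marker.
        string marked = text
            .Replace(". ", "." + Marker)
            .Replace("? ", "?" + Marker)
            .Replace("! ", "!" + Marker);

        // With no delimiter present, Split returns the whole input as a single entry,
        // so no if/else guard is needed.
        return marked.Split(new[] { Marker }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Calling `SentenceSplitter.GetSentences("Hello. How are you? Fine!")` yields "Hello.", "How are you?" and "Fine!". An input without any delimiter comes back as a one-element array rather than an empty one.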

Upvotes: 16

Ulugbek Umirov

Reputation: 12807

Regex way:

return Regex.Split(Text, @"(?<=[.?!])\s+");

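So you just split the string on runs of whitespace that are preceded by one of ".", "?" or "!". The lookbehind (?<=[.?!]) only checks for the punctuation without consuming it, so the punctuation stays attached to its sentence.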
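As a self-contained illustration of the regex approach, here is a minimal sketch. The class name `RegexSentenceSplitter` and the `Boundary` field are hypothetical; the pattern is the one from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSentenceSplitter
{
    // Lookbehind: match whitespace only when it directly follows '.', '?' or '!'.
    // The punctuation itself is not part of the match, so Split leaves it in place.
    private static readonly Regex Boundary = new Regex(@"(?<=[.?!])\s+");

    public static string[] GetSentences(string text)
    {
        return Boundary.Split(text);
    }
}
```

Note that `Regex.Split` does not discard empty entries: if the text ends with a delimiter followed by whitespace (for example "Hi. "), the result ends with an empty string, which you may want to filter out or avoid by trimming the input first.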


Upvotes: 2
