Reputation: 5927
I need to split a string in C# using a set of delimiter characters. This set should include the default whitespaces (i.e. what you effectively get when you String.Split(null, StringSplitOptions.RemoveEmptyEntries)
) plus some additional characters that I specify like '.', ',', ';', etc. So if I have a char array of those additional characters, how to I add all the default whitespaces to it, in order to then feed that expanded array to String.Split
? Or is there a better way of splitting using my custom delimiter set + whitespaces? Thx
Upvotes: 2
Views: 1018
Reputation: 460078
Just use the appropriate overload of string.Split
if you're at least on .NET 2.0:
char[] separator = new[] { ' ', '.', ',', ';' };
string[] parts = text.Split(separator, StringSplitOptions.RemoveEmptyEntries);
I guess i was downvoted because of the incomplete answer. OP has asked for a way to split by all white-spaces(which are 25 on my pc) but also by other delimiters:
public static class StringExtensions
{
static StringExtensions()
{
var whiteSpaceList = new List<char>();
for (int i = char.MinValue; i <= char.MaxValue; i++)
{
char c = Convert.ToChar(i);
if (char.IsWhiteSpace(c))
{
whiteSpaceList.Add(c);
}
}
WhiteSpaces = whiteSpaceList.ToArray();
}
public static readonly char[] WhiteSpaces;
public static string[] SplitWhiteSpacesAndMore(this string str, IEnumerable<char> otherDeleimiters, StringSplitOptions options = StringSplitOptions.None)
{
var separatorList = new List<char>(WhiteSpaces);
separatorList.AddRange(otherDeleimiters);
return str.Split(separatorList.ToArray(), options);
}
}
Now you can use this extension method in this way:
string str = "word1 word2\tword3.word4,word5;word6";
char[] separator = { '.', ',', ';' };
string[] split = str.SplitWhiteSpacesAndMore(separator, StringSplitOptions.RemoveEmptyEntries);
Upvotes: 3
Reputation: 403
I use something like the following to ensure I'm always splitting on Split's default whitespace characters:
public static string[] SplitOnWhitespaceAnd(this string value,
char[] separator, StringSplitOptions options = StringSplitOptions.RemoveEmptyEntries)
=> value.Split().SelectMany(s => s.Split(separator, options)).ToArray();
Note that to be consistent with Microsoft's naming conventions, you'd want to use WhiteSpace rather than Whitespace.
Refer to Microsoft's Char.IsWhiteSpace documentation to see the whitespace characters split on by default.
Upvotes: 1
Reputation: 21
The answers above do not use all whitespace characters as delimiters, as you state in your request, only the ones specified by the program. In the solution examples above, this is only SPACE, but not TAB, CR, LF, and all the other Unicode-defined whitespace chars.
I have not found a way to retrieve the default whitespace chars from String. However, they are defined in Regex, and you can use that instead of String. In your case, adding period and comma to the Regex whitespace set:
Regex regex = new Regex(@"[\s\.,]+"); // The "+" will remove blank entries
input = @"1.2 3, 4";
string[] tokens = regex.Split(input);
will produce
tokens[0] "1"
tokens[1] "2"
tokens[2] "3"
tokens[3] "4"
Upvotes: 2
Reputation: 18843
string[] splitSentence(string sentence)
{
return sentence
.Replace(",", " , ")
.Replace(".", " . ")
.Split(' ', StringSplitOptions.RemoveEmptyEntries)
}
or
string[] result = test.Split(new string[] {"\n", "\r\n"},
StringSplitOptions.RemoveEmptyEntries);
Upvotes: 0
Reputation: 116108
str.Split(" .,;".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Upvotes: 1