Reputation: 101681
I noticed that if I Split a string that contains only white-space by white-space, it returns an unexpected result. Consider this:
var spaces = string.Join("", Enumerable.Repeat(" ", 10));
int length = spaces.Length; // 10
var result = spaces.Split(' ');
length = result.Length; // 11
I couldn't figure out why result contains 11 empty strings (result.Length is 11) while I have only 10 spaces in my input string. I also tried it with a letter, for example "a", and that doesn't make any difference:
var letters = string.Join("", Enumerable.Repeat("a", 10));
int length = letters.Length; // 10
var result = letters.Split('a');
length = result.Length; // 11
In the documentation it says:
If two delimiters are adjacent, or a delimiter is found at the beginning or end of this instance, the corresponding array element contains Empty.
So I understand why I'm getting empty strings, but I don't understand where that extra element is coming from.
There is an example in the documentation:
var input = "42..12..19";
var result = input.Split('.');
That returns five results, and two of them are empty strings, not three.
So is this the default and expected behaviour, or is it a bug or something?
Upvotes: 1
Views: 327
Reputation: 17213
Consider this:
"1 2 3 4 5 6 7 8 9 10 11"
There are 10 spaces in the above, and 11 numbers. Each space separates the previous number from the next. The resulting array will have the same length if you remove the numbers. This is expected.
In your example, the beginning of the string is an element, up to the first delimiter. Since a delimiter is the first character, the first element of the array is empty. Afterwards, there is an empty array item added for each additional space.
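The comparison above can be checked directly. The following is a minimal sketch; the class name SplitCountDemo and the helper PartCount are hypothetical names introduced here for illustration, not part of the question.

```csharp
using System;
using System.Linq;

public static class SplitCountDemo
{
    // Hypothetical helper: how many elements Split returns for one separator char.
    public static int PartCount(string input, char separator)
    {
        return input.Split(separator).Length;
    }

    public static void Main()
    {
        // 10 spaces separating 11 numbers.
        string numbers = "1 2 3 4 5 6 7 8 9 10 11";
        Console.WriteLine(PartCount(numbers, ' '));   // 11

        // The same 10 spaces with the numbers removed.
        string spaces = new string(' ', 10);
        string[] parts = spaces.Split(' ');
        Console.WriteLine(parts.Length);              // 11
        Console.WriteLine(parts.All(p => p == ""));   // True
    }
}
```

Both inputs contain 10 separators, so both produce 11 elements; the only difference is whether those elements are numbers or empty strings.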
Upvotes: 2
Reputation: 88054
Not a bug and totally expected behavior.
Look at it this way:
Take 1-2-3 and split on the -. This leads to 3 elements: 1, 2 and 3.
Now take --3 and split on the dash again. That also gives 3 elements, with the first 2 being empty.
A delimiter is essentially an element that is between two other elements. The elements it is between can be empty. So if you have 10 spaces and are splitting on spaces then you will always have 11 elements.
Your last example, "42..12..19" split on ., is essentially 42.EMPTY.12.EMPTY.19, which is 5 elements.
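As a rough check of the examples in this answer, the sketch below prints the separator count next to the element count for each input. SeparatorRuleDemo and SeparatorCount are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;

public static class SeparatorRuleDemo
{
    // Hypothetical helper: counts how often the separator occurs in the input.
    public static int SeparatorCount(string input, char separator)
    {
        return input.Count(c => c == separator);
    }

    public static void Main()
    {
        foreach (string input in new[] { "1-2-3", "--3" })
        {
            string[] parts = input.Split('-');
            // Prints "1-2-3: 2 separators, 3 elements", then "--3: 2 separators, 3 elements".
            Console.WriteLine($"{input}: {SeparatorCount(input, '-')} separators, {parts.Length} elements");
        }

        string[] dotted = "42..12..19".Split('.');
        Console.WriteLine(string.Join("|", dotted)); // 42||12||19
        Console.WriteLine(dotted.Length);            // 5
    }
}
```

In each case the element count is the separator count plus one, whether or not the elements between the separators are empty.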
Upvotes: 8
Reputation: 532435
It's matching an empty element after the last space. In your last example, place a . at the end of the string and you'll get 6 elements even though you only have 5 separators. In fact, just look at that example: there are 5 elements but only 4 separators. In general, you'll always have one more element than the number of separators because there will be an element before each separator and one after the last one.
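A small sketch of the trailing-separator case follows. TrailingSeparatorDemo and SplitOnDots are hypothetical names for this illustration. The second call uses StringSplitOptions.RemoveEmptyEntries, which is not discussed in the question but may be what you want if the empty elements are unwanted.

```csharp
using System;

public static class TrailingSeparatorDemo
{
    // Hypothetical helper: splits on '.', optionally dropping empty elements.
    public static string[] SplitOnDots(string input, bool dropEmpty)
    {
        StringSplitOptions options = dropEmpty
            ? StringSplitOptions.RemoveEmptyEntries
            : StringSplitOptions.None;
        return input.Split(new[] { '.' }, options);
    }

    public static void Main()
    {
        // 5 separators, so 6 elements; the last one is the empty string after the final '.'.
        Console.WriteLine(SplitOnDots("42..12..19.", false).Length); // 6

        // With empty entries removed, only "42", "12" and "19" remain.
        Console.WriteLine(SplitOnDots("42..12..19.", true).Length);  // 3
    }
}
```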
Upvotes: 2