Milo Wielondek

Reputation: 4362

String.split's weird behaviour dealing with spaces and tabs

I have a string consisting of tabs and spaces and some arbitrary characters. The string below is made up of space space tab tab 1 space tab -2 tab space + space.

import java.util.Arrays;

String[] s = "  \t\t1 \t-2\t + ".split("[\\s]+");
System.out.println(Arrays.toString(s));

Running split with the regex [\s]+, one would expect to get [1, -2, +]; however, the array returned on my machine (OS X, JDK 1.6.0_37) is [, 1, -2, +].

It turns out the first element is simply "blank" (s[0].equals("") returns true) and so it should have been matched by \s.

What am I missing?

Upvotes: 0

Views: 588

Answers (1)

Rohit Jain

Reputation: 213321

If the string you are splitting starts with a match of the delimiter, then the first element of the resulting array is an empty string.

Think of it this way: your string always starts with an empty string. The delimiter \s+ therefore divides the string " a" (note the leading whitespace) into two parts: the part before the \s+ match, which is the empty string "", and the part after it, which is "a".

So the output you got is expected.
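To make this concrete, here is a small sketch of the " a" case next to two contrasting inputs. The class and method names (LeadingDelimiterDemo, splitOnWhitespace) are made up for illustration; only String.split and Arrays.toString come from the JDK.

```java
import java.util.Arrays;

class LeadingDelimiterDemo {
    // Hypothetical helper: splits on runs of whitespace, like the regex in the question.
    static String[] splitOnWhitespace(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        // Leading whitespace: the empty text before the first match is kept.
        System.out.println(Arrays.toString(splitOnWhitespace(" a")));  // [, a]
        // No leading whitespace: the first token comes first.
        System.out.println(Arrays.toString(splitOnWhitespace("a b"))); // [a, b]
        // Trailing empty strings are dropped by the one-argument split.
        System.out.println(Arrays.toString(splitOnWhitespace("a ")));  // [a]
    }
}
```

Note the asymmetry: the one-argument split discards trailing empty strings but keeps a leading one, which is why only the front of the array looks odd.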

It turns out the first element is simply "blank" (s[0].equals("") returns true) and so it should have been matched by \s.

No, it shouldn't have been. A space is not an empty string: \s matches a single whitespace character, whereas the empty string contains no characters at all.
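The difference can be checked directly, and if the goal is just the tokens [1, -2, +], one possible approach is to trim the input before splitting. This is only a sketch: the tokens helper and the class name are hypothetical, and the input literal assumes the space/tab layout described in the question.

```java
import java.util.Arrays;

class EmptyVersusSpaceDemo {
    // Hypothetical helper: strip surrounding whitespace, then split on whitespace runs.
    static String[] tokens(String input) {
        String trimmed = input.trim();
        // An all-whitespace input trims to "", and "".split(...) would yield [""],
        // so return an empty array in that case instead.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(" ".matches("\\s")); // true: a space is whitespace
        System.out.println("".matches("\\s"));  // false: nothing there to match
        System.out.println(Arrays.toString(tokens("  \t\t1 \t-2\t + "))); // [1, -2, +]
    }
}
```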

Upvotes: 2
