LeeM

Reputation: 1258

PowerShell split() vs -split - what's the difference?

After struggling with this for half an hour, I found that splitting a string on a colon followed by whitespace gives different results depending on which syntax you use.

Simple string:

$line = "1: 2: 3: 4: 5: "

Split example 1 - notice the leading space on every token after the first:

$ln = $line.split(":\s+")
$ln
1
 2
 3
 4
 5

Split example 2 - the spaces are gone (as they should be):

$ln = $line -split ":\s+"
$ln
1
2
3
4
5

I suspect it's because the first one is a .NET method (?), and the -split PS operator perhaps has more baked-in smarts when it comes to regex interpretation.

However, when I tried the first method with ": " as the delimiter, that didn't work properly either. If it's .NET, does it need something extra to treat both characters together as a single delimiter?

Upvotes: 3

Views: 4435

Answers (4)

js2010

Reputation: 27516

Trying some different overloads of .split(). They're all case sensitive. Sometimes the separator is a character array and sometimes it's a string array, depending on which overload best matches the arguments.

'aABa'.split

OverloadDefinitions
-------------------
string[] Split(Params char[] separator)
string[] Split(char[] separator, int count)
string[] Split(char[] separator, System.StringSplitOptions options)
string[] Split(char[] separator, int count, System.StringSplitOptions options)
string[] Split(string[] separator, System.StringSplitOptions options)
string[] Split(string[] separator, int count, System.StringSplitOptions options)

'aABa'.split(@('AB'),'none')  # string separator

a
a


'aABa'.split('AB') # character array separator with one empty result

a

a


'aABa'.split(@('A','B'),'removeemptyentries')  # char array separator with option

a
a


'aABa'.split(@('A','B'),1,'removeemptyentries') # count of 1

aABa

Upvotes: 1

Joseph Alcorn

Reputation: 2442

The .NET System.String.Split method does not have an overload that takes a single string parameter. It also does not understand regex.

What is happening is that PowerShell takes the string you pass in and converts it to an array of characters. It is essentially splitting on any of the following characters: :, \, s, +

When you use ": " as the delimiter, I would imagine you got results like

1

2

3

4

5

That is because, unless you pass a StringSplitOptions value to the .NET method, it includes the empty strings it finds between adjacent separators.
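As a rough sketch of the behaviour described above (the variable names are only for illustration), casting the delimiter to a character array explicitly makes the conversion visible, and passing RemoveEmptyEntries drops the empty strings. The explicit cast also avoids relying on which overload a given PowerShell version picks for a bare string argument.

```powershell
$line = "1: 2: 3: 4: 5: "

# Explicit char[] cast: split on ':' OR ' ' (any single character in the array)
$withEmpties = $line.Split([char[]]': ')

# Same separators, but ask .NET to discard the empty strings between adjacent separators
$tokens = $line.Split([char[]]': ', [System.StringSplitOptions]::RemoveEmptyEntries)

$withEmpties.Count   # larger than $tokens.Count because of the empty entries
$tokens -join ','
```

Comparing the two counts shows how many empty entries the default behaviour keeps.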

Upvotes: 4

Csharp Guy

Reputation: 115

  1. The .NET method does not take a regex; it treats each character of the argument literally.

    $line = "1: 2: 3: 4: 5: "
    $ln = $line.split("\s")
    $ln
    

Output:

1: 2: 3: 4: 5: 

Hence, in your example, the "\", "s" and "+" are ignored because they are not found in $line, and only ":" is used for splitting:

$ln = $line.split(":\s+")
$ln

Output: 

1
 2
 3
 4
 5

which is the same as

$ln = $line.split(":")
$ln

  2. The PowerShell -split operator, by contrast, interprets "\s" as a valid regex, i.e. a whitespace character, so the pattern ":\s+" matches a colon together with the whitespace that follows it. With a plain ":" pattern, the spaces stay in the tokens, e.g.:

$line = "1:2:3: 4: 5: "
$ln = $line -split ":"
$ln

Output:

1
2
3
 4
 5
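To illustrate the regex interpretation, here is a minimal sketch (variable names are made up for the example) that splits the same string on a colon plus optional whitespace, and then uses the SimpleMatch option of -split to turn regex interpretation off for a delimiter that would otherwise be a regex metacharacter.

```powershell
$line = "1:2:3: 4: 5: "

# Regex delimiter: a colon followed by zero or more whitespace characters
$regexTokens = $line -split ':\s*'

# SimpleMatch turns regex interpretation off, so '.' is a literal dot here
$literalTokens = 'a.b.c' -split '.', 0, 'SimpleMatch'

$regexTokens -join ','
$literalTokens -join ','
```

Note that because the input ends with a delimiter, -split also returns a trailing empty string as the last element.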

Upvotes: 1

mjolinor

Reputation: 68321

The string split method takes a character array argument (not a string). If you specify multiple characters it will split on any instance of any of those characters.
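If the goal is to split on the two-character sequence ": " as a whole, one option is the string-array overload of Split. This is only a sketch under the assumption that an explicit [string[]] cast is acceptable in your code; the variable names are illustrative.

```powershell
$line = "1: 2: 3: 4: 5: "

# [string[]] cast selects the string-array overload, so ": " is matched as a whole
$tokens = $line.Split([string[]]': ', [System.StringSplitOptions]::RemoveEmptyEntries)

$tokens -join ','
```

RemoveEmptyEntries is used here because the input ends with the delimiter, which would otherwise leave an empty string at the end of the result.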

Upvotes: 0
