DBS

Reputation: 1147

Usage of | in PowerShell regex

I'm trying to split some text using PowerShell, and while experimenting with regex I would like to know exactly what the "|" character does in a PowerShell regex. For example, I have the following line of code:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|\]')}

Running this line of code gives me the following output:

-blank line-
02
: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png

If I run the code without the "|" in the -split statement as such:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[\]')}

I get the following output without the [] being stripped (essentially it's just displaying the select-string output):

[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png

If I modify the code and run it like this:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|')}

In the output, the [ is stripped from the beginning, but every character appears on its own line (I did not include the full output, to save space).

0
2
]
:

.
/
m
e

Upvotes: 0

Views: 187

Answers (3)

Matt

Reputation: 46710

The answers already explain what the | is for but I would like to explain what is happening with each example that you have above.

  1. -split '\[|\]': You are matching either [ or ], which is why you get 3 results. The first is an empty string, not whitespace: the line starts with [, so there is nothing before the first delimiter.

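  2. -split '\[\]': Because you omit the | here, you are asking to split on the literal character sequence [], which does not appear in your string, so nothing is split. Contrast this with the .NET String.Split() method, called as .Split('\[\]'), which does not use regex at all; in Windows PowerShell it treats the argument as a set of individual characters and splits on any one of them. This is by design.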

  3. -split '\[|': Here you are running into a caveat of not specifying the right-hand operand for the | operator. To quote the explanation Regex101 gives for this regex:

(null, matches any position)

Warning: An empty alternative effectively truncates the regex at this point because it will always find a zero-width match

That is why the last example split at every character. Also, I don't think any of this is specific to PowerShell. You should see the same behavior in other regex engines as well.
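
As a rough sketch of the three cases above, using the sample line from the question (the variable names are mine and only for illustration):

```powershell
# Sample input taken from the question.
$line = '[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png'

# 1. Alternation: split on "[" OR "]".
#    $either[0] is an empty string (nothing precedes the leading "["),
#    $either[1] is '02', and $either[2] is the rest of the line.
$either = @($line -split '\[|\]')

# 2. No alternation: split on the literal two-character sequence "[]".
#    "[]" never occurs in $line, so the single element is the original string.
$sequence = @($line -split '\[\]')

# 3. Empty right-hand alternative: "[" OR the empty string.
#    The empty alternative matches at every position, so each character
#    ends up in its own element.
$empty = @($line -split '\[|')
```

Comparing $either.Count, $sequence.Count and $empty.Count is a quick way to see the difference between the three patterns.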

Upvotes: 1

briantist

Reputation: 47802

Walter Mitty is correct, | is for alternation.

You can also use [Regex]::Escape("string") in PowerShell; it returns a copy of the string with all the special characters escaped. You can use that on any string you want to match literally, or to check whether a specific character has special meaning in a regex.
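
A small sketch of how that might look (the sample strings and variable names are made up for illustration):

```powershell
# Escape metacharacters so they are matched literally.
$escapedPipe    = [regex]::Escape('|')   # '\|'
$escapedBracket = [regex]::Escape('[')   # '\['

# Split on a literal pipe instead of treating it as alternation.
$parts = 'name|value' -split $escapedPipe   # 'name', 'value'

# A character that Escape() returns unchanged has no special meaning by itself.
$plain = [regex]::Escape('a')   # 'a'
```

If the escaped result differs from the input, the string contains at least one character that the regex engine would otherwise treat specially.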

Upvotes: 0

Walter Mitty

Reputation: 18940

The pipe character, "|", separates alternatives in a regex: the pattern matches if the expression on either side of it matches.
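
For instance, a minimal sketch with made-up sample strings (none of these values come from the question):

```powershell
# Alternation: the pattern matches if either side matches.
$hasPet = 'I have a dog' -match 'cat|dog'     # True
$noPet  = 'I have a bird' -match 'cat|dog'    # False

# Alternation has the lowest precedence, so anchors bind to one side only:
# '^cat|dog$' means "starts with cat" OR "ends with dog".
$loose  = 'hotdog' -match '^cat|dog$'         # True
# Grouping limits the alternation, so the anchors apply to both options.
$strict = 'hotdog' -match '^(cat|dog)$'       # False

# The same idea with -split: break on "-" or "/".
$fields = 'a-b/c' -split '-|/'                # 'a', 'b', 'c'
```

Parentheses are the usual way to keep the alternation from swallowing the rest of the pattern.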

You can see all the metacharacters defined here: http://regexlib.com/CheatSheet.aspx?AspxAutoDetectCookieSupport=1

Upvotes: 2
