Reputation: 1147
I'm trying to split some text using PowerShell and experimenting with regex along the way. I would like to know exactly what the "|" character does in a PowerShell regex. For example, I have the following line of code:
"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|\]')}
Running this line of code gives me the following output:
-blank line-
02
: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png
If I run the code without the "|" in the -split statement as such:
"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[\]')}
I get the following output without the [] being stripped (essentially it's just displaying the select-string output):
[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png
If I modify the code and run it like this:
"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|')}
In the output, the [ is stripped from the beginning, but each remaining character appears on its own line (I did not include the full output, to save space).
0
2
]
:
.
/
m
e
Upvotes: 0
Views: 187
Reputation: 46710
The other answers already explain what the | is for, but I would like to explain what is happening in each of your examples above.
-split '\[|\]': You are matching either [ or ], which is why you get 3 results. The first is an empty string (shown as a blank line), because nothing comes before the first [ at the beginning of the line.
-split '\[\]': Since you are omitting the | symbol in this example, you are requesting a split on the literal character sequence [], which does not appear in your string. Contrast this with the string method .Split('\[\]'), which does not use regex at all: in Windows PowerShell it treats its argument as a set of delimiter characters and splits on any one of \, [, or ]. This is by design.
-split '\[|': Here you are running into a caveat of not specifying the right-hand operand for the | operator. To quote the help from Regex101 when this regex is specified:
(null, matches any position)
Warning: An empty alternative effectively truncates the regex at this point because it will always find a zero-width match
That is why the last example split at every character position. Also, I don't think any of this is PowerShell-specific; the same behavior should be seen in other regex engines as well.
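To make the three cases above easier to compare, here is a minimal sketch that applies -split directly to the string from the question (the Select-String step is left out, and the variable names are only illustrative):

```powershell
# The sample line from the question.
$line = '[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png'

# Alternation: split on either '[' or ']'.
$either = $line -split '\[|\]'
$either.Count      # 3 elements: '', '02', and the rest of the line

# No alternation: split on the literal two-character sequence '[]'.
$sequence = $line -split '\[\]'
$sequence.Count    # 1 element: the line comes back unchanged

# Empty right-hand alternative: the regex can also match the empty string,
# so a split point is found at every position in the line.
$everywhere = $line -split '\[|'
$everywhere | Where-Object { $_ -ne '' } | Select-Object -First 3   # 0, 2, ]
```

The last result also contains empty strings, which is why it is filtered with Where-Object before the first few characters are displayed.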
Upvotes: 1
Reputation: 47802
Walter Mitty is correct: | is for alternation.
You can also use [Regex]::Escape("string") in PowerShell, and it will return a string that has all the special characters escaped. So you can use that on any string you want to match literally (or to determine whether a specific character has, or can have, special meaning in a regex).
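As a rough illustration of the above, the sketch below escapes a few strings and then splits on one of them literally. The sample strings and variable names are just for demonstration:

```powershell
# '|' and '[' are metacharacters, so Escape puts a backslash in front of them.
$pipe    = [Regex]::Escape('|')      # \|
$bracket = [Regex]::Escape('[02]')   # \[02]  (a lone ']' is left as-is)

# An escaped pattern matches the text literally when used with -split.
$fields = 'a|b|c' -split $pipe       # a, b, c

# Comparing a character with its escaped form shows whether it is special.
$isSpecial = [Regex]::Escape('|') -ne '|'   # True
```

If the escaped string comes back identical to the input, nothing in it has special meaning to the regex engine.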
Upvotes: 0
Reputation: 18940
The pipe character, "|", separates alternatives in a regex: the pattern matches if the expression on either side of it matches.
You can see all the metacharacters defined here: http://regexlib.com/CheatSheet.aspx?AspxAutoDetectCookieSupport=1
Upvotes: 2