Schwert im Stein

Reputation: 43

Capturing group not working at end of -Pattern for Select-String

I've recently started working with regex in PowerShell and have come across an unexpected response from the Select-String cmdlet.

If you enter something like the following:

$thing = "135" | Select-String -Pattern "(.*?)5"
$thing.Matches

You receive the expected result from the MatchInfo object:

Groups   : {135, 13}
Success  : True
Captures : {135}
Index    : 0
Length   : 3
Value    : 135

But if you place the capturing group at the end of the -Pattern:

$thing = "135" | Select-String -Pattern "(.*?)"
$thing.Matches

The match doesn't seem to capture anything, although a MatchInfo object is created:

Groups   : {, }
Success  : True
Captures : {}
Index    : 0
Length   : 0
Value    : 

As I said, I'm quite new to PowerShell, so I expect this behavior is operator error.

But what is the workaround? This behavior hasn't caused me problems yet, but considering the files I'm working with (electronic manuals contained in XML files), I expect it will eventually.

...

With regards,

Schwert

...

Clarification:

I made my example very simple to illustrate the behavior, but my original issue was with this pattern:

$linkname = $line | Select-String -Pattern "`"na`"><!--(?<linkname>.*?)"

The file is one of our indices for the links between manuals, and the name of the link is contained within a comment block located on each line of the file.

The pattern is actually a typo, as the name and the comment don't go all the way to the end of the line. I found it when the program began throwing errors because it couldn't find "linkname" in the MatchInfo object.

Once I gave it the characters which occur after the link name (::), then it worked correctly. Putting it into the example:

$linkname = $line | Select-String -Pattern "`"na`"><!--(?<linkname>.*?)::"
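
For reference, a minimal sketch of reading the named group back out of the result. The sample line is made up for illustration (the real index file's format may differ); only the pattern comes from the example above, written here in single quotes so the double quotes need no escaping.

```powershell
# Hypothetical sample line; the real index file's format may differ.
$line = '<entry class="na"><!--Engine Overview::ch03--></entry>'

$linkname = $line | Select-String -Pattern '"na"><!--(?<linkname>.*?)::'

# Select-String returns nothing when the line does not match, so check first.
if ($linkname) {
    # Named groups can be looked up by name on the first Match object.
    $name = $linkname.Matches[0].Groups['linkname'].Value
    $name
}
```

With the trailing :: in place, the lazy group has to grow until it reaches the delimiter, so the group holds the text between the comment opener and the ::.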

Upvotes: 2

Views: 424

Answers (1)

Adam Bertram

Reputation: 4178

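I'm no regex expert, but I believe the lazy quantifier in your pattern "(.*?)" is the problem. A lazy .*? matches as few characters as possible, and with nothing after it in the pattern, zero characters already counts as a successful match. If you remove the ?, for example, you get the groups as expected.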
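
A small sketch comparing the variants on the same "135" input from the question. The variable names are only illustrative, and anchoring with $ is one possible workaround, not the only one.

```powershell
# Lazy group with nothing after it: zero characters is the shortest match.
$lazy = "135" | Select-String -Pattern '(.*?)'
$lazyValue = $lazy.Matches[0].Groups[1].Value

# Greedy group: takes as many characters as it can.
$greedy = "135" | Select-String -Pattern '(.*)'
$greedyValue = $greedy.Matches[0].Groups[1].Value

# Lazy group followed by an end-of-line anchor: forced to grow until $ matches.
$anchored = "135" | Select-String -Pattern '(.*?)$'
$anchoredValue = $anchored.Matches[0].Groups[1].Value

"lazy='$lazyValue' greedy='$greedyValue' anchored='$anchoredValue'"
```

The lazy group comes back empty, while the greedy and anchored versions both capture 135. The same idea applies to any delimiter that follows the group: the lazy quantifier only expands as far as the rest of the pattern forces it to.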

Also, PLEASE don't use regex to parse XML. :) There are much easier ways to do that, such as:

[xml]$Manual = Get-Content -Path C:\manual.xml

or

$xdoc = New-Object System.Xml.XmlDocument
$file = Resolve-Path C:\manual.xml
$xdoc.Load($file)

Once you've got it in a structured format you can then use dot notation or XPath to navigate the nodes and attributes.
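
As a rough illustration of both approaches, here is a sketch against an inline XML fragment. The element and attribute names (manual, link, id, title) are invented for the example and are not taken from your files.

```powershell
# Hypothetical XML; the element and attribute names are made up for illustration.
[xml]$Manual = @'
<manual>
  <link id="na" title="Engine Overview" />
  <link id="nb" title="Fuel System" />
</manual>
'@

# Dot notation: child elements and attributes are exposed as properties.
$titles = $Manual.manual.link | ForEach-Object { $_.title }
$titles

# XPath: SelectSingleNode is a method of System.Xml.XmlNode.
$node = $Manual.SelectSingleNode('//link[@id="na"]')
$firstTitle = $node.GetAttribute('title')
$firstTitle
```

Dot notation is convenient for walking a known structure, while XPath tends to be handier when you need to filter on an attribute value.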

Upvotes: 3
