light270

Reputation: 33

Powershell Regex - Capture Multiple Strings

I've been at this all day and I'm still trying to understand regex. Any help would be appreciated.

Basically, what I'm trying to do is capture sentences in a string. Example:

Criteria: this is the first criteria. blah blah blah blah Criteria: this is the second criteria.

I can use $string -match "Criteria(?<data>.*)", and it matches starting at the first "Criteria" and captures everything after it with the .*, but I would like to capture the two criteria as separate groups so I can reference them later. Also, this is in PowerShell. Thanks!

Upvotes: 2

Views: 2247

Answers (1)

AdminOfThings

Reputation: 25021

As requested, the comment converted to answer:

In PowerShell (and most regex engines), () is a grouping mechanism that creates a capture group. If you use the syntax (?<name>...), the capture group is named name. Otherwise, each set of parentheses is assigned a capture group number, and that group is populated only if it takes part in a successful match. Numbering starts at 1 and increments by 1, counting opening parentheses from left to right: the leftmost open ( begins capture group 1, the second open ( begins capture group 2, and so on until there are no more groupings. Capture group 0 is the entire match.

-match is the PowerShell regex matching operator. By default, it performs a single match and won't return multiple, complete matches: once the whole pattern has matched one time, the operator stops and returns. If the left-hand side (LHS) of -match is a single string, then -match returns True if the match is successful. Otherwise, False is returned. When there is a successful match, the $matches automatic variable is updated with the match contents. $matches is a hash table whose keys are the capture group names and numbers.
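Since -match stops after one match, one commonly used way to collect every occurrence is the .NET [regex]::Matches method. The following is only a sketch against the question's sample text: the pattern is my own assumption (it treats each criteria as ending at the first period) and would need adjusting for real data.

```powershell
# Sample text from the question.
$string = 'Criteria: this is the first criteria. blah blah blah blah Criteria: this is the second criteria.'

# Assumed pattern: the lazy .*? stops at the first period after each "Criteria:".
$pattern = 'Criteria:\s*(?<data>.*?\.)'

# [regex]::Matches returns every non-overlapping match, not just the first.
$found = [regex]::Matches($string, $pattern)

# Pull the named group out of each match so it can be referenced later.
$data = foreach ($m in $found) { $m.Groups['data'].Value }

$data[0] # this is the first criteria.
$data[1] # this is the second criteria.
```

Each element of $found is a separate Match object with its own Groups collection, so the two criteria stay independent instead of being swallowed by one greedy .* capture.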

If the LHS of the -match operator is an array, then successful matches will return the array elements that contain the successful match.

Examples

# Example: Successful Match Stops Parsing
'hi hey howdy' -match 'h\w+'
$matches.0 # returns hi since hi matched the complete regex string. it won't keep matching hey and howdy

# Example: Unnamed Capture Groups
'hi hey howdy' -match '(hi).*(ho.*)'
$matches.1 # returns hi
$matches.2 # returns howdy
$matches.0 # returns hi hey howdy

# Example: Unnamed Capture Groups With Nested Parentheses
'hi hey howdy' -match '((hi).*)(ho.*)'
$matches.1 # returns hi hey 
$matches.2 # returns hi
$matches.3 # returns howdy

# Example: Mixing named and unnamed capture groups
'hi hey howdy' -match '(?<first>hi)(.*)'
$matches.first # returns hi
$matches.1 # returns  hey howdy. capture group 1 is used because unnamed group numbering starts at 1

# Example: Capture group 1 is null and will not be added as a key to $matches!
'hi hey howdy' -match '(hello)*(hey)'
$matches.1 # returns nothing, or throws an error in strict mode 2 or higher
$matches.2 # returns hey

# Example: LHS is an array
# returns array element 1: hello
'array element 0: hi hey howdy','array element 1: hello' -match 'hel'
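Because a group that did not take part in the match is simply missing from $matches, it can be safer to test for the key before reading it. A minimal sketch, reusing the null capture group example above (the variable names are only illustrative):

```powershell
# Same pattern as above: (hello)* matches zero times, so group 1 never participates.
$ok = 'hi hey howdy' -match '(hello)*(hey)'

# $matches is a hash table, so ContainsKey avoids the strict mode error.
$hasGroup1 = $matches.ContainsKey(1) # False
$hasGroup2 = $matches.ContainsKey(2) # True

if ($hasGroup1) { $matches[1] } else { 'group 1 did not participate in the match' }
```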

Note that $matches is not updated or erased when there is an unsuccessful match, so it can still hold the results of an earlier match. $matches is also not updated when the LHS is an array. To avoid reading stale values, use -match as the condition of an if statement and read $matches only inside the block. See below:

if ('hi hey howdy' -match 'hi') {
    $matches.0 # returns hi since the if statement is true
}
if ('hi hey howdy' -match 'z') {
    $matches.0 # never runs since the if condition is false
}
$matches.0 # still returns hi because the failed -match left $matches unchanged!
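The array case behaves the same way. A short sketch (the sample strings and variable names are made up for illustration) showing that an array LHS returns the matching elements but leaves $matches holding the previous result:

```powershell
# A single-string match populates $matches.
$null = 'hi hey howdy' -match 'hi'

# Array LHS: the operator returns the matching elements instead of True/False.
$hits = 'abc', 'hello' -match 'hel'

# $matches was not touched by the array match, so this is the earlier value.
$stale = $matches[0]

$hits  # hello
$stale # hi
```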

Upvotes: 2
