Anthony Stringer

Reputation: 2001

Optional regex capture group - what am I missing?

This is the code I'm testing:

$counter = 0
@'
    1.1.1.1 (IMGTBCCWPRTIE34)
    2.2.2.2 (CMRI58672304 INC02394875 - fj)
'@.Split("`n") | % {
    $counter++
    if ($_ -match '((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?).*?((?:IR |INC)\d+)?') {
        [pscustomobject]@{
            Line = $counter
            IP = $Matches[1]
            Number = $Matches[2]
        }
    }
}

It gives me this, and I don't know why there is no number for the second line:

Line IP      Number
---- --      ------
   1 1.1.1.1       
   2 2.2.2.2     

And, of course, if I make the last part mandatory by removing the final ?, the line without a number no longer matches at all:

$counter = 0
@'
    1.1.1.1 (IMGTBCCWPRTIE34)
    2.2.2.2 (CMRI58672304 INC02394875 - fj)
'@.Split("`n") | % {
    $counter++
    if ($_ -match '((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?).*?((?:IR |INC)\d+)') {
        [pscustomobject]@{
            Line = $counter
            IP = $Matches[1]
            Number = $Matches[2]
        }
    }
}

That gives me this:

Line IP      Number     
---- --      ------     
   2 2.2.2.2 INC02394875

This works, but I'd like the regex to be all on one line:

$counter = 0
@'
    1.1.1.1 (IMGTBCCWPRTIE34)
    2.2.2.2 (CMRI58672304 INC02394875 - fj)
'@.Split("`n") | % {
    $counter++
    if ($_ -match '((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?)') {
        $ip = $Matches[1].Trim()
        if ($_ -match '((?:IR |INC)\d+)') {
            $number = $Matches[1].Trim()
        } else {
            $number = $null
        }
        [pscustomobject]@{
            Line = $counter
            IP = $ip
            Number = $number
        }
    }
}

It gives me my desired result, but I'm not sure how to get there with just one regex:

Line IP      Number
---- --      ------
   1 1.1.1.1               
   2 2.2.2.2 INC02394875  

Any help would be greatly appreciated.

Here is where I'm testing:

https://regex101.com/r/cP9wF2/1

Upvotes: 2

Views: 509

Answers (1)

briantist

Reputation: 47832

((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?)(?:.*?((?:IR |INC)\d+))?

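The non-greedy modifier on .*? makes it match as few characters as possible, so it first tries to match nothing at all. Because the capture group that follows is optional, it is allowed to match nothing too, so the overall match succeeds immediately after the IP address and the engine never looks further along the line for the number.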
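
As a quick check of that behaviour, here is a minimal sketch that runs the original pattern from the question against the second sample line and inspects what the match actually consumed. The variable names ($line, $original, $whole, $hasNumber) are only illustrative.

```powershell
# Second sample line from the question.
$line = '    2.2.2.2 (CMRI58672304 INC02394875 - fj)'

# Original pattern: lazy .*? followed by an optional capture group.
$original = '((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?).*?((?:IR |INC)\d+)?'

if ($line -match $original) {
    # $Matches[0] is the entire text consumed by the match.
    $whole = $Matches[0]
    # Groups that did not participate are simply absent from $Matches.
    $hasNumber = $Matches.ContainsKey(2)
}

"Whole match: '$whole'"          # the match stops right after the IP
"Group 2 present: $hasNumber"
```

The whole match is just the IP address, and group 2 is missing from $Matches, which is why the Number column comes out empty.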

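So instead, wrap the whole second part, from .*? through the capture group at the end, in a non-capturing group and make that outer group optional. Inside it, the (previously optional) capture group is now mandatory, so .*? has to keep expanding until the number is found. If the line has no number, the outer group is skipped as a whole and only the IP is captured.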
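
Dropped into the loop from the question, the pattern above might be used roughly like this. This is a sketch rather than a drop-in replacement: it uses a plain array of the two sample lines instead of the here-string, and $pattern, $lines and $results are names chosen for illustration.

```powershell
# Pattern from this answer: the lazy .*? and the number sit inside one optional group.
$pattern = '((?:\d{1,3}\.){3}\d{1,3}(?:-\d+)?)(?:.*?((?:IR |INC)\d+))?'

# The two sample lines from the question.
$lines = @(
    '    1.1.1.1 (IMGTBCCWPRTIE34)',
    '    2.2.2.2 (CMRI58672304 INC02394875 - fj)'
)

$counter = 0
$results = foreach ($line in $lines) {
    $counter++
    if ($line -match $pattern) {
        [pscustomobject]@{
            Line   = $counter
            IP     = $Matches[1]
            Number = $Matches[2]   # $null when the optional group did not participate
        }
    }
}

$results
```

Both lines now produce an object: the first with an empty Number, the second with INC02394875.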

Upvotes: 4
