Steve Martin

Reputation: 75

Powershell Regex Multiline parsing

I am building a script that will analyze a configuration file (a Cisco switch config) and build a report based on certain findings. Sadly, the findings must be recorded on a specific form, so this painful path is my only option short of generating each form manually.

What I'm trying to do: I am attempting to match the following multi-line block in PowerShell for evaluation:

interface vlan1
no ip address
shutdown
!

I have found multiple sources that point towards one of two options. The first (and simplest) is to load the file with Get-Content using the -Raw switch, so that the entire file is evaluated as a single string, and then use Select-String to output the specific information that I am looking for.

My basic code looks something like this:

if (Get-Content -Path U:\Testing\Test.txt -Raw | Select-String -Pattern "(?ms)interface vlan1.*no ip address.*(?!no shutdown)shutdown.*\!" -Quiet)
{
    Write-Host 'pass'
}
else
{
    Write-Host 'fail'
}

Expected outcome: if the pattern matches, I will append the finding to a file (that part I already have).

If the configuration does not contain "shutdown" on its own (without the word "no"), it will be annotated as such (again, I already have that process).

Thank you in advance for your assistance. Hopefully this is clear and concise.

Further clarity: the script returns false positives/negatives. When I run the Get-Content + Select-String pipeline outside of the if statement, I basically get the -Raw output but no match on the string itself, leading me to believe that I have an issue with the start line (interface vlan1) and the end line (!).

I have played with the structure of the regex string to try and tease out a solution, but I am still a bit vague on how multi-line matching works with Select-String.

Upvotes: 2

Views: 177

Answers (1)

mklement0

Reputation: 437238

  • Since you need to look at the file in full, there's no reason to use the Select-String cmdlet, given that -match, the regular-expression matching operator, works more effectively on strings that are already in memory.

    • Note: -match only ever finds one match (if any); if that is not sufficient, use the [regex]::Matches() .NET method. Unfortunately, there is no operator for multiple matches; GitHub issue #7867 proposes introducing one, named -matchall.
  • Your regex is too permissive: with the (?s) option in effect, . also matches newlines, so each greedy .* can span lines and matching happens across multiple blocks.
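
To make the over-matching concrete, here is a minimal sketch that runs the pattern from the question against a made-up two-block config in which vlan1 is not shut down. The sample text and variable names are assumptions for illustration only. Note also that (?!no shutdown) is evaluated at the position where shutdown starts, so it cannot see a preceding "no " and never rejects anything.

```powershell
# Hypothetical sample: vlan1 is NOT shut down, but a later block (vlan2) is.
$config = @(
    'interface vlan1'
    'no ip address'
    'no shutdown'
    '!'
    'interface vlan2'
    'no ip address'
    'shutdown'
    '!'
) -join "`r`n"

# Pattern from the question: with (?s), each .* can cross line and block boundaries.
$greedy = '(?ms)interface vlan1.*no ip address.*(?!no shutdown)shutdown.*\!'

# Line-by-line pattern: each line of the block must directly follow the previous one.
$strict = '(?m)^interface vlan1\r?\nno ip address\r?\n(?!no shutdown)shutdown\r?\n!'

$greedyResult = $config -match $greedy   # True: a false positive for vlan1
$strictResult = $config -match $strict   # False: vlan1 has no bare "shutdown" line
```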

The following uses a regex without .*, and instead explicitly matches the lines in full, including explicit matching of intervening newlines (\r?\n).[1]

This works with your sample input, but you may need to tweak the regex. Note that omitting the (?s) option makes .* match within a single line only, and that a quantifier can be made non-greedy by appending ? to it (e.g. .*?).

$re = '(?m)^interface vlan1\r?\nno ip address\r?\n(?!no shutdown)shutdown\r?\n!'
if ((Get-Content U:\Testing\Test.txt -Raw) -match $re) {
 # ... 
}
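
If more than one interface block needs checking, the [regex]::Matches() method mentioned earlier can be applied in the same way. The following is only a sketch: the generalized pattern (with a capture group for the interface name) and the three-block sample are assumptions, not part of the original question.

```powershell
# Hypothetical sample with three interface blocks (CRLF newlines).
$config = @(
    'interface vlan1', 'no ip address', 'shutdown', '!'
    'interface vlan2', 'no ip address', 'no shutdown', '!'
    'interface vlan3', 'no ip address', 'shutdown', '!'
) -join "`r`n"

# Same line-by-line structure as above, but capturing the interface name.
$reAny = '(?m)^interface (\S+)\r?\nno ip address\r?\nshutdown\r?\n!'

# [regex]::Matches() returns every match; -match would stop at the first one.
$shutInterfaces = [regex]::Matches($config, $reAny) |
    ForEach-Object { $_.Groups[1].Value }

# $shutInterfaces now holds the names of the blocks that are shut down.
```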

Note: The assumption is that there's no need to validate that the trailing ! is the only character on its line; if that is needed, append (?:\r?\n|\z).[2]
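
As a quick illustration of that optional suffix, the sketch below compares the regex with and without it. The input strings are invented for the demonstration.

```powershell
$re       = '(?m)^interface vlan1\r?\nno ip address\r?\n(?!no shutdown)shutdown\r?\n!'
$reStrict = $re + '(?:\r?\n|\z)'   # "!" must be followed by a newline or end of input

# Hypothetical inputs: a clean block, one with text after "!", and one ending at "!".
$clean = "interface vlan1`r`nno ip address`r`nshutdown`r`n!`r`n"
$extra = "interface vlan1`r`nno ip address`r`nshutdown`r`n! trailing text`r`n"
$atEnd = "interface vlan1`r`nno ip address`r`nshutdown`r`n!"

$looseOnExtra  = $extra -match $re         # True: anything may follow "!"
$strictOnExtra = $extra -match $reStrict   # False: "!" is not alone on its line
$strictOnClean = $clean -match $reStrict   # True
$strictAtEnd   = $atEnd -match $reStrict   # True, via \z
```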


[1] This regex matches both common newline formats: CRLF (\r\n, Windows) and LF (\n, Unix).

[2] Unfortunately, using $ to assert the end of a line (with the (?m) option in effect) may not work if the input uses CRLF (\r\n) newlines, because $ matches only at the position of an LF character (\n); it therefore does not match immediately after !, due to the intervening \r.
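
The behavior described in footnote [2] can be checked with a small sketch. The sample strings are made up, and the \r?$ variant is shown only as one possible alternative to the suffix suggested above, not as something the original answer proposes.

```powershell
# The same three lines, once with CRLF and once with LF newlines.
$crlf = "shutdown`r`n!`r`nend"
$lf   = "shutdown`n!`nend"

$dollarOnCrlf  = $crlf -match '(?m)^!$'      # False: \r sits between "!" and \n
$dollarOnLf    = $lf   -match '(?m)^!$'      # True
$optionalCrLf  = $crlf -match '(?m)^!\r?$'   # True: the optional \r is consumed first
$optionalCrOnLf = $lf  -match '(?m)^!\r?$'   # True: \r? matches nothing here
```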

Upvotes: 3
