martinK

Reputation: 103

PowerShell matching multiline characters between two strings

I cannot work out how to extract everything (across multiple lines) from a set of log files. Here is the sample I need to extract from:

FieldCoilConnectivity=00
ConfigError=readback radio section
NfcErrorCode=0

[compare Errors]

and I need to extract only this part:

readback radio section
NfcErrorCode=0

I am using PowerShell with this script:

$input_path = 'C:\Users\martin.kuracka\Desktop\temp\Analyza_chyb_SUEZ_CommTEst\022020\*_E.log'
$output_file = 'C:\Users\martin.kuracka\Desktop\temp\Analyza_chyb_SUEZ_CommTEst\032020\extracted.txt'
$regex = '(?<=ConfigError=)(.*)(?=[compare Errors])'
select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file

but it ends up only with this:

readback radio secti

Not even the full first line is extracted. Can you help?

Upvotes: 2

Views: 699

Answers (1)

Wiktor Stribiżew

Reputation: 626748

There are several issues:

  • You are reading the file line by line, so a match can never span lines; read each file in as a single string instead (use Get-Content $filepath -Raw)
  • You did not escape [, so [compare Errors] is treated as a character class that matches a single character from a set (you need \[compare Errors])
  • You need the RegexOptions.Singleline modifier or the (?s) inline option to make . match across line breaks
  • You need to use a non-greedy .*?, not .*, to stop at the first occurrence of [compare Errors]
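To see why the original pattern stops at "readback radio secti", here is a minimal sketch that runs it against a single line, which is all Select-String sees at a time when it reads a file via -Path. The variable names ($line, $original, $result) are illustrative and not part of the original script.

```powershell
# One line of the log, as Select-String -Path would process it (line by line).
$line = 'ConfigError=readback radio section'

# The original pattern: [compare Errors] is an unescaped character class here.
$original = '(?<=ConfigError=)(.*)(?=[compare Errors])'

# Greedy .* runs to the end of the line, then backtracks until the NEXT character
# is a member of the class. 'n' is not in the class, but 'o' is, so the match
# ends just before the final "on".
$result = [regex]::Match($line, $original).Value
$result   # readback radio secti
```

The lookahead only ever checks one character, so the literal text [compare Errors] is never actually required by this pattern.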

Use

$regex = '(?s)(?<=ConfigError=).*?(?=\s*\[compare Errors])'
Get-Content $input_path -Raw | Select-String -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file

Note I removed the capturing parentheses from around .*? since you are not using the submatch, and I added \s* before \[ to "trim" the trailing whitespace from the resulting match.
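As a quick check that does not depend on the log files, the same pattern can be tried against the sample from the question held in a string. This is only a sketch: $text and $extracted are hypothetical names, and the sample is joined with explicit LF line endings so the result is predictable.

```powershell
# Sample log text from the question, joined with LF so line endings are known.
$text = @(
    'FieldCoilConnectivity=00'
    'ConfigError=readback radio section'
    'NfcErrorCode=0'
    ''
    '[compare Errors]'
) -join "`n"

$regex = '(?s)(?<=ConfigError=).*?(?=\s*\[compare Errors])'

# Piping one string makes Select-String test the whole text at once,
# which is what Get-Content -Raw achieves for a file.
$extracted = $text | Select-String -Pattern $regex -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

$extracted
```

With this input the match should be the two lines "readback radio section" and "NfcErrorCode=0", without the blank line that precedes [compare Errors].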

Regex details

  • (?s) - singleline mode making . match across lines
  • (?<=ConfigError=) - a location immediately preceded by ConfigError=
  • .*? - any 0 or more chars, as few as possible
  • (?=\s*\[compare Errors]) - immediately to the right, there must be 0+ whitespace characters followed by [compare Errors].
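If you would rather not embed (?s) in the pattern, the RegexOptions.Singleline modifier mentioned above can be passed to the .NET regex API directly. The following is a sketch under that assumption; $text, $pattern, $options and $values are illustrative names, and the input is a shortened version of the sample.

```powershell
# Shortened sample with explicit LF line endings.
$text = "ConfigError=readback radio section`nNfcErrorCode=0`n`n[compare Errors]"

# Same pattern without the inline (?s); Singleline is supplied as an option instead.
$pattern = '(?<=ConfigError=).*?(?=\s*\[compare Errors])'
$options = [System.Text.RegularExpressions.RegexOptions]::Singleline

# [regex]::Matches returns every match; keep only the matched text.
$values = [regex]::Matches($text, $pattern, $options) | ForEach-Object { $_.Value }
$values
```

Both forms behave the same for this pattern; the inline option is simply easier to use with Select-String, which takes the pattern as a plain string.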

Upvotes: 5
