Reputation: 103
I cannot work out how to extract everything (across multiple lines) from a log file. Here is the sample I need to extract from:
FieldCoilConnectivity=00
ConfigError=readback radio section
NfcErrorCode=0
[compare Errors]
and I need to extract only this part:
readback radio section
NfcErrorCode=0
I am using PowerShell with this script:
$input_path = 'C:\Users\martin.kuracka\Desktop\temp\Analyza_chyb_SUEZ_CommTEst\022020\*_E.log'
$output_file = 'C:\Users\martin.kuracka\Desktop\temp\Analyza_chyb_SUEZ_CommTEst\032020\extracted.txt'
$regex = '(?<=ConfigError=)(.*)(?=[compare Errors])'
select-string -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file
but it ends up only with this:
readback radio secti
Not even the full first line is extracted. Can you help?
Upvotes: 2
Views: 699
Reputation: 626748
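There are several issues:
- Select-String -Path processes the files line by line, so the pattern never sees more than one line at a time; read each file in as a single string with Get-Content $filepath -Raw
- [ is a special regex character, and the unescaped [compare Errors] is treated as a character class that matches a single character from a set (you need \[compare Errors])
- You need the RegexOptions.Singleline modifier or the (?s) inline option to make . match across line breaks
- You need .*?, not .*, to stop at the first occurrence of [compare Errors]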
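To see why the original pattern stops at "readback radio secti", here is a minimal sketch that applies it to a single line, which is roughly what Select-String -Path does for each line of the file. The variable names $line, $broken and $m are only illustrative.

```powershell
# One line from the question's sample log. Select-String -Path feeds the
# pattern one line at a time, so this emulates a single step of that loop.
$line = 'ConfigError=readback radio section'

# The unescaped brackets form a character class, so the lookahead only
# requires that the NEXT character is one of: c o m p a r e (space) E r s
$broken = '(?<=ConfigError=)(.*)(?=[compare Errors])'

$m = [regex]::Match($line, $broken)

# The greedy .* backtracks from the end of the line until the next character
# is in the class: 'n' is not in it, 'o' is, so the match ends before "on".
$m.Value   # readback radio secti
```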
Use
$regex = '(?s)(?<=ConfigError=).*?(?=\s*\[compare Errors])'
Get-Content $input_path -Raw | Select-String -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file
Note I removed the capturing parentheses from around .*? since you are not using the submatch, and I added \s* before \[ to "trim" the trailing whitespace from the resulting match.
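To try the corrected pattern without touching the real log files, here is a small sketch that runs it against the sample from the question held in a here-string. The $sample and $result variables are assumptions made for illustration; in the real script the text comes from Get-Content -Raw.

```powershell
# Sample text copied from the question, stored as ONE multiline string,
# which is what Get-Content -Raw would produce for a file.
$sample = @'
FieldCoilConnectivity=00
ConfigError=readback radio section
NfcErrorCode=0
[compare Errors]
'@

# (?s) lets . cross line breaks, .*? stops at the first marker,
# and \[ matches a literal bracket instead of opening a character class.
$regex = '(?s)(?<=ConfigError=).*?(?=\s*\[compare Errors])'

$result = $sample |
    Select-String -Pattern $regex -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

# Should print the two wanted lines:
# "readback radio section" followed by "NfcErrorCode=0"
$result
```

Because the whole sample is piped in as a single string, Select-String sees the line breaks and the lookahead can reach the [compare Errors] marker on a later line.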
Regex details
- (?s) - singleline mode, making . match across lines
- (?<=ConfigError=) - a location immediately preceded with ConfigError=
- .*? - any 0 or more chars, as few as possible
- (?=\s*\[compare Errors]) - immediately to the right, there must be 0+ whitespaces followed with [compare Errors]
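As an alternative to the (?s) inline option, the RegexOptions.Singleline modifier can be passed to the .NET Regex class directly. This is only a sketch: $text is a made-up string that stands in for the result of Get-Content -Raw, and $pattern, $options and $values are illustrative names.

```powershell
# Stand-in for the raw file content, using Windows line endings.
$text = "ConfigError=readback radio section`r`nNfcErrorCode=0`r`n[compare Errors]`r`n"

# Same pattern as in the answer, but without the (?s) prefix.
$pattern = '(?<=ConfigError=).*?(?=\s*\[compare Errors])'

# Singleline makes . match line breaks, just like (?s) does.
$options = [System.Text.RegularExpressions.RegexOptions]::Singleline

$values = [regex]::Matches($text, $pattern, $options) |
    ForEach-Object { $_.Value }

# Should print "readback radio section" and "NfcErrorCode=0" on two lines.
$values
```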
Upvotes: 5