Sergio

Reputation: 323

PowerShell regex does not match near newline

I have output from an exe in the following form:

Compression          : CCITT Group 4
Width                : 3180

and I am trying to extract CCITT Group 4 into $var with a PowerShell script:

$var = [regex]::match($exeoutput,'Compression\s+:\s+([\w\s]+)(?=\n)').Groups[1].Value

The tester at http://regexstorm.net/tester says the regex Compression\s+:\s+([\w\s]+)(?=\n) is correct, but in PowerShell it does not match. How can I write the regex correctly?

Upvotes: 1

Views: 1579

Answers (2)

mklement0

Reputation: 437353

Wiktor Stribiżew's helpful answer simplifies your regex and shows you how to use PowerShell's -match operator as an alternative.

Your follow-up comment about piping to Out-String fixing your problem implies that your problem was that $exeOutput contained an array of lines rather than a single, multiline string.

This is indeed what happens when you capture the output from a call to an external program (*.exe): PowerShell captures the stdout output lines as an array of strings (the lines without their trailing newline).
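
A minimal sketch of why the original regex finds nothing, using a hypothetical two-element array as a stand-in for the captured exe output. It assumes the default $OFS: when an array is passed where a single string is expected, PowerShell joins the elements with a space, so the resulting string contains no newline for (?=\n) to find.

```powershell
# Hypothetical stand-in for the lines captured from the exe's stdout
$exeOutput = 'Compression          : CCITT Group 4', 'Width                : 3180'

# The captured output is an array, not a single multiline string
$typeName = $exeOutput.GetType().Name      # Object[]

# Converting the array to [string] joins the elements with a space (default $OFS)
$asString = [string] $exeOutput
$hasNewline = $asString.Contains("`n")     # False

# The same conversion happens when the array is passed to [regex]::Match,
# so the lookahead for \n can never succeed
$m = [regex]::Match($exeOutput, 'Compression\s+:\s+([\w\s]+)(?=\n)')
$matched = $m.Success                      # False
```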

As an alternative to converting array $exeOutput to a single, multiline string with Out-String (which, incidentally, is slow[1]), you can use a switch statement to operate on the array directly:

# Stores 'CCITT Group 4' in $var
$var = switch -regex ($exeOutput) { 'Compression\s+:\s+(.+)' { $Matches[1]; break } }

Alternatively, given the specific format of the lines in $exeOutput, you could use the ConvertFrom-StringData cmdlet, which can parse the lines into key-value pairs for you once the : separator has been replaced with =:

$var = ($exeoutput -replace ':', '=' | ConvertFrom-StringData).Compression

[1] Use of a cmdlet is generally slower than using an expression; with a string array $array as input, you can achieve what $array | Out-String does more efficiently with $array -join "`n", though note that Out-String also appends a trailing newline.
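
As a rough illustration of the footnote, the sketch below joins a hypothetical sample array with -join and then applies the simplified regex to the resulting single string. The sample lines are an assumption that mirrors the output shown in the question.

```powershell
# Hypothetical sample standing in for the captured exe output
$array = 'Compression          : CCITT Group 4', 'Width                : 3180'

# Join the lines with LF; unlike Out-String, no trailing newline is appended
$joined = $array -join "`n"
$endsWithNewline = $joined.EndsWith("`n")   # False

# With a real multiline string, .+ stops at the LF that ends the first line
$var = [regex]::Match($joined, 'Compression\s+:\s+(.+)').Groups[1].Value
```

Joining with "`n" rather than the platform newline keeps a carriage return out of the captured value, since . matches CR but not LF.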

Upvotes: 1

Wiktor Stribiżew

Reputation: 626747

You want to get all text from some specific pattern till the end of the line. So, you do not even need the lookahead (?=\n), just use .+, because . matches any char but a newline (LF) char:

$var = [regex]::match($exeoutput,'Compression\s+:\s+(.+)').Groups[1].Value

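Or you can use the -match operator and, once a match is found, read the captured value from $Matches[1]. Note that this requires $exeoutput to be a single string; with an array on the left-hand side, -match returns the matching elements and does not populate $Matches:

# -match returns $true/$false for a single string; test it before reading $Matches
if ($exeoutput -match 'Compression\s*:\s*(.+)') {
    $var = $Matches[1]
}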
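
A small sketch of how -match behaves with the two input shapes that come up in this thread, an array of lines versus a single multiline string. The sample data and the names $lines, $filtered, and $text are hypothetical.

```powershell
# Hypothetical sample standing in for the captured exe output
$lines = 'Compression          : CCITT Group 4', 'Width                : 3180'

# Array on the left-hand side: -match acts as a filter and returns the matching elements
$filtered = @($lines -match 'Compression\s*:\s*(.+)')
$filteredCount = $filtered.Count

# Single string on the left-hand side: -match returns a Boolean and fills $Matches
$text = $lines -join "`n"
$var = $null
if ($text -match 'Compression\s*:\s*(.+)') {
    $var = $Matches[1]
}
```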

Upvotes: 2
