Paul H

Reputation: 13

Pull out a specific chunk from the middle of a log file with PowerShell

I am trying to write a PowerShell script that parses Apache Tomcat logs and gives me the useful error message.

Here is an example log, scrubbed a bit:

03-Aug-2021 11:42:36.445 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version:        Apache Tomcat/10.11.12
[stuff]
03-Aug-2021 11:43:06.832 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.something.or.other.classname]
[stuff]
03-Aug-2021 11:43:06.848 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [com.something.or.other.classname]
[stuff]
03-Aug-2021 11:48:23.176 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version:        Apache Tomcat/10.11.12
[stuff]
03-Aug-2021 11:48:54.482 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.something.or.other.classname]
[stuff]
03-Aug-2021 11:48:54.498 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [com.something.or.other.classname]
[stuff]
EOF

My desired output is:

03-Aug-2021 11:48:54.482 SEVERE [localhost-startStop-2] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.something.or.other.classname]
[stuff]

I have an awk one-liner that I got some help making to accomplish this:

awk '/Server version/{ chunk="" } /SEVERE/{ logme=(chunk=="") } logme{ chunk=chunk $0 RS } END{ printf "%s", chunk }' logfile.txt

And instructions for people to use less to find the same info:

less logfile.txt
G
?Server\ version
/SEVERE
[read from the line containing the first match for SEVERE, and stop reading at the next instance of SEVERE]

Here is the not-quite-working powershell script that I have ended up with thus far:

$fileContents = Get-Content "X:\full\path\to\logfile.txt"
$lineNum = (($fileContents | Select-String 'Server\ version')[-1].LineNumber) - 1
$laststartup = ( $fileContents | Select -Skip $lineNum )
$lineNum2 = (($laststartup | Select-String 'SEVERE')[-1].LineNumber) - 1
$laststartup | Select -Skip $lineNum2 | more

And here is some garbage I had half written to try to fully get the output I wanted:

$fileContents = Get-Content "X:\full\path\to\logfile.txt"
$lineNum = (($fileContents | Select-String 'Server version')[-1].LineNumber) - 1
$laststartup = ( $fileContents | Select -Skip $lineNum )
$lineNum2 = (($laststartup | Select-String 'SEVERE')[-1].LineNumber) - 1
$lineNum3 = (($laststartup | Select-String 'SEVERE')[-1].LineNumber)
$AfterFirstSevere = ( $laststartup | Select -Skip $lineNum3 )
$lineNum4 = (($AfterFirstSevere | Select-String 'SEVERE')[-1].LineNumber) - 1

But I couldn't quite tweak it to work. I just want to find the last instance of "Server version" in logfile.txt (which marks the most recent startup), then find the first instance of SEVERE after that point (the first failure), and print that line and all the lines after it, stopping at the second SEVERE (without printing it).

Any help would be appreciated.

Upvotes: 1

Views: 69

Answers (2)

mklement0

Reputation: 437111

While not syntax-compatible, PowerShell's switch statement is conceptually similar to awk, and its -File parameter allows for much faster line-by-line processing of a text file than combining Get-Content with ForEach-Object.

The following collects all lines of interest in an array[1] (rather than a single, multi-line string):

$linesOfInterest= @(); $serverLineFound = $false; $severeCount = 0

switch -Regex -CaseSensitive -File logfile.txt {
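  '\bServer version\b' {
      # A new startup begins: reset the state, including any lines captured for an earlier startup.
      $serverLineFound = $true; $severeCount = 0; $linesOfInterest = @()
      continue
    }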
  '\bSEVERE\b' {
      if (-not $serverLineFound -or ++$severeCount -gt 1) { continue }
      $linesOfInterest = @($_)
      continue
    }
  default {
      if ($severeCount -eq 1) { $linesOfInterest += $_ }
    }
}

# Output the captured array of lines.
$linesOfInterest

[1] For syntactical convenience the array is "extended" with +=, which technically means creating a new array every time. For a smallish number of iterations that won't matter, but for many iterations it is preferable to use an efficiently extensible .NET data structure, such as [System.Collections.Generic.List[object]] - see this answer.
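
As a rough illustration of the footnote, here is a minimal sketch of the same switch-based state machine collecting into a [System.Collections.Generic.List[string]] instead of extending an array with +=. The function name Get-FirstSevereChunk and its -Path parameter are hypothetical and used only for illustration, and the ::new() syntax assumes PowerShell 5 or later. The sketch also clears the list whenever a new "Server version" line is seen, on the assumption that output should be empty if the most recent startup logged no SEVERE line (which is how the awk command in the question behaves).

```powershell
# Hypothetical helper (name and parameter are assumptions, not from the original answer).
function Get-FirstSevereChunk {
  param([Parameter(Mandatory)] [string] $Path)

  # Efficiently extensible list; .Add() appends in place and returns nothing.
  $lines = [System.Collections.Generic.List[string]]::new()
  $serverLineFound = $false
  $severeCount = 0

  # Resolve to a full path first so the result does not depend on the current directory.
  $fullPath = Convert-Path -LiteralPath $Path

  switch -Regex -CaseSensitive -File $fullPath {
    '\bServer version\b' {
      # New startup: forget anything captured for earlier startups.
      $serverLineFound = $true; $severeCount = 0; $lines.Clear()
      continue
    }
    '\bSEVERE\b' {
      # Only the first SEVERE line after the latest startup opens the chunk.
      if (-not $serverLineFound -or ++$severeCount -gt 1) { continue }
      $lines.Add($_)
      continue
    }
    default {
      # Lines between the first and second SEVERE belong to the chunk.
      if ($severeCount -eq 1) { $lines.Add($_) }
    }
  }

  # Emit the captured lines one by one.
  $lines
}
```

It could then be called as Get-FirstSevereChunk -Path logfile.txt; wrapping the call in @(...) guarantees an array even when zero lines or only one line is captured.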

Upvotes: 1

Daniel

Reputation: 5114

There's probably a nicer way of doing it, but this is what I got.

$log = '.\tomcat.log'

# Get line number of last 'Server version' line
$lineNumber = Select-String -Path $log -Pattern '(?<=Server\ version:\s+).*' | 
    Select-Object -Last 1 -ExpandProperty LineNumber

# use switch with -wildcard to look for first occurrence of 'SEVERE'
# and take all lines until the next
$in = $false
$wantedLines = switch -wildcard (Get-Content $log | Select-Object -Skip $lineNumber ) {
    '*SEVERE*' {
        if ($in) { break }
        $in = $true
        $_
    }
    Default {
        if ($in) { $_ }
    }
}

# Check what we got
$wantedLines

Upvotes: 1
