MilenKo

Reputation: 31

Extract log lines from a starting string to the first time stamp after with PowerShell

This is my first time reaching out, as I have been stuck on something and scratching my head for over a week. I started with PowerShell only a few months ago and love using it for my scripts, but apparently my skills still need improving. I cannot find a simple, elegant way to extract a log entry from a clearly defined start line up to the first empty line (CR/LF) or time stamp that follows.

Below is a sample of the log I am trying to extract data from. Some details about it: the number of lines in each log entry varies, the last line of each entry also varies, and the time stamp differs per entry depending on when the test was executed.

cls
# Grab the profile system path
$userProfilePath = $env:LOCALAPPDATA

# Define log path
$logPath = "$userProfilePath\DWIO\logs\IOClient.txt"

# Define the START log line matching string
# This includes the tests that PASS and FAIL
$logStartLine = " TEST "

# Find all START log lines matching the string and grab their line number
$StartLine = (Get-Content $logPath | select-string $logStartLine)

#Get content from file
foreach ($start in $StartLine) {

    # Extract the date time stamp from every starting line
    $dateStamp =  ($start -split ' ')[0]

    #Regex pattern to compare two strings
    $pattern = "(.*)$dateStamp"

    #Perform the operation
    $result = [regex]::Match($file,$pattern).Groups[1].Value

    Write-Host $result
}

The log format is like:

08-31 16:32:20 INFO  - [IOBridgeThread - mPerformAndComputeIntegrityCheck] - BridgeAsyncCall - mPerformAndComputeIntegrityCheck Result = TEST PASSED
Average Camera Temperature :40.11911°C
Module  0
    Nb Points: 50673 pts (>32500)
    Noise:
        AMD: 0.00449238 mm (<0.027)
        STD DEV: 0.006961088 mm
    Dead camera: false
Module  1
    Nb Points: 53809 pts (>40000)
    Noise:
        AMD: 0.0055302843 mm (<0.027)
        STD DEV: 0.00869096 mm
    Dead camera: false
Module consistency
    Weak module: false
    M0 to M1
        Distance: 0.007857603 mm (<0.015)
        Angle: 0.022567615 degrees (<0.07)
    Target
        Position: 0.009392071 mm (<5.0)
        Angle: 0.54686683 degrees (<5.0)
        Intensity: 120.35959

08-31 16:32:20 INFO  - [cIOScannerService RUNNING] - Scanner State is now Scan-Ready

The issue is that the last line of every log entry differs, as do the lines in between. The only logical way to extract an entry correctly is therefore to match the first line, which always contains " TEST ", and then grab everything up to the next time stamp or the empty line that always appears at the end of the entry.

I am just not sure how to achieve that. The code I have returns empty matches; however, if I echo $StartLine, it correctly shows the starting log lines.

Upvotes: 1

Views: 979

Answers (2)

The fourth bird

Reputation: 163207

You can match the first line that starts with a date-time-like pattern and contains TEST. Then capture in group 1 all of the following lines that do not start with a date-time-like pattern.

(?m)^\d{2}-\d{2} \d{2}:\d{2}:\d{2}.*\bTEST\b.*\r?\n((?:(?!\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*(?:\r?\n|$))*)

Explanation

  • (?m) Inline modifier for multiline
  • ^ Start of line
  • \d{2}-\d{2} \d{2}:\d{2}:\d{2}.*\bTEST\b.* Match a date time like pattern followed by TEST in the line
  • \r?\n Match a newline
  • ( Capture group 1
    • (?: Non capture group
      • (?!\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*(?:\r?\n|$) If the line does not start with a date time like pattern, match the whole line followed by either a newline or the end of the line
    • )* Close non capture group and repeat 0+ times
  • ) Close group 1
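The pattern can also be tried without the log file. The following is a minimal sketch, assuming an inline sample abridged from the question's log; the names `$sample` and `$blocks` are illustrative and not part of the original answer.

```powershell
# Abridged sample of the log from the question (illustrative data only)
$sample = @'
08-31 16:32:20 INFO  - [IOBridgeThread - mPerformAndComputeIntegrityCheck] - BridgeAsyncCall - mPerformAndComputeIntegrityCheck Result = TEST PASSED
Module  0
    Nb Points: 50673 pts (>32500)
    Dead camera: false

08-31 16:32:20 INFO  - [cIOScannerService RUNNING] - Scanner State is now Scan-Ready
'@

# Single quotes keep PowerShell from trying to expand anything inside the regex
$pattern = '(?m)^\d{2}-\d{2} \d{2}:\d{2}:\d{2}.*\bTEST\b.*\r?\n((?:(?!\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*(?:\r?\n|$))*)'

# Group 1 holds everything after the TEST line up to the next time-stamped line;
# TrimEnd() drops the trailing blank line that closes the entry
$blocks = [regex]::Matches($sample, $pattern) |
    ForEach-Object { $_.Groups[1].Value.TrimEnd() }

$blocks
```

Note that `[regex]::Matches` is case-sensitive by default, whereas `Select-String` is not, so the two can differ if the log ever contains a lowercase "test".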

See a regex101 demo and a .NET regex demo (click on the Table tab) and a powershell demo

You can use Get-Content -Raw to get the contents of a file as one string.

$textIOClient = Get-Content -Raw "$userProfilePath\DWIO\logs\IOClient.txt"

$pattern = '(?m)^\d{2}-\d{2} \d{2}:\d{2}:\d{2}.*\bTEST\b.*\r?\n((?:(?!\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*(?:\r?\n|$))*)'
Select-String -Pattern $pattern -InputObject $textIOClient -AllMatches | ForEach-Object {$_.Matches} | ForEach-Object {$_.Groups[1].Value}
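As a small illustration of why -Raw matters: by default Get-Content emits one string per line, so a pattern that contains a newline never gets a chance to match. The sketch below uses made-up strings (not the real log) and a simplified pattern; `$lines`, `$raw`, `$perLine` and `$whole` are illustrative names.

```powershell
# Stand-in for what Get-Content returns by default: one string per line
$lines = 'first TEST line', 'detail A', 'detail B'

# Stand-in for what Get-Content -Raw returns: the whole text as one string
$raw = $lines -join "`n"

# Simplified pattern that has to cross a line break to succeed
$pattern = 'TEST.*\n(.*)'

# Matching line by line finds nothing, because no single line contains "\n"
$perLine = @($lines | Where-Object { $_ -match $pattern })

# Matching the single string captures the line that follows the TEST line
$whole = [regex]::Match($raw, $pattern).Groups[1].Value

"Per-line matches: $($perLine.Count)"
"Captured from the raw string: $whole"
```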

Upvotes: 2

FoxDeploy

Reputation: 13537

I found an approach I really loved in this answer elsewhere on the site:

PowerShell - Search String in text file and display until the next delimeter

Building on that, I wrote a little code around it to show you how to use the results:

$itemCount = 1
$Server = ""
$Data = @()
$Collection = @()
Switch(Get-Content C:\temp\stackTestlog.txt){
    {[String]::IsNullOrEmpty($Server) -and !([String]::IsNullOrWhiteSpace($_))}{$Server = $_;Continue}
    {!([String]::IsNullOrEmpty($Server)) -and !([String]::IsNullOrEmpty($_))}{$Data+="`n$_";Continue}
    {[String]::IsNullOrEmpty($_)}{$Collection+=[PSCustomObject]@{Server=$Server;Data=$Data};Remove-Variable Server; $Data=@()}
}
If(!([String]::IsNullOrEmpty($Server))){$Collection+=[PSCustomObject]@{Server=$Server;Data=$Data};Remove-Variable Server; $Data=@()}

if(($null -eq $collection) -or ($Collection.Count -eq 0)){
    Write-Warning "Could not parse file"
}
else{
    Write-Output "Found $($collection.Count) members"
    ForEach($item in $Collection){
        #add additional code here if you need to do something with each parsed log entry
        Write-Output "Item #$itemCount $($item.Server) records"
        Write-Host $item.Data -ForegroundColor Cyan
        $itemCount++
    }
}

You can extend this at the commented line inside the loop, and then remove the Write-Output and Write-Host lines as well.

Here's what it looks like in action.

Found 2 members
Item #1 08-31 16:32:20 INFO  - [IOBridgeThread - mPerformAndComputeIntegrityCheck] - BridgeAsyncCall - mPerformAndComputeIntegrityCheck Result = TEST PASSED records

Average Camera Temperature :40.11911°C
#abridged...
Item #2 blahblahblah
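The code above treats every block separated by blank lines as a record. If only the entries that start on a " TEST " line are wanted, and an entry should also end at the next time stamp, a line-by-line variant could look like the sketch below. This is not the linked answer's code: the function name `Get-TestLogBlock`, its `-Line` parameter, and the sample lines are all hypothetical, and it assumes the `MM-dd HH:mm:ss` stamp format shown in the question and PowerShell 5 or later (for `::new()`).

```powershell
# Hypothetical helper: collects each block that starts on a time-stamped line
# containing " TEST " and ends at the first blank or time-stamped line after it.
function Get-TestLogBlock {
    param([string[]]$Line)

    $stampPattern = '^\d{2}-\d{2} \d{2}:\d{2}:\d{2} '
    $header = $null   # $null means "not inside a block"
    $body   = [System.Collections.Generic.List[string]]::new()

    foreach ($text in $Line) {
        $isStamped = $text -match $stampPattern
        $isBlank   = [string]::IsNullOrWhiteSpace($text)

        # A blank or time-stamped line closes the block in progress
        if ($null -ne $header -and ($isStamped -or $isBlank)) {
            [pscustomobject]@{ Header = $header; Body = $body.ToArray() }
            $header = $null
            $body.Clear()
        }

        if ($isStamped -and $text -cmatch ' TEST ') {
            $header = $text          # start a new block
        }
        elseif ($null -ne $header) {
            $body.Add($text)         # detail line of the current block
        }
    }

    # Emit a block that runs to the end of the input
    if ($null -ne $header) {
        [pscustomobject]@{ Header = $header; Body = $body.ToArray() }
    }
}

# Illustrative input; with a real file this would be: Get-TestLogBlock -Line (Get-Content $logPath)
$lines = @(
    '08-31 16:32:20 INFO  - [IOBridgeThread] - Result = TEST PASSED'
    'Module  0'
    '    Dead camera: false'
    ''
    '08-31 16:32:20 INFO  - [cIOScannerService RUNNING] - Scanner State is now Scan-Ready'
)
$found = @(Get-TestLogBlock -Line $lines)
$found | ForEach-Object { $_.Header; $_.Body }
```

Each returned object keeps the starting line in `Header` and the detail lines in `Body`, so the time stamp or the PASSED/FAILED result can be parsed from `Header` later if needed.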

Upvotes: 0
