Adil Hindistan

Reputation: 6605

Parsing XML-Like log file

I have a log file that records events as shown below. I would like to convert each event into a PSCustomObject. It looks like XML, but casting the output of Get-Content to [xml] gives me an error:

Cannot convert value "System.Object[]" to type "System.Xml.XmlDocument". Error: "This document already has a 'DocumentElement' node."

<event date='Jan 06 01:46:16' severity='4' hostName='ABC' source='CSMFAgentPolicyManager' module='smfagent.dll' process='AeXNSAgent.exe' pid='1580' thread='1940' tickCount='306700046' >
  <![CDATA[Setting wakeup time to 3600000 ms (Invalid DateTime) for policy: DefaultWakeup]]>
</event>

Here is the piece of code I have so far

   <#
.EXAMPLE    
source    : MaintenanceWindowMgr
process   : AeXNSAgent.exe
thread    : 8500
hostName  : ABC
severity  : 4
tickCount : 717008140
date      : Jan 10 19:45:00
module    : PatchMgmtAgents.dll
pid       : 11984
CData     : isAbidingByMaintenanceWindows() - yes
#>
$logpath = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'
$hash=[ordered]@{};
$log = get-content $logpath | % {

    ## handle Event start
    ## sample: <event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >
    if ($_ -match '^<event') {

        if ($hash) {                
            ## Convert the hashtable to PSCustomObject before clearing it
            New-Object PSObject -Property $hash
            $hash.Clear()
        }

        $line = $_ -replace '<event ' -replace ' >' -split "'\s" -replace "'"               
        $line | % { 

            $name,$value=$_ -split '='                
            $hash.$name=$value
        }        
    }

    ## handle CData
    ## Sample: <![CDATA[Schedule Software Update Application Task ({A1939DC8-DA4A-4E46-9629-0500C2383ECA}) triggered at 2014-01-10 18:50:00 -5:00]]>
    if ($_ -match '<!') {
        $hash.'CData' = ($_ -replace '<!\[CDATA\[' -replace '\]\]>$').ToString().Trim()
    }
}
  $log 

Unfortunately, the resulting object is not in the form I want.

$log|gm


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType Definition                    
----        ---------- ----------                    
Equals      Method     bool Equals(System.Object obj)
GetHashCode Method     int GetHashCode()             
GetType     Method     type GetType()                
ToString    Method     string ToString()   

When I try to collect all the objects from the output, I am losing the NoteProperties that are generated when I convert the hash to PSCustomObject

   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                                     
----        ----------   ----------                                                                                                                                     
Equals      Method       bool Equals(System.Object obj)                                                                                                                 
GetHashCode Method       int GetHashCode()                                                                                                                              
GetType     Method       type GetType()                                                                                                                                 
ToString    Method       string ToString()                                                                                                                              
Equals      Method       bool Equals(System.Object obj)                                                                                                                 
GetHashCode Method       int GetHashCode()                                                                                                                              
GetType     Method       type GetType()                                                                                                                                 
ToString    Method       string ToString()                                                                                                                              
CData       NoteProperty System.String CData=isAbidingByMaintenanceWindows() - yes                                                                                      
date        NoteProperty System.String date=Jan 10 18:45:00                                                                                                             
hostName    NoteProperty System.String hostName=ABC                                                                                                             
module      NoteProperty System.String module=PatchMgmtAgents.dll                                                                                                       
pid         NoteProperty System.String pid=11984                                                                                                                        
process     NoteProperty System.String process=AeXNSAgent.exe                                                                                                           
severity    NoteProperty System.String severity=4                                                                                                                       
source      NoteProperty System.String source=MaintenanceWindowMgr                                                                                                      
thread      NoteProperty System.String thread=8500                                                                                                                      
tickCount   NoteProperty System.String tickCount=713408140 

What am I missing here?

Upvotes: 0

Views: 12609

Answers (3)

Ansgar Wiechers

Reputation: 200203

XML files must have a single root (or documentElement) node. Since your log file seems to contain multiple <event> tags without a common root element, you can add the missing documentElement like this:

$logpath  = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'
[xml]$log = "<logroot>$(Get-Content $logpath)</logroot>"

After that you can process your log with the usual methods, e.g.:

$fmt = 'MMM dd HH:mm:ss'

$log.SelectNodes('//event') |
  select @{n='date';e={[DateTime]::ParseExact($_.date, $fmt, $null)}},
         severity, hostname, @{n='message';e={$_.'#cdata-section'}}
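
One caveat worth checking on your own system: passing $null as the format provider makes ParseExact use the current culture, so an abbreviated month name such as 'Jan' may fail to parse on a non-English system. A minimal sketch that uses the invariant culture instead (the sample date string is taken from the question; the variable names are illustrative):

```powershell
$fmt     = 'MMM dd HH:mm:ss'
$culture = [Globalization.CultureInfo]::InvariantCulture

# The log omits the year, so the parsed value takes the current year
$parsed = [DateTime]::ParseExact('Jan 06 01:46:16', $fmt, $culture)
```

Because the year is filled in from the current date, entries logged before a year boundary would probably need separate handling.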

If you prefer custom objects you can easily create them like this:

$fmt = 'MMM dd HH:mm:ss'

$log.SelectNodes('//event') | % {
  New-Object -Type PSObject -Property @{
    'Date'     = [DateTime]::ParseExact($_.date, $fmt, $null)
    'Severity' = $_.severity
    'Hostname' = $_.hostname
    'Message'  = $_.'#cdata-section'
  }
}
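
If you want every attribute as a property, as in the question, rather than a hand-picked few, the same root-wrapping idea can be sketched as follows. This is only an illustration under stated assumptions: the log text is inlined as a here-string instead of being read from Agent.log, and the names $sample, $doc, $props and $events are made up for the example.

```powershell
# Two sample events from the question, inlined as a stand-in for the real log file
$sample = @'
<event date='Jan 06 01:46:16' severity='4' hostName='ABC' source='CSMFAgentPolicyManager' module='smfagent.dll' process='AeXNSAgent.exe' pid='1580' thread='1940' tickCount='306700046' >
  <![CDATA[Setting wakeup time to 3600000 ms (Invalid DateTime) for policy: DefaultWakeup]]>
</event>
<event date='Jan 10 19:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='717008140' >
  <![CDATA[isAbidingByMaintenanceWindows() - yes]]>
</event>
'@

# Wrap the fragments in a single root element so the text parses as XML
[xml]$doc = "<logroot>$sample</logroot>"

$events = foreach ($node in $doc.SelectNodes('//event')) {
    $props = [ordered]@{}
    # Copy every attribute of the <event> element, whatever its name
    foreach ($attr in $node.Attributes) { $props[$attr.Name] = $attr.Value }
    # The CDATA section is the element's only text content
    $props['CData'] = $node.InnerText.Trim()
    [PSCustomObject]$props
}
```

Piping $events to Get-Member should then list one NoteProperty per attribute plus CData.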

Upvotes: 4

mjolinor

Reputation: 68243

Using your split method:

# $file is assumed to hold the path to the log file ($logpath in the question)
$hash = [ordered]@{}
$regex = '^<event (.+) >$'
$lines = (gc $file) -match $regex -replace $regex,'$1'
foreach ($line in $lines)
 {
         $hash.Clear() 
         $line -split "'\s" -replace "'" |
         foreach {
                   $name,$value=$_ -split '='                
                   $hash.$name=$value
                 }

        [PSCustomObject]$hash 
} 

Upvotes: 1

Adil Hindistan

Reputation: 6605

I initially thought my problem was that the original hash was not sorted, but I later figured out where the actual problem was. The check below caused an initial PSCustomObject without any NoteProperty to be created:

  if ($hash) { .... }

Even a newly initialized, empty hash satisfies that condition, as shown below:

PS H:\> $myhash=[ordered]@{}
PS H:\> if ($myhash) {"yay"}
yay

So to fix it, I simply changed the check:

# CData is the last record, if hash has it, it's ready to convert to PSCustomObject
if ($hash.CData) { ... }  

Here is the updated code:

   $hash=[ordered]@{}        
    $logpath = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'       
    Get-Content $logpath | % {

        ## handle Event start            
        if ($_ -match '^<event') {       
            # CData is the last record, if hash has it, it's ready to convert to PSCustomObject
            if ($hash.CData) {                        
                ## Convert the hashtable to PSCustomObject before clearing it
                [PSCustomObject]$hash                
                $hash.Clear()
            }

            ## sample: <event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >
            $line = $_ -replace '<event ' -replace ' >' -split "'\s" -replace "'"               
            $line | % { 
                        $name,$value=$_ -split '='                               
                            $hash.$name=$value                        
            }        
        }

        ## handle CData
        ## Sample: <![CDATA[Schedule Software Update Application Task ({A1939DC8-DA4A-4E46-9629-0500C2383ECA}) triggered at 2014-01-10 18:50:00 -5:00]]>
        if ($_ -match '<!') {
            $hash.'CData' = ($_ -replace '<!\[CDATA\[' -replace '\]\]>$').ToString().Trim()
        }
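    }

    ## Emit the final event: no later <event line follows it to trigger the conversion
    if ($hash.CData) { [PSCustomObject]$hash }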
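
The truthiness behaviour described above can be checked in isolation. The sketch below also shows testing the Count property as one possible alternative to checking for CData. That alternative is an assumption of this example, not the check used in the updated code, and the variable names are illustrative.

```powershell
$empty = [ordered]@{}

# An empty ordered dictionary still converts to $true in a condition
$truthy = [bool]$empty

# Count only becomes non-zero once a key has been stored
$hasEntriesBefore = $empty.Count -gt 0
$empty['date'] = 'Jan 10 18:45:00'
$hasEntriesAfter = $empty.Count -gt 0
```

A Count-based check would emit an object as soon as any attribute has been stored, while the CData check waits until the event body has been seen.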

Thanks @mjolinor for helpful comments!

Upvotes: 0
