DylanB
DylanB

Reputation: 13

Building a script to convert XML file into JSON (with a specific format)

I am working on a script to convert an XML output file into a JSON file with specific formatting. Below is the XML output to be converted. I am having trouble accomplishing this task and am asking what changes to the script at the bottom could I make that would accomplish this?

Snippet from output.xml:

<System>
<tools>
<tool_PSList>
<process>
<process>
<pid>4188</pid>
<ownersid>"S-1-5-21-3556032652-3399990904-2739522603-1001"</ownersid>
<ownername>"Bochm"</ownername>
<ownerdomain>"DESKTOP-LS0HCH3"</ownerdomain>
-<file>
<name>"C:\ProgramFiles\WindowsApps\Microsoft.Messaging_1.10.22012.0_x86__8wekyb3d8bbwe\SkypeHost.exe"</name>
<size>44032</size>
<attr str="ARCHIVE" hex="0x20"/>
<cert_exists>"FALSE"</cert_exists>
<cert_verified>"FALSE"</cert_verified>
<cert_result>"TRUST_E_NOSIGNATURE"</cert_result>
<cert_comment>"The file is not signed"</cert_comment>

The format required for the JSON output is as show below (idealoutput.json with the timestamp replaced by the time the script runs):

{
"os":  "win",
"ts":  "timestamp here",
"value":  "C:\\Program Files\\WindowsApps\\Microsoft.Messaging_1.10.22012.0_x86__8wekyb3d8bbwe\\SkypeHost.exe\"; cert_comment=\"The file is not signed\"
}

Below is the current script I have produced in its current state it is rather broken unfortunately as this is my first week working with PowerShell, XML and JSON. I can include the other files mentioned in my script below but I am guessing there is an easier way than importing the format and then attempting to modify it with the output file so I haven't listed it here. If you'd like to see the other XML files mentioned below just let me know and I would be happy to oblige. Any insight or advice would be appreciated!

#Puts output of crowdresponse in $xml
$xml = [xml](Get-Content -Path C:\Users\Bochm\Downloads\CrowdResponse\output.xml)

#Puts the formating in $format
$format = [xml](Get-Content -Path C:\Users\Bochm\Downloads\CrowdResponse\format.xml)

#Filters the output of crowdresponse into results 
$result = $xml.system.tools.tool_PSList.processes.process.file |
          Select-Object name, cert_comment

#adds the mac address to result
$result += $xml.system | Select-Object macv4

$cert_valid = $xml.system.tools.tool_PSList.processes.process.file | Where-Object {
    $_.cert_signer -notmatch "Microsoft Corporation" -and
    $_.cert_signer -notmatch "Microsoft Windows" -and
    $_.cert_comment -match "The file is signed and the signature was verified"
} | Select-Object name, cert_comment  

$cert_invalid = $xml.system.tools.tool_PSList.processes.process.file |
    Where-Object { $_.cert_comment -match "The file is not signed" } |
    Select-Object name, cert_comment  

$node.innerXml += $cert_invalid.innerXml | Select-Object name, cert_comment

$result | ConvertTo-Json |
    Out-File "C:\Users\Bochm\Downloads\CrowdResponse\output.json"
$cert_valid | ConvertTo-Json |
    Out-File "C:\Users\Bochm\Downloads\CrowdResponse\certs_valid.json"
$cert_invalid | ConvertTo-Json |
    Out-File "C:\Users\Bochm\Downloads\CrowdResponse\certs_invalid.json"
$node | Select-Object os, ts, value, 'text' | ConvertTo-Json |
    Out-File "C:\Users\Bochm\Downloads\CrowdResponse\format.json"

Code for followup question:

$name = $xml.SelectNodes('//file') | ForEach-Object {
          $_.cert_signer -notmatch  "Microsoft Corporation" -and
          $_.cert_signer -notmatch "Microsoft Windows" -and
          $_.cert_comment -match "The file is signed and the signature was verified"
        } | Select-Object cert_comment

$name = $xml.system.tools.tool_PSList.processes.process.file |
        SelectNodes('//file') |
        ForEach-Object {
          $_.cert_signer -notmatch "Microsoft Corporation" -and
          $_.cert_signer -notmatch "Microsoft Windows" -and
          $_.cert_comment -match "The file is signed and the signature was verified"
        } | Select-Object name

Code for hopefully the final question:

$xml = Get-Content 'C:\Users\Bochm\Downloads\CrowdResponse\output.xml'

$name = $xml.system.tools.tool_PSList.processes.process.file| Where-Object {
   $_.cert_signer -notmatch  "Microsoft Corporation" -and
   $_.cert_signer -notmatch "Microsoft Windows" -and
   $_.cert_comment -match "The file is signed and the signature was verified"
 } | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | Select-Object name 
#$name    = $xml.SelectSingleNode('//name').'#text'
$name = $xml.system.tools.tool_PSList.processes.process.file | Where-Object {
   $_.cert_signer -notmatch  "Microsoft Corporation" -and
   $_.cert_signer -notmatch "Microsoft Windows" -and
   $_.cert_comment -match "The file is signed and the signature was verified"
 } | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | Select-Object cert_comment 

$name = $xml.system.tools.tool_PSList.processes.process.file | Where-Object {
   $_.cert_signer -notmatch  "Microsoft Corporation" -and
   $_.cert_signer -notmatch "Microsoft Windows" -and
   $_.cert_comment -match "The file is signed and the signature was verified"
 } | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | Select-Object name 

$name = $xml.system.tools.tool_PSList.processes.process.file| Where-Object {
   $_.cert_comment -match "The file is not signed"
 } | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | Select-Object cert_comment 

$name = $xml.system.tools.tool_PSList.processes.process.file| Where-Object {
   $_.cert_comment -match "The file is not signed"
 } | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | Select-Object name 
#$name    = $xml.SelectSingleNode('//name').'#text'

#$comment = $xml.SelectSingleNode('//cert_comment').'#text'

$prop = [ordered]@{
  'os'    = 'win'
  'ts'    = (Get-Date).DateTime
  'value' = "$name; cert_comment=$comment"
}

New-Object -Type PSCustomObject -Property $prop |  ConvertTo-Json | Out-File "C:\Users\Bochm\Downloads\CrowdResponse\end.json"

Upvotes: 1

Views: 3604

Answers (1)

Ansgar Wiechers
Ansgar Wiechers

Reputation: 200203

It's still not quite clear to me what the desired transformation is, since your XML sample isn't well-formed, and 2 of the 3 properties from the output JSON don't seem to appear in it.

Assuming you want to extract the content of the <name> and <cert_comment> nodes and enrich it with OS and timestamp you could do something like this:

[xml]$xml = Get-Content 'C:\path\to\output.xml'

$name    = $xml.SelectSingleNode('//name').'#text'
$comment = $xml.SelectSingleNode('//cert_comment').'#text'

$prop = [ordered]@{
  'os'    = 'win'
  'ts'    = (Get-Date).DateTime
  'value' = "$name; cert_comment=$comment"
}

New-Object -Type PSCustomObject -Property $prop | ConvertTo-Json

Edit:

$name = $xml.SelectNodes('//file') | ForEach-Object {
          $_.cert_signer -notmatch  "Microsoft Corporation" -and
          $_.cert_signer -notmatch "Microsoft Windows" -and
          $_.cert_comment -match "The file is signed and the signature was verified"
        } | Select-Object cert_comment

$name = $xml.system.tools.tool_PSList.processes.process.file |
        SelectNodes('//file') |
        ForEach-Object {
          $_.cert_signer -notmatch "Microsoft Corporation" -and
          $_.cert_signer -notmatch "Microsoft Windows" -and
          $_.cert_comment -match "The file is signed and the signature was verified"
        } | Select-Object name

These two code snippets won't work. For filtering a list of objects you need Where-Object, not ForEach-Object:

 ... | Where-Object {
   $_.cert_signer -notmatch  "Microsoft Corporation" -and
   $_.cert_signer -notmatch "Microsoft Windows" -and
   $_.cert_comment -match "The file is signed and the signature was verified"
 } | ...

Also, you don't want to assign a list of comments to $name but create a list of custom objects, so you need to do the object creation inside a ForEach-Object loop that you put after the filter:

... | ForEach-Object {
  $prop = [ordered]@{
    'os'    = 'win'
    'ts'    = (Get-Date).DateTime
    'value' = "{0}; cert_comment={1}" -f $_.name, $_.cert_comment
  }
  New-Object -Type PSCustomObject -Property $prop
} | ...

Upvotes: 2

Related Questions