yardyy

Reputation: 89

Parse CSV file in an object

We have some pretty large log files (3-8 GB) that are delimited with spaces, with 64 headers. I need to search these for a term and pull out the matching lines, but I only need the data from 5 of the 64 headers.

I came across this presentation by Tobias Weltner. After viewing it, I have some snippets of code, but I seem to be stuck on actually getting any results.

Essentially, I need to extract 5 headers from a much larger file. The code that I have so far is:

$Search = "J89HD"
$logfile = "C:\Logs\CP8945KGT.log"

ForEach-Object {

    $line = $_
    $infos = $line -split " "

    $hashtable = [Ordered]@{}
        $hashtable.date = $infos[0] 
        $hashtable.time = $infos[1]
        $hashtable.Index = $infos[2]
        $hashtable.source = $infos[3]
        $hashtable.destination = $infos[-1]

    New-Object -TypeName psobject -Property $hashtable

    Get-Content -Path $hashtable |
        Where-Object { $_ -match "$Search" } |
        Select-Object -Last 20 |
        Out-GridView
}

The error message that I get is:

Get-Content: Cannot find path 'C:\System.Collections.Specialized.OrderedDictionary' because it does not exist.
At C:\scripts\testing01.ps1:17 char:1
+ Get-Content -Path  $hashtable | Where-Object {$_ -match "$Search"} |  ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\System.Colle...deredDictionary:String) [Get-Content], ItemNotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand

Upvotes: 2

Views: 204

Answers (1)

sodawillow

Reputation: 13176

Here is a snippet based on what you want to do and pieces of your code. This is untested as we don't have sample input data to test with.

$Search = "J89HD"
$logfile = "C:\Logs\CP8945KGT.log"

#load file
Get-Content -Path $logfile |

    #apply search filter on each line
    Where-Object { $_ -match $Search } |

    #keep only last 20 lines
    Select-Object -Last 20 |

    #for each line
    ForEach-Object {

        #store the line in a variable
        $line = $_

        #split the line on spaces to get an array
        $infos = $line -split " "

        #build a hashtable with properties holding specific cells of the array
        $hashtable = [Ordered]@{
            date = $infos[0] 
            time = $infos[1]
            Index = $infos[2]
            source = $infos[3]
            destination = $infos[-1]
        }

        #build a custom object with the properties in the hashtable
        New-Object -TypeName psobject -Property $hashtable

    #display the objects in a window
    } | Out-GridView
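
Since no sample data was posted, the splitting step can at least be checked on a made-up line. The line below is purely hypothetical (the real logs have 64 fields) and only shows how the array indices map to fields:

```powershell
# Hypothetical sample line; the real log format was not posted.
$line = "2017-03-01 12:30:05 42 hostA J89HD extra hostB"

# -split " " returns an array of the space-separated fields.
$infos = $line -split " "

# Index 0 is the first field; index -1 is the last field.
$entry = [PSCustomObject]@{
    date        = $infos[0]
    time        = $infos[1]
    Index       = $infos[2]
    source      = $infos[3]
    destination = $infos[-1]
}

# Emit the object so it shows up in the console.
$entry
```

If the real files pad fields with several spaces in a row, splitting on a single space produces empty array elements; in that case -split '\s+' is probably the better choice.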

I'll try to explain what is wrong with your syntax:

  1. ForEach-Object runs its script block once for each element it receives from the pipeline, so it has to be placed in the pipeline, after the command that produces the lines

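  2. the hashtable is created empty and its properties are assigned afterwards; this does work, but declaring the properties inside the braces, before the closing }, is more concise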

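  3. Get-Content -Path expects a file path and you're giving it a hashtable. PowerShell converts the hashtable to a string (its type name) and resolves it against the current location, which is why the error mentions 'C:\System.Collections.Specialized.OrderedDictionary'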

  4. Get-Content should not be in the ForEach-Object block since you don't want to load the file more than once; it's the beginning of your pipeline.
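
Since the files are several gigabytes, it may also be worth letting Select-String read and filter the file instead of Get-Content | Where-Object, as it can be noticeably faster on large files. This is a rough sketch, untested against the real logs; the function name Get-LogEntry and its parameters are made up for illustration. Note that -SimpleMatch treats the search term as literal text rather than as a regular expression:

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Get-LogEntry {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Search,
        [int] $Last = 20
    )

    # Select-String reads the file itself and returns one match object per matching line.
    Select-String -Path $Path -Pattern $Search -SimpleMatch |
        Select-Object -Last $Last |
        ForEach-Object {
            # .Line holds the full text of the matching line.
            $infos = $_.Line -split " "

            [PSCustomObject]@{
                date        = $infos[0]
                time        = $infos[1]
                Index       = $infos[2]
                source      = $infos[3]
                destination = $infos[-1]
            }
        }
}

# Usage with the values from the question:
# Get-LogEntry -Path "C:\Logs\CP8945KGT.log" -Search "J89HD" | Out-GridView
```

The structure is the same as the pipeline above: filter first, keep the last 20 matches, and only then split those lines into objects, so the splitting work is done on at most 20 lines.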

Upvotes: 1
