script0207
script0207

Reputation: 385

Fetch data from txt file and export it to CSV

I have a below data in a text file.

author   ; testname1
Revision ; 121
Date     ; 10/5/2018
Path     ; dev/test1
Message  ; notes: testdata1
author   ; testname2
Revision ; 1212
Date     ; 10/6/2018
Path     ; dev/test2
Message  ; notes: testdata2
author   ; testname3
Revision ; 1213
Date     ; 10/5/2018
Path     ; dev/test3
Message  ; notes: testdata3

I want to read this and export to CSV which look like below.

author,Revision,Date,Path,Message
testname1,121,10/5/2018,dev/test1,notes: testdata1
testname2,1212,10/6/2018,dev/test2,notes: testdata2
testname3,1213,10/5/2018,dev/test3,notes: testdata3

Any suggestions?

I have tried below code

$local:InputFilePath   = "path of file"
$local:OutFilePathCSV  = "path of csv file"
$local:CSVDelimiter    = ","
$local:OutDataList     = New-Object -TypeName System.Collections.Arraylist
$local:CurrentDataList = New-Object -TypeName System.Collections.Hashtable

Select-String -Path $InputFilePath -Pattern "^[\s]" -NotMatch | ForEach-Object {
    $local:CurrentLine  = ($_.Line).TrimEnd()
    $CurrentLine

    $OutDataList.Add($(New-Object -TypeName System.Management.Automation.PSObject -Property $CurrentDataList)) | Out-Null
    $CurrentDataList.Clear()
    if ($CurrentLine -match "^[\s]*([\w]*)[\s]+(.*)") {
        $CurrentDataList.Add($matches[1], $matches[2])
        $matches[1]
        $matches[2]

        #break
    }
}
$OutDataList |
    Sort-Object -Property Serial |
    Select-Object -Property author, Revision, Date, Action, Path |
    Export-Csv -Path $OutFilePathCSV -Delimiter $CSVDelimiter -NoTypeInformation

Upvotes: 0

Views: 371

Answers (2)

J. Doe
J. Doe

Reputation: 48

You're on the right path. However, there are a couple of ways to do this. This is the easiest, most straight-forward method. However, because you have all the different columns listed individually in the input file, you will need to preemptively know, at least,the number of columns, and the delimiter used (here it's the semicolon).

Sticking with your example text file, and as much as your code as possible, this is how you would do it.

$InputFilePath     = "path of file"
$OutFilePathCSV    = "path of csv file"
$CSVDelimiter      = ","
$OutDataList       = New-Object -TypeName System.Collections.Arraylist
$ColumnNumbers     = 5
$InputFileDelimter = ';'

$InputFileData = Select-String -Path $InputFilePath -Pattern "^[\s]" -NotMatch
for ($i = 0 ; $i -lt $InputFileData.count; $i += $ColumnNumbers) {
    $CurrentLine = New-Object PSObject
    for ($j = 0 ; $j -lt $ColumnNumbers; $j++) {
        if ($InputFileData[$i+$j].Line -match "^(.*)\s*$InputFileDelimter\s*(.*)\s*$") {
            $CurrentLine | Add-Member -MemberType NoteProperty -Name $matches[1] -Value $matches[2]
        }
    }
    $OutDataList.Add($CurrentLine)
}
$OutDataList |
    Select-Object -Property Author, Revision, Date, Action, Path |
    Export-Csv -Path $OutFilePathCSV -Delimiter $CSVDelimiter -NoTypeInformation

Tips/info:

  • With your regex "^[\s]*([\w]*)[\s]+(.*)"
    • ^[\s]* is redundant because you are already excluding lines that begin with white space in the select string.
    • ([\w]*) will not capture columns with spaces in the name (not necessary here, but for future reference).
    • (.*) will capture the delimiter as well.
  • With the regex "^(.*)\s*$InputFileDelimter\s*(.*)\s*$"
    • ^(.*)\s* will capture full column names without the white space before the delimiter. You can swap this for ^\w* if it will always be only one word.
    • $InputFileDelimter\s*(.*)\s*$ will capture the entire column value, without leading or trailing white space.

Notes:

  • $CSVDelimter isn't necessary because Export-CSV defaults to using a comma.
  • You don't have a column named Serial, so the sort doesn't do anything in your code.

Hope this helps! Good luck with PowerShell!

Edit:

From the comment on another answer below: Fetch data from txt file and export it to CSV

An alternative to a fixed read count is splitting at the first field with a RegEX using a postive look ahead $data = (Get-Content .\data.txt -Raw) -split "`n(?=author)"

This is actually a really good idea, and I like it. The only problem is that you have to be sure your first column is always listed first, per group of properties.

Upvotes: 0

Maximilian Burszley
Maximilian Burszley

Reputation: 19664

Found a super simple way to solve your problem by creating a hashtable and exporting it to a csv:

#requires -Version 3

$path = 'C:\Temp\data.txt'
$data = Get-Content -Path $path -ReadCount 5

$collection = foreach ($obj in $data)
{
    $out = [ordered]@{}
    foreach ($line in $obj.Split("`n"))
    {
        $a, $b = ($line -split ';').Trim()
        $out[$a] = $b
    }
    [pscustomobject]$out
}

$newPath = 'C:\Temp\file.csv'
$collection | Export-Csv -Path $newPath -Encoding ascii -NoTypeInformation -Force

This solution assumes your text document is well-formed.

Upvotes: 1

Related Questions