Bill

Reputation: 13

PowerShell: Parse a structured text file and save to .CSV

I'm very new to PowerShell; I have only been using it for about 2 weeks.

I have a file that is structured like this:

Service name: WSDL 
Service ID: 14234321885 
Service resolution path: /gman/wsdlUpdte 
Serivce endpoints: 
-------------------------------------------------------------------------------- 
Service name: DataService 
Service ID: 419434324305 
Service resolution path: /widgetDate_serv/WidgetDateServ 
Serivce endpoints:  
http://servername.company.com:1012/widgetDate_serv/WidgetDateServ
-------------------------------------------------------------------------------- 
Service name: SearchService 
Service ID: 393234543546 
Service resolution path: /ProxyServices/SearchService 
Serivce endpoints:  
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
-------------------------------------------------------------------------------- 
Service name: Worker 
Service ID: 14187898547 
Service resolution path: /ProxyServices/Worker 
Serivce endpoints:  
http://servername.company.com:131009/Services/Worker/v9
--------------------------------------------------------------------------------

I'd like to parse the file and put Service name, Service ID, Service resolution path and Service endpoints (which sometimes contain multiple values or none) into individual columns of a CSV file.

Beyond using Get-Content and looping through the file, I have no idea where to start.

Any help would be appreciated. Thanks.

Upvotes: 1

Views: 38643

Answers (4)

Esperento57

Reputation: 17492

With PowerShell 5 you can use the fabulous 'ConvertFrom-String' command:

$template=@'
Service name: {ServiceName*:SearchService} 
Service ID: {serviceID:393234543546} 
Service resolution path: {ServicePath:/ProxyServices/SearchService} 
Serivce endpoints:
http://{ServiceEP*:servername.company.com:13010/Services/SearchService_5_0}
http://{ServiceEP*:servername2.tcompany.tcom:13011/testServices/SearchService_45_0}
--------------------------------------------------------------------------------
Service name: {ServiceName*:Worker} 
Service ID: {serviceID:14187898547} 
Service resolution path: {ServicePath:/ProxyServices/Worker} 
Serivce endpoints:
http://{ServiceEP*:servername3.company.com:13010/Services/SearchService}
--------------------------------------------------------------------------------
Service name: {ServiceName*:WSDL} 
Service ID: {serviceID:14234321885} 
Service resolution path: {ServicePath:/gman/wsdlUpdte} 
Serivce endpoints:
http://{ServiceEP*:servername4.company.com:13010/Services/SearchService_5_0}
--------------------------------------------------------------------------------
'@


# Parse the file into objects using the template
$listexploded=Get-Content -Path "c:\temp\file1.txt" | ConvertFrom-String -TemplateContent $template

# Join the endpoints of each service into one column and export to CSV
$listexploded |select *, @{N="ServiceEP";E={$_.ServiceEP.Value -join ","}} -ExcludeProperty ServiceEP | Export-Csv -Path "C:\temp\res.csv" -NoTypeInformation

Upvotes: 3

Shay Levy

Reputation: 126922

Give this a try:

  1. Read the file content as one string
  2. Split it by 81 hyphens
  3. Split each resulting item on the colon character and take the last array element
  4. Create a new object for each item

    $pattern = '-'*81
    $content = Get-Content D:\Scripts\Temp\p.txt | Out-String
    $content.Split($pattern,[System.StringSplitOptions]::RemoveEmptyEntries) | Where-Object {$_ -match '\S'} | ForEach-Object {

        $item = $_ -split "\s+`n" | Where-Object {$_}

        New-Object PSobject -Property @{
            Name=$item[0].Split(':')[-1].Trim()
            Id = $item[1].Split(':')[-1].Trim()
            ResolutionPath=$item[2].Split(':')[-1].Trim()
            Endpoints=$item[4..($item.Count)]
        } | Select-Object Name,Id,ResolutionPath,Endpoints
    }
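
One thing to watch with the snippet above: Endpoints is an array, and Export-Csv writes an array property as its type name rather than its values. Here is a minimal sketch of the same split-into-blocks idea that joins the endpoints into one column. The sample text, the ';' separator and the '-{10,}' separator regex are my own assumptions for illustration, not part of the answer above.

```powershell
# Sample input held in a here-string so the sketch runs without a file.
$text = @'
Service name: WSDL
Service ID: 14234321885
Service resolution path: /gman/wsdlUpdte
Serivce endpoints:
--------------------------------------------------------------------------------
Service name: SearchService
Service ID: 393234543546
Service resolution path: /ProxyServices/SearchService
Serivce endpoints:
http://servername.company.com:13010/Services/SearchService_5_0
http://servername2.company.com:13010/Services/SearchService_5_0
--------------------------------------------------------------------------------
'@

# Split into one block per service on any line that is a run of hyphens.
$records = $text -split '(?m)^-{10,}\s*$' |
    Where-Object { $_ -match '\S' } |
    ForEach-Object {
        # Break the block into trimmed, non-empty lines.
        $lines = @($_ -split '\r?\n' | ForEach-Object { $_.Trim() } | Where-Object { $_ })
        # Everything after the fourth line (the endpoints header) is a URL.
        $endpoints = @($lines | Select-Object -Skip 4)
        [pscustomobject]@{
            Name           = ($lines[0] -split ':\s*', 2)[1]
            Id             = ($lines[1] -split ':\s*', 2)[1]
            ResolutionPath = ($lines[2] -split ':\s*', 2)[1]
            Endpoints      = $endpoints -join ';'   # one CSV cell, possibly empty
        }
    }

# ConvertTo-Csv shows the result on screen; swap in Export-Csv to write a file.
$records | ConvertTo-Csv -NoTypeInformation
```

Splitting each field with a limit of 2 keeps values that contain a colon intact, which matters if a field ever holds a URL.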
    

Upvotes: 1

JPBlanc

Reputation: 72680

Here is a general way of parsing files that contain records, records of records, and so on. It uses the powerful PowerShell switch statement with regular expressions, together with the begin/process/end function template.

Load it, debug it, correct it ...

function Parse-Text
{
  [CmdletBinding()]
  Param
  (
    [Parameter(mandatory=$true,ValueFromPipeline=$true)]
    [string]$ficIn,
    [Parameter(mandatory=$true,ValueFromPipeline=$false)]
    [string]$ficOut
  )

  begin
  {
    $svcNumber = 0
    $urlnum = 0
    $Service = @()
    $Service += @{}
  } 

  Process 
  {
    switch -regex -file $ficIn
    {
      # End of a service
      "^-+"
      {
        $svcNumber +=1
        $urlnum = 0
        $Service += @{}
      }
      # URL line; a service can have any number of them
      "(http://.+)" 
      {
        $urlnum += 1
        $url = $matches[1]
        $Service[$svcNumber]["Url$urlnum"] = $url
      }
      # Fields
      "(.+) (.+): (.+)" 
      {
        $name,$value = $matches[2,3]
        $Service[$svcNumber][$name] = $value
      }
    }
  }

  end 
  {
    #$service[3..0] | % {New-Object -Property $_ -TypeName psobject} | Export-Csv c:\Temp\ws.csv
    # Get all the services except the last one (it is empty because the parsed file is terminated by ----...----)
    $tmp = $service[0..($service.count-2)] | Sort-Object @{Expression={$_.keys.count };Descending=$true}
    $tmp | % {New-Object -Property $_ -TypeName psobject} | Export-Csv $ficOut
  }
}


Clear-Host
Parse-Text -ficIn "c:\Développements\Pgdvlp_Powershell\Apprentissage\data\Text2Parse.txt" -ficOut "c:\Temp\ws.csv"
cat "c:\Temp\ws.csv"
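
For comparison, here is a smaller sketch of the same switch -Regex technique that runs over an in-memory array of lines and collects all URLs of a service into a single Endpoints property instead of Url1, Url2, and so on. The variable names, the ';' separator and the sample lines are assumptions made for illustration; like the function above, it assumes every record ends with a line of hyphens.

```powershell
# Hypothetical sample lines; in practice these would come from Get-Content.
$lines = @(
    'Service name: Worker'
    'Service ID: 14187898547'
    'Service resolution path: /ProxyServices/Worker'
    'Serivce endpoints:'
    'http://servername.company.com:131009/Services/Worker/v9'
    '--------------------------------------------------------------------------------'
)

$current = [ordered]@{}                                   # fields of the record being built
$urls = New-Object System.Collections.Generic.List[string]

$services = switch -Regex ($lines) {
    '^-+\s*$' {
        # Separator: emit the finished record and reset the accumulators.
        if ($current.Count -gt 0) {
            $current['Endpoints'] = $urls.ToArray() -join ';'
            [pscustomobject]$current
        }
        $current = [ordered]@{}
        $urls = New-Object System.Collections.Generic.List[string]
        continue
    }
    '^\s*(https?://\S+)' {
        # Endpoint URL: remember it for the current record.
        $urls.Add($Matches[1])
        continue
    }
    '^(.+?):\s*(.*)$' {
        # "Key: value" field; copy the captures before running other comparisons.
        $key, $value = $Matches[1], $Matches[2].Trim()
        # The endpoints header carries no value of its own, so skip it.
        if ($key -notlike '*endpoints*') { $current[$key] = $value }
    }
}

$services | ConvertTo-Csv -NoTypeInformation
```

Because every record ends up with the same set of properties, the sort by key count used above is not needed in this variant.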

Upvotes: 0

user189198

Reputation:

Try this:

Get-Content | ? { $_ -match ': ' } | % { $_ -split ': ' } | Export-Csv Test.csv;

Basically it boils down to:

  1. Get all text content as an array
  2. Filter for lines that contain ': '
  3. For each line left over, split it on ': '
  4. Export object arrays to a CSV file named test.csv

Hope this points you in the right direction.

Note: Code is untested.
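
A caveat on the one-liner above: -split returns plain strings, and Export-Csv exports an object's properties, so strings typically end up as a single Length column. Here is a minimal sketch, under my own assumptions (the Key/Value property names and the sample lines are illustrative), that wraps each pair in an object first. It produces one row per field, not one row per service, so it is only a starting point.

```powershell
# Hypothetical sample lines; in practice these would come from Get-Content <path>.
$lines = @(
    'Service name: WSDL'
    'Service ID: 14234321885'
    'Service resolution path: /gman/wsdlUpdte'
    'Serivce endpoints:'
)

$pairs = $lines |
    Where-Object { $_ -match ': ' } |
    ForEach-Object {
        # Limit the split to two parts so values containing ': ' stay intact.
        $key, $value = $_ -split ': ', 2
        # Emit an object so the CSV gets real Key and Value columns.
        [pscustomobject]@{ Key = $key; Value = $value.Trim() }
    }

# ConvertTo-Csv shows the result on screen; swap in Export-Csv to write a file.
$pairs | ConvertTo-Csv -NoTypeInformation
```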

Upvotes: 1
