ZedBrannigan

Reputation: 611

Parse and change the output of a system through PowerShell

First off, I have little to no experience with PowerShell so far. An upstream system generates output in the wrong shape for my needs, and I want to use PowerShell to fix it. The output I get from the system looks like this:

TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^...^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')^|^N^|^LIKE^|^('6','7','8','9')^|^...^|^Y^|^NOT IN^|^('1','2','15','16','17')^|^Y^|^NOT IN^|^('18','19','20','21','22')

Each line has a starting part (TEST1^|^9999^|^) followed by 1 to n tuples (example: Y^|^NOT IN^|^('1','2','3')^|^).

This is how I want it to look:

TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')
TEST1^|^9999^|^N^|^LIKE^|^('4','5','6','7')
TEST1^|^9999^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')
TEST2^|^9998^|^N^|^LIKE^|^('6','7','8','9')
TEST2^|^9998^|^Y^|^NOT IN^|^('1','2','15','16','17')
TEST2^|^9998^|^Y^|^NOT IN^|^('18','19','20','21','22')

So each tuple should be printed on its own line, with the starting part attached in front.

My approach is to do the equivalent of AWK in PowerShell, but so far I don't understand how to handle an indeterminate number of tuples or how to repeat the starting block.

Thank you very much in advance for your help!

Upvotes: 0

Views: 70

Answers (3)

arco444

Reputation: 22861

The data looks quite regular, so you could loop over it using | as the delimiter and count the cells after the first two in threes:

$data = @"
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')^|^N^|^LIKE^|^('6','7','8','9')^|^Y^|^NOT IN^|^('1','2','15','16','17')^|^Y^|^NOT IN^|^('18','19','20','21','22')
"@

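# Split the sample into lines (tolerating CRLF line endings) and process each one
$data -split "`r?`n" | % {
    $ds = $_.split("|")
    # The first two cells form the prefix that is repeated on every output line
    $heading = "$($ds[0])|$($ds[1])"
    $line = ""   # reset per input line so nothing leaks over from a previous run
    $j = 0

    for($i = 2; $i -lt $ds.length; $i += 1) {
        # Drop the trailing ^ after a value list, but keep the leading ^ of the separator
        $line += "|$($ds[$i])" -replace "\^(\((?:'\d+',?)+\))\^?",'^$1'
        $j += 1
        if($j -eq 3) {
            "$heading$line"   # emit to the pipeline rather than to the host only
            $line = ""
            $j = 0
        }
    }
}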

Upvotes: 1

Ansgar Wiechers

Reputation: 200453

I'd split the lines at ^|^ and recombine the fields of the resulting array in a loop. Something like this:

$sp = '^|^'

Get-Content 'C:\path\to\input.txt' | % {
  $a = $_ -split [regex]::Escape($sp)
  for ($i=2; $i -lt $a.length; $i+=3) {
    "{0}$sp{1}$sp{2}$sp{3}$sp{4}" -f $a[0,1,$i,($i+1),($i+2)]
  }
} | Set-Content 'C:\path\to\output.txt'
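
To try the same split-and-recombine idea without touching any files, a minimal sketch along these lines could work. The function name Expand-TupleLine, the inline sample, and the guard that skips an incomplete trailing group are assumptions made for illustration, not part of the answer above.

```powershell
# Hypothetical helper: splits one record at the literal separator and emits one
# output line per (flag, operator, list) triple, prefixed with the first two fields.
function Expand-TupleLine {
    param(
        [Parameter(Mandatory)][string]$Line,
        [string]$Separator = '^|^'
    )

    # -split takes a regex, so the separator has to be escaped.
    $fields = $Line -split [regex]::Escape($Separator)

    # Step through the fields three at a time, starting after the two-field prefix.
    # The "$i + 2" bound skips an incomplete trailing group (an assumption about
    # how malformed input should be treated).
    for ($i = 2; $i + 2 -lt $fields.Length; $i += 3) {
        ($fields[0, 1, $i, ($i + 1), ($i + 2)]) -join $Separator
    }
}

$sample = "TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')"
$result = @(Expand-TupleLine -Line $sample)
$result
```

For file input, the function could be dropped into the same Get-Content / Set-Content pipeline in place of the inline loop.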

Upvotes: 3

vonPryz

Reputation: 24081

Parsing an arbitrary-length string record into row records is quite error-prone. A simple solution is to process the data row by row and build the output as you go.

Here is a simple illustration of how to process a single row. Processing the whole input file and writing the output is left as a trivial exercise for the reader.

$s = "TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^Y^|^NOT IN^|^('8','9','10','11','12')"
$t = $s.Split([char[]]')', [StringSplitOptions]::RemoveEmptyEntries) # Split at each ')'; the char[] cast avoids the string overload picked by PowerShell 7
$testNum = ([regex]::match($t[0], "(?i)(test\d+\^\|\^\d+)")).value # Hunt for 1st column values
$t[0] = $t[0] + ')' # Restore the split character that Split removed
for($i=1;$i -lt $t.Length; ++$i) { $t[$i] = $testNum + $t[$i] + ')' } # Prepend 1st column values and restore the split character

$t
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')
TEST1^|^9999^|^N^|^LIKE^|^('4','5','6','7')
TEST1^|^9999^|^Y^|^NOT IN^|^('8','9','10','11','12')
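
The single-row steps above could be wrapped in a function and applied to every input line. This is only a sketch under stated assumptions: every tuple ends with a closing parenthesis, no ')' appears inside a value, and the prefix is always the first two fields. Split-TupleRow and the sample rows are hypothetical names and data chosen for illustration.

```powershell
# Hypothetical wrapper around the single-row approach shown above.
function Split-TupleRow {
    param([Parameter(Mandatory)][string]$Row)

    # Cut the row at every closing parenthesis; the cast to char[] selects
    # the character-based overload of Split on all PowerShell versions.
    $parts = $Row.Split([char[]]')', [StringSplitOptions]::RemoveEmptyEntries)

    # The prefix is everything up to the second separator, e.g. TEST1^|^9999.
    $prefix = [regex]::Match($parts[0], '^[^^]+\^\|\^[^^]+').Value

    # The first part already carries the prefix; the others need it prepended.
    # Each part also gets back the ')' that Split removed.
    $parts[0] + ')'
    for ($i = 1; $i -lt $parts.Length; $i++) {
        $prefix + $parts[$i] + ')'
    }
}

$rows = @(
    "TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')",
    "TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')"
)
$out = @($rows | ForEach-Object { Split-TupleRow -Row $_ })
$out
```

Replacing $rows with the output of Get-Content and piping $out to Set-Content would cover a whole file.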

Upvotes: 1
