BobbyS
BobbyS

Reputation: 49

Editing a specific column of data in a text file with powershell

So I’ve had a a request to edit a csv file by replacing column values with a a set of unique numbers. Below is a sample of the original input file with a a header line followed by a couple of rows. Note that the rows have NO column headers.

HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|IE|USD|20200605|EUR200717||
DD|GZFD|IE|USD|20200605|EUR200717||

What I’m looking to do is change say the values in column 3 with a unique number.

So far I have the following …

$i=0

$txtin = Get-Content "C:\Temp\InFile.txt" | ForEach {"$($_.split('|'))"-replace $_[2],$i++} |Out-File C:\Temp\csvout.txt

… but this isn’t working as it removes the delimiter and adds numbers in the wrong places …

HH0###0000000SLH30400110000100000002000000202006060202006050011100
1D1D1 1G1Z1F1D1 1I1E1 1U1S1D1 12101210101610151 1E1U1R1210101711171 1 1
2D2D2 2G2Z2F2D2 2I2E2 2U2S2D2 22202220202620252 2E2U2R2220202721272 2 2

Ideally I want it to look like this, whereby the values of 'IE' have been replaced by '01' and '02' in each row ...

HH ### SLH304 01100001 2 20200606 20200605 011100

DD|GZFD|01|USD|20200605|EUR200717||

DD|GZFD|02|USD|20200605|EUR200717||

Any ideas on how to resolve would be much appreciated.

Upvotes: 1

Views: 819

Answers (1)

Theo
Theo

Reputation: 61028

I think by spreading this out to multiline code will make it easier:

$txtin = Get-Content 'C:\Temp\InFile.txt'
# loop through the lines, skipping the first line
for ($i = 1; $i -lt $txtin.Count; $i++){
    $parts = $txtin[$i].Split('|')      # or use regex -split '\|'
    if ($parts.Count -ge 3) {
        $parts[2] = '{0:00}' -f $i      # update the 3rd column
        $txtin[$i] = $parts -join '|'   # rejoin the parts with '|'
    }
}

$txtin | Out-File -FilePath 'C:\Temp\csvout.txt'

Output will be:

HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||

Updated to use the more robust check suggested by mklement0. This avoids errors when the line does not have at least three parts in it after the split

Upvotes: 2

Related Questions