crafter
crafter

Reputation: 6297

gawk FIELDWIDTHS convert to CSV, but convert cents to dollars for some fields

I am converting a transaction detail file received from the bank from fixed width into a CSV file.

 gawk '$1=$1' FIELDWIDTHS='1 8 12 23 6 6 6 7 13 9 3 10 1 2 3 1 3 1 1 2 6 3 12 3 3 1 2 9 19 140  ' OFS=, $f  >> $csvfilename

Some of the fields are currency fields and listed in cents. For example, the second last field will have a value "1234567". However, the value I want output is "12345.67".

I am looking for how I can do that conversion. Something like

FIELDWIDTHS='1 8 12 23 6 6 6 7.2 13 '

As an example. the following input is received

1000000000000027302080000000541282******876421111821111821111800648300000000021687000000000PU 831712 DR90 020N000000000323132106663755
0000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

The resulting CSV file is

1,00000000,000002730208,0000000448387******6098,211115,211116,211116,0120456,0000000051580,000000000,PU ,754045 , ,DR,90 ,0,20N,0,0,00,000003,211,31912995587 , , 0,0,00,000000000,0000000000000000000,0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

The 8th field has a value "0120456". I would like the value in the resulting comma delimited file to read "01204.56"

Upvotes: -3

Views: 208

Answers (2)

Ed Morton
Ed Morton

Reputation: 204456

It sounds like this is what you're trying to do:

$ cat file
foo12345bar

$ awk -v FIELDWIDTHS='3 5 3' -v OFS=',' '{sub(/..$/,".&",$2)} 1' file
foo,123.45,bar

Upvotes: 1

Nic3500
Nic3500

Reputation: 8621

Here is a solution that I think you can use, or adapt.

Bash script:

#!/bin/bash

line="1000000000000027302080000000541282******876421111821111821111800648300000000021687000000000PU 831712     DR90 020N000000000323132106663755 0000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"

echo "$line" | awk -f so.awk

so.awk

BEGIN {
    FIELDWIDTHS = "1 8 12 23 6 6 6 7 13 9 3 10 1 2 3 1 3 1 1 2 6 3 12 3 3 1 2 9 19 140"
}

{
    # Print first 7 fields
    for (i=1; i<=7; i++) {
        printf "%s,", $i
    }

    # Print 8th field
    firstXchars = substr($8, 0, length($8)-2)
    lasttwochars = substr($8, length($8)-1)
    printf "%s.%s,", firstXchars, lasttwochars

    # Print the rest of fields, except the last one (I do not want the , for the last one)
    for (j=9; j<=NF-1; j++) {
        printf "%s,", $j
    }

    # Print the last field
    print $NF
}

What it does

  • print the first 7 fields separated by , , no changes
  • for the 8th field, extract the characters except the last two
  • for the 8th field, extract the last two characters
  • print field 8 with (except last two).(last two),
  • print all remaining fields but the last one separated by ,
  • print the last field

Output

$ ./so.bash 
1,00000000,000002730208,0000000541282******8764,211118,211118,211118,00648.30,0000000021687,000000000,PU ,831712    , ,DR,90 ,0,20N,0,0,00,000003,231,32106663755 ,000,000,0,00,000000000,0000000000000000000,000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
                                                                     ^^^^^^^^

Note ^^^^^^^^ is not part of the output, I just highlight field 8

With the method highlighted here for managing field 8, you can adapt the script to fit your actual data and output requirements.

Upvotes: 1

Related Questions