ashish_k

Reputation: 1581

How to insert pipes as delimiters between specific columns

I need to insert pipes as delimiters between specific columns in a file.

Input:

AQ  92  18-09-2018 00:00:00  29  AR  18-09-2018 05:07:15 18-09-2018 08:06:56
BG  98  18-09-2018 00:00:00  29  AR  18-09-2018 05:07:15 18-09-2018 08:06:56

Expected Output:

AQ | 92 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56
BG | 98 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56

I tried something like the following with awk, but I am not sure how to proceed:

awk '{gsub(/ /,"|")}1;(NF==3)' file_name

Upvotes: 0

Views: 304

Answers (3)

oliv

Reputation: 13259

Another GNU awk (version >= 4.0) script:

awk 'BEGIN{FPAT="[A-Z0-9]{2}|([0-9]{2}-?){4} ([0-9]{2}:?){3}"; OFS=" | "}$1=$1' file
AQ | 92 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56
BG | 98 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56

This relies on FPAT (field pattern), which describes what a field looks like rather than what separates the fields.

In this case there are 2 patterns:

  • [A-Z0-9]{2} matches two characters, each a digit or an uppercase letter
  • ([0-9]{2}-?){4} ([0-9]{2}:?){3} matches the date-time string

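The last statement, $1=$1, tells awk to rebuild the record using the output field separator OFS. Used as a pattern, the assignment evaluates to the non-empty $1, so the default action prints the rebuilt record.

This solution does not depend on the number of spaces between the fields.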
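
For awks without FPAT (mawk, BusyBox awk, BSD awk), the same idea of describing what a field looks like can be sketched with a match() loop. This is only a sketch and not part of the script above: the function name pipe_by_pattern is made up for illustration, the repetitions are spelled out in full because some older awks do not support {n} intervals, and it assumes match() returns the leftmost-longest match, so that a date-time wins over its first two digits.

```shell
# Hypothetical helper: reads lines from stdin or files, writes pipe-delimited lines.
pipe_by_pattern() {
  awk '
  BEGIN {
    d2 = "[0-9][0-9]"
    # dd-mm-yyyy hh:mm:ss, written without {n} intervals for portability
    dt = d2 "-" d2 "-" d2 d2 " " d2 ":" d2 ":" d2
    # a field is either a date-time or a two-character code
    re = dt "|[A-Z0-9][A-Z0-9]"
  }
  {
    rest = $0; out = ""; sep = ""
    # repeatedly take the next match, much like FPAT does
    while (match(rest, re)) {
      out = out sep substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
      sep = " | "
    }
    print out
  }' "$@"
}
```

Calling pipe_by_pattern file on the sample input should print the same two lines as the FPAT version.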

Upvotes: 1

Cyrus

Reputation: 88829

With gawk:

awk 'BEGIN{FIELDWIDTHS="3 4 21 4 4 21 21"; OFS="|"} {print $1,$2,$3,$4,$5,$6," "$7}' file

Output:

AQ | 92 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56
BG | 98 | 18-09-2018 00:00:00 | 29 | AR | 18-09-2018 05:07:15 | 18-09-2018 08:06:56

The FIELDWIDTHS variable holds a space-separated list of numbers. Each field is expected to have a fixed width, and gawk splits the record by those widths into $1, $2, $3 and so on.

OFS: The output field separator
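
To see why OFS can be a bare "|" here, it may help to look at the slices those widths produce. The sketch below imitates the slicing with substr() in plain awk, so it runs without gawk; it illustrates the widths and is not how gawk implements FIELDWIDTHS. The function name show_slices is made up.

```shell
# Hypothetical helper: shows the slices that the widths 3 4 21 4 4 21 21 produce.
show_slices() {
  awk '
  {
    n = split("3 4 21 4 4 21 21", w, " ")
    pos = 1; out = ""
    for (i = 1; i <= n; i++) {
      # substr() stops at the end of the line if the slice runs past it
      out = out "[" substr($0, pos, w[i]) "]"
      pos += w[i]
    }
    print out
  }' "$@"
}
```

For the first input line this prints:

[AQ ][ 92 ][ 18-09-2018 00:00:00 ][ 29 ][ AR ][ 18-09-2018 05:07:15 ][18-09-2018 08:06:56]

Every slice except the last carries its own padding spaces, which is why only $7 gets a " " prepended in the print statement.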

Upvotes: 3

Barmar

Reputation: 782130

Except for the last two fields, the fields are separated by two spaces. So you can set FS to "  " (two spaces) to match this, and set OFS to " | " so the separators are converted on output. The only special handling is for the last field, which holds both remaining date-times: split it on spaces and write it back as two fields.

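awk -F"  " -v OFS=" | " '{
    split($NF, a, " ")        # a[1..4]: date, time, date, time
    $NF = a[1] " " a[2]       # last field keeps the first date-time
    $(NF+1) = a[3] " " a[4]   # append the second date-time as a new field
    print }' file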
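
A quick way to check the premise is to print what the two-space separator yields before any fix-up. This is a small sketch with a made-up function name, show_last_field:

```shell
# Hypothetical helper: prints the field count and the last field of each line.
show_last_field() {
  awk -F"  " '{ print NF ": " $NF }' "$@"
}
```

On the sample input each line gives 6: 18-09-2018 05:07:15 18-09-2018 08:06:56, that is, six fields with both date-times stuck together in the last one.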

Upvotes: 2
