Zeus Von Venus

Reputation: 69

awk Removing some spaces from the end of a column

I am trying to remove 2 spaces from the end of the second column of my file using this command:

awk '{gsub(/[ \$]/, "", $2); print}' Myfile

The format of my input file looks like this (The characters and numbers here are just for showing the format):

   1AA       A      1   9.999   9.999   9.999
 111BB       B   1111   9.999   9.999   9.999
1111AABB  ABCD  11111   9.999   9.999   9.999

I want the output to look like this:

   1AA       A    1   9.999   9.999   9.999
 111BB       B 1111   9.999   9.999   9.999
1111AABB  ABCD11111   9.999   9.999   9.999

In practice, the third column shifts 2 spaces toward the second column.

But my code does nothing :(

Can anyone explain to me what's the problem with my code?

Thank you in advance!

Upvotes: 3

Views: 307

Answers (3)

Wiktor Stribiżew

Reputation: 626851

Here, inside the awk gsub call, the pattern [ \$] is a bracket expression that matches a single character, either a space or a literal $ char. It is equivalent to [$ ].
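As a quick check of that reading, here is a minimal sketch on a made-up input string (the string `a b$c` and the `-` replacement are only for illustration, not from the question). Each space or `$` is replaced individually, which shows that the bracket expression does not act as an end-of-string anchor:

```shell
# Input contains one space and one literal dollar sign.
out=$(printf '%s\n' 'a b$c' | awk '{ gsub(/[ \$]/, "-"); print }')
echo "$out"   # prints: a-b-c
```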

Since your file is fixed-width, you can use a GNU awk command like

awk 'BEGIN { FIELDWIDTHS="10 6 8 8 8 8" } {gsub(/  $/,"",$2); print $1 $2 $3 $4 $5 $6}' Myfile > newMyfile

where /  $/ matches two literal spaces at the end of the field ($ anchors the match at the end of the string, which here is $2).

See more about FIELDWIDTHS in the "4.6.1 Processing Fixed-Width Data" section of the gawk manual.
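If GNU awk is not available, the same fixed-width idea can be approximated with substr(). This is only a sketch: it reuses the widths 10 and 6 from the command above, treats everything from column 17 on as one chunk, and the names `f1`, `f2`, and `rest` are illustrative.

```shell
line='1111AABB  ABCD  11111   9.999   9.999   9.999'
out=$(printf '%s\n' "$line" | awk '{
  f1   = substr($0, 1, 10)   # columns 1-10
  f2   = substr($0, 11, 6)   # columns 11-16, including the two trailing blanks
  rest = substr($0, 17)      # everything from column 17 to the end of the line
  sub(/  $/, "", f2)         # drop exactly two blanks at the end of f2
  print f1 f2 rest
}')
echo "$out"   # prints: 1111AABB  ABCD11111   9.999   9.999   9.999
```

Because the pieces are cut by character position rather than by whitespace, the blanks stay inside `f2` and the anchored regexp has something to match.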

Upvotes: 1

Ed Morton

Reputation: 203557

Using any awk in any shell on every Unix box:

$ awk '{print substr($0,1,14) substr($0,17)}' file
   1AA       A    1   9.999   9.999   9.999
 111BB       B 1111   9.999   9.999   9.999
1111AABB  ABCD11111   9.999   9.999   9.999

or using GNU awk for FIELDWIDTHS (use 9999 or some other large number instead of * if you have an older gawk version that doesn't understand * as meaning "the rest of the line"):

$ awk -v FIELDWIDTHS='14 2 *' '{print $1 $3}' file
   1AA       A    1   9.999   9.999   9.999
 111BB       B 1111   9.999   9.999   9.999
1111AABB  ABCD11111   9.999   9.999   9.999

The problem with your code is that you're trying to remove 2 spaces from $2, but your fields are space-separated, so there are no spaces in any field, including $2; the spaces are between the fields. Also, your regexp /[ \$]/ means "blank or literal $ char", not "2 blanks before the end of string" as I believe you thought it meant.
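A small sketch that makes this visible, using the third sample line from the question (the variable `n` and the square brackets around $2 are only there for display). gsub() returns the number of substitutions it made, and under default field splitting $2 has no surrounding blanks:

```shell
line='1111AABB  ABCD  11111   9.999   9.999   9.999'
out=$(printf '%s\n' "$line" | awk '{
  n = gsub(/[ \$]/, "", $2)   # n is the number of replacements made in $2
  print n " [" $2 "]"         # brackets show exactly what $2 contains
}')
echo "$out"   # prints: 0 [ABCD]
```

Zero replacements means the original command had nothing to change, which is why it appeared to do nothing.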

Upvotes: 2

RavinderSingh13

Reputation: 133518

With your shown samples, please try the following rev + awk + rev combination. Written and tested in GNU awk.

rev Input_file | 
awk '
  match($0,/^\S+(\s+\S+){3}/){
    val1=substr($0,RSTART,RLENGTH)
    val2=substr($0,RSTART+RLENGTH)
    sub(/^  /,"",val2)
    $0=val1 val2
  }
1
' | rev

With shown samples, output will be as follows:

   1AA       A    1   9.999   9.999   9.999
 111BB       B 1111   9.999   9.999   9.999
1111AABB  ABCD11111   9.999   9.999   9.999

Explanation: A detailed, line-by-line explanation of the above code follows.

rev Input_file |                    ##Using rev on Input_file to reverse the characters of each line.
awk '                               ##Sending rev output as standard input to awk.
  match($0,/^\S+(\s+\S+){3}/){      ##Using match to find the first 4 fields of the reversed line (the last 4 fields of the original line).
    val1=substr($0,RSTART,RLENGTH)  ##Creating val1 variable which holds the matched text.
    val2=substr($0,RSTART+RLENGTH)  ##Creating val2 variable which holds the rest of the line (everything after the match).
    sub(/^  /,"",val2)              ##Removing the 2 leading spaces from val2 (these are the 2 spaces between the 2nd and 3rd fields that the question wants removed).
    $0=val1 val2                    ##Assigning val1 followed by val2 back to the current line.
  }
1                                   ##Printing the current line here.
' | rev                             ##Sending awk output to rev again to restore the original character order.

Upvotes: 1
