Branko

Reputation: 65

Change column order (different number of columns)

Original file looks like this:

stat.sn 15094 291 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150
stat.sn 15094 276 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150

And I need it to be like this:

stat.sn 15094 291 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 276 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003

Simply said: if on_2 exists in a row, it should be the last value in a row. Order of the remaining values should stay the same.

If it exists, on_2 is always the 6th field, after ind and before spd. All other fields have the same position in each row.

I guess that awk can make this happen, but I can't figure out how to do it.

Upvotes: 1

Views: 192

Answers (5)

thanasisp

Reputation: 5975

So you want to move column $6 to the end whenever it matches:

awk '$6 ~ /^on_2/ { t=$6; $6=$7; $7=$8; $8=t } 1' file

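In case the matching pattern could appear in any column and we always want to print it at the end:

awk '{
    # i runs up to NF-1: a match that is already last needs no move
    for(i=1;i<NF;i++)
        if($i~/^on_2/) {
            t=$i;
            # shift the following fields one position to the left
            for(j=i;j<NF;j++)
                $j=$(j+1);
            $NF=t
        }
    } 1' file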

We could omit this shift by setting t=$i; $i="" and printing $0 t, but then an extra FS is printed at the position of the moved column, breaking the alignment.

Update: for how to use $(NF+1)=$i; $i="" and preserve the alignment, check this answer.
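
A shift-free alternative, as a minimal sketch: instead of moving fields around in place, rebuild each row from two strings, one for the fields to keep in order and one for any field that starts with on_2=. The function name move_on2_last and the variable names head and tail are made up for illustration, and the sketch assumes the default blank-separated input shown in the question.

```shell
# Sketch: rebuild each row, holding back any field that starts with on_2=.
# move_on2_last is a hypothetical helper name, not part of the answer above.
move_on2_last() {
    awk '{
        head = ""; tail = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^on_2=/)
                tail = tail OFS $i      # held back, appended last
            else
                head = head OFS $i      # kept in original order
        }
        # Both strings start with one OFS, so drop the first one.
        print substr(head tail, length(OFS) + 1)
    }' "$@"
}

# Two sample rows taken from the question.
printf '%s\n' \
    'stat.sn 15094 291 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150' \
    'stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150' |
    move_on2_last
```

Because the row is rebuilt with OFS, the output has single blanks between fields wherever on_2 was found, so no extra separator is left behind.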

Upvotes: 2

Ed Morton

Reputation: 204731

$ awk '$6 ~ /^on_2=/{$(NF+1)=$6; $6=""; $0=$0; $1=$1}1' file
stat.sn 15094 291 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 276 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003

As for why $0=$0; $1=$1 is necessary for formatting:

$6="" changes foo<blank>$6<blank>bar to foo<blank><blank>bar, i.e. it leaves the blank before and after where $6 is now null and the number of fields is unchanged and $5 is still foo and $7 is still bar.

Then using $0=$0 causes awk to re-split $0 into fields, so the multiple blanks between foo and bar get treated as a single field separator. $5 is still foo, but now $6 is bar and there's 1 less field on the line. $0 still contains 2 blanks between foo and bar though, as this didn't change $0; it just changed how it was split into the individual field assignments.

So then using $1=$1 (or assigning to any field) causes awk to rebuild $0 using the OFS value (a blank char) between fields, so after this there's only 1 blank between foo and bar instead of 2.

So $0=$0 creates new values for $1->$NF and $1=$1 causes all FS between them to be replaced by OFS.
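
The three steps can be watched one at a time. The following is a small sketch that prints NF and $0 after each assignment for one row of the question's data; show_steps is a hypothetical helper name, and the square brackets are only there to make the blanks visible.

```shell
# One sample row taken from the question.
line='stat.sn 15094 291 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150'

# show_steps is a hypothetical helper that prints the record after each step.
show_steps() {
    printf '%s\n' "$1" | awk '{
        $(NF+1) = $6; $6 = ""     # append a copy of $6, then blank the original
        printf "after $6=\"\": NF=%d [%s]\n", NF, $0
        $0 = $0                   # re-split: the run of two blanks counts as one separator
        printf "after $0=$0: NF=%d [%s]\n", NF, $0
        $1 = $1                   # rebuild $0 with a single OFS between fields
        printf "after $1=$1: NF=%d [%s]\n", NF, $0
    }'
}

show_steps "$line"
```

NF goes from 9 to 8 at the $0=$0 step while the text of $0 keeps its two blanks; only the $1=$1 step changes the text.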

Upvotes: 1

RavinderSingh13

Reputation: 133780

Could you please try the following (GNU sed) and let me know if it helps?

sed -r 's/( on_2=[^ ]+)(.*)/\2\1/'  Input_file

Output will be as follows.

stat.sn 15094 291 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 276 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003

If you want the output aligned in columns:

sed -r 's/( on_2=[^ ]+)(.*)/\2\1/' Input_file | column -t

Upvotes: 1

karakfa

Reputation: 67567

Another sed (GNU sed, since \s and \S are GNU extensions):

$ sed -r 's/(\son_2=\S+)(.*)/\2\1/' file

stat.sn 15094 291 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003
stat.sn 15094 276 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003

Upvotes: 1

Akshay Hegde

Reputation: 16997

Simply said: if on_2 exists in a row, it should be the last value in a row. Order of the remaining values should stay the same.

Another way to get the desired output, regardless of where on_2 appears: search for it, extract it, remove it from the row, and append it as the last column:

$ awk 'match($0,/on_2=[^ ]* /){s=substr($0,RSTART,RLENGTH);sub(s,"");$0=$0 FS s}1' infile
stat.sn 15094 291 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003 
stat.sn 15094 361 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150
stat.sn 15094 272 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003 
stat.sn 15094 276 usf=630 ind=32615 spd=307 i_pow=150 on_2=ON-14-6003 

Explanation:

awk 'match($0,/on_2=[^ ]* /){                # Search

         s=substr($0,RSTART,RLENGTH);        # if found extract it

         sub(s,"");                          # remove extracted part from row

         $0=$0 FS s                          # set extracted part at the end 

     }1                                      # print line/record
     ' infile
  • match($0,/on_2=[^ ]* /) - search for the regexp (on_2=[^ ]* followed by a space) in the record/line/row

    /on_2=[^ ]* /

    • on_2= matches the characters on_2= literally (case sensitive)
    • [^ ] matches a single character that is not a space
    • * is a quantifier: it matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    • the trailing space matches a space character literally
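  • s=substr($0,RSTART,RLENGTH); - if matched, extract the matched string from the record and save it in variable s

  • sub(s,"") - search the record/line for the string in s (used here as a regexp) and substitute its first occurrence with null

  • $0=$0 FS s - modify the record/line/row by appending a field separator and the extracted string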

  • }1 - the 1 at the end triggers the default action, which is to print the current record/row (print $0). To see how this works, try awk '1' infile, which prints all records/lines, whereas awk '0' infile prints nothing. Any number other than zero is true, which triggers the default behavior.
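
A possible variant of the same match/extract/append idea, shown only as a sketch: it cuts the matched text out with substr instead of sub, so the extracted string is never reinterpreted as a regexp, and it drops the blank that the match carries so the appended field has no trailing blank. The function name move_on2_by_match and the variable rest are made up for illustration; like the command above, it assumes on_2= is followed by a blank (that is, it is not already the last field) and that no other field contains the text on_2=.

```shell
# Sketch of a match/substr variant; move_on2_by_match is a hypothetical name.
move_on2_by_match() {
    awk 'match($0, /on_2=[^ ]* /) {
        s = substr($0, RSTART, RLENGTH - 1)     # matched text without its trailing blank
        rest = substr($0, 1, RSTART - 1) substr($0, RSTART + RLENGTH)
        $0 = rest FS s                          # FS is the default single blank here
    } 1' "$@"
}

# Two sample rows taken from the question.
printf '%s\n' \
    'stat.sn 15094 272 usf=630 ind=32615 on_2=ON-14-6003 spd=307 i_pow=150' \
    'stat.sn 15094 360 usf=630 ind=32658 spd=307 i_pow=150' |
    move_on2_by_match
```

Rows without a match fall through to the trailing 1 unchanged.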

Upvotes: 2
