mcatis
mcatis

Reputation: 1266

awk - do column-operation only on non empty fields

I would like to make some changes to a csv file. The file looks like this:

csv-file:

04.08.1994 22:47;3.3;29;;61.8;1;4.0;1.4;433;
04.08.1994 23:11;;27;93.0;60.6;2;2.0;6.5;133;
04.08.1994 23:48;3.1;;18.4;93.1;1;1.0;6.8;;

desired output:

04.08.1994 22:47;33;29;12345;61.8;20;4.0;1.4;433;
04.08.1994 23:11;12345;27;930;60.6;40;2.0;6.5;133;
04.08.1994 23:48;31;12345;184;93.1;20;1.0;6.8;12345;

Following should happen:

  1. Replace empty fields with a special value, say "12345".

  2. Do arithmetic operations of some columns (multiply with 10 or 20) only on non-empty fields.

My solution:

cat file | awk 'BEGIN {FS=OFS=";"} { for(i=1; i<=NF; i++) if($i ~ /^ *$/) $i = 12345 }; \
{ $2!=12345 && ($2=$2*10); $4!=12345 && ($4=$4*10); $6!=12345 && ($6=$6*20); print}'

So far it work's. Wonder if I have to write this if-condition before every column-operation? Or is there an easier way? Maybe should do all the operations already in the loop over each field?

Upvotes: 2

Views: 145

Answers (2)

James Brown
James Brown

Reputation: 37394

Another awk:

$ awk '
BEGIN {
    FS=OFS=";"
    some[2]=some[4]=10        # some columns 
    some[6]=20                # defined
}
{
    for(i=1;i<NF;i++)
        if($i=="")            # Replace empty fields 
            $i="12345"        # with a special value, say "12345".
        else if(i in some)    # on non-empty fields
            $i*=some[i]       # Do arithmetic operations of some columns
}1' file

Output:

04.08.1994 22:47;33;29;12345;61.8;20;4.0;1.4;433;
04.08.1994 23:11;12345;27;930;60.6;40;2.0;6.5;133;
04.08.1994 23:48;31;12345;184;93.1;20;1.0;6.8;12345;

Upvotes: 2

RavinderSingh13
RavinderSingh13

Reputation: 133458

Could you please try following (tested and written in GNU awk).

awk '
BEGIN{
  FS=OFS=";"
}
{
  found=""
  delete a
  for(i=1;i<NF;i++){
    if($i==""){
      a[i]
      $i="12345"
      found=1
    }
  }
}
found{
  $2=2 in a?"12345":$2*10
  $4=4 in a?"12345":$4*10
  $6=6 in a?"12345":$6*20
}
1
' Input_file

Output will be as follows.

04.08.1994 22:47;33;29;12345;61.8;20;4.0;1.4;433;
04.08.1994 23:11;12345;27;930;60.6;40;2.0;6.5;133;
04.08.1994 23:48;31;12345;184;93.1;20;1.0;6.8;12345;

Upvotes: 3

Related Questions