manas
manas

Reputation: 471

summing the values of a column row wise

experts i have a input file

>   >   >
0.000   5.064   1.0
0.100   9.064   5.0
0.200   5.064   6.3
0.300   9.064   4.2
0.400   5.064   2.3
>   >   >
0.000   4.977   1.2 
0.100   6.977   1.8
0.200   4.977   1.2
0.300   6.977   1.8
0.400   4.977   9.6

I want to sum the third column values row wise and output should be

0.000   2.2
0.100   6.8
0.200   7.5
0.300   6.0
0.400   11.9

i tried

awk '{sum=0; for(i=1; i<=NF; i++) sum += $i; print sum}'  input_file > out_file

But problem is that i cannot arrange file as i expect.I need some solution for the same.Thanks in advance.

Upvotes: 3

Views: 85

Answers (2)

RavinderSingh13
RavinderSingh13

Reputation: 133538

With shown samples and With any awk using sort + awk solution, try following. Since this is sorting Input_file first and running awk program, so it will exit ASAP 1st occurrence of > comes in file, which could save some time if Input_file is huge.

LC_ALL='C' sort -k1 Input_file | 
awk '
/^>/{ exit }
prev!=$1{
  printf("%s %.01f\n",prev,sum)
  sum=0
}
{
  sum+=$3
  prev=$1
}
END{
  printf("%s %.01f\n",prev,sum)
}
'

Output will be as follows:

0.000 2.2
0.100 6.8
0.200 7.5
0.300 6.0
0.400 11.9

Explanation: Simple explanation would be: first sorting Input_file with 1st field(in which lines which are starting with > will come at last with shown samples). Then passing its standard output to awk program as standard input. In awk program checking every line if current line's first field is different from its previous value then print sum and previous field value. In each line keep adding 3rd field value to sum and keep current 1st field value to prev variable.



EDIT: In case your Input_file is tab separated then try following code.

LC_ALL='C' sort -t'\t' -k1 file2 | 
awk '
BEGIN{ FS=OFS="\t" }
/^>/ { exit        }
prev!=$1{
  print prev,sum
  sum=0
}
{
  sum+=$3
  prev=$1
}
END{
  print prev,sum
}'

Upvotes: 3

Cyrus
Cyrus

Reputation: 88654

With GNU awk and PROCINFO["sorted_in"] to sort array n:

awk '$1!=">"{n[$1]=n[$1]+$3}
     END{
       PROCINFO["sorted_in"] = "@ind_num_asc";
       for (i in n){
         printf("%s %.1f\n", i, n[i])
       }
     }' file

Output:

0.000 2.2
0.100 6.8
0.200 7.5
0.300 6.0
0.400 11.9

Upvotes: 3

Related Questions