Cathy Breton

Reputation: 29

awk cumulative sum in one dimension

Good afternoon,

I would like to compute a cumulative sum for each column and for each line in awk.

My input file is:

1 2 3 4
2 5 6 7
2 3 6 5
1 2 1 2

And I would like, per column:

1 2 3 4
3 7 9 11
5 10 15 16
6 12 16 18
6 12 16 18

And, per line:

1 3 5 9 9
2 7 13 20 20
2 5 11 16 16
1 3 4 6 6

I computed the sum per column with:

awk '{ for (i=1; i<=NF; ++i) sum[i] += $i}; END { for (i in sum) printf "%s ", sum[i]; printf "\n"; }' test.txt  #  sum 

And the sum per line with:

awk '
BEGIN {FS=OFS=" "}
{
sum=0; n=0
for(i=1;i<=NF;i++)
     {sum+=$i; ++n}
     print $0,"sum:"sum,"count:"n,"avg:"sum/n
}' test.txt

But I would like to print the running sums for all the lines and columns, not only the totals.

Do you have any idea?

Upvotes: 2

Views: 3269

Answers (2)

karakfa

Reputation: 67497

row sums with repeated last element

$ awk '{s=0; for(i=1;i<=NF;i++) $i=s+=$i; $i=s}1' file

1 3 6 10 10
2 7 13 20 20
2 5 11 16 16
1 3 4 6 6

After the loop, i has been incremented to NF+1, so $i=s appends the sum as an extra field; the trailing 1 is an always-true pattern that prints the modified line.
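
For readers who find the compact form hard to follow, here is a minimal sketch of the same idea written out step by step. The function name row_cumsum_with_total is only illustrative, and $(NF + 1) is used instead of relying on the loop counter; the intent is the same as in the one-liner above.

```shell
# row_cumsum_with_total: running sum across each line, with the row total
# appended as an extra last field (the function name is illustrative).
row_cumsum_with_total() {
  awk '{
    s = 0                          # reset the running sum for every line
    for (i = 1; i <= NF; i++) {
      s += $i                      # running sum up to field i
      $i = s                       # overwrite field i with the running sum
    }
    $(NF + 1) = s                  # append the row total as a new field
    print                          # print the rebuilt line
  }' "$@"
}

printf '1 2 3 4\n2 5 6 7\n' | row_cumsum_with_total
# 1 3 6 10 10
# 2 7 13 20 20
```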

column sums with repeated last row

$ awk '{for(i=1;i<=NF;i++) c[i]=$i+=c[i]}1; END{print}' file

1 2 3 4
3 7 9 11
5 10 15 16
6 12 16 18
6 12 16 18

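END{print} repeats the last row. This relies on $0 still holding the last record inside the END block, which gawk and mawk do, but some older awk implementations do not.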
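
If the value of $0 in END is a concern, a possible variant is to save the last rebuilt line in a variable and print that instead. This is only a sketch: the function name col_cumsum_repeat_last and the variable last are made up for illustration.

```shell
# col_cumsum_repeat_last: running sum down each column, then the final
# row printed once more (names are illustrative, not from the answer).
col_cumsum_repeat_last() {
  awk '{
    for (i = 1; i <= NF; i++) {
      c[i] += $i                   # c[i] is the running total of column i
      $i = c[i]                    # replace the field with that total
    }
    last = $0                      # remember the rebuilt line for END
    print
  }
  END { if (NR > 0) print last }   # repeat the last row, if there was one
  ' "$@"
}

printf '1 2 3 4\n2 5 6 7\n2 3 6 5\n1 2 1 2\n' | col_cumsum_repeat_last
# 1 2 3 4
# 3 7 9 11
# 5 10 15 16
# 6 12 16 18
# 6 12 16 18
```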

P.S. Your math seems to be wrong for the row sums: the first row should be 1 3 6 10 10, not 1 3 5 9 9.

Upvotes: 2

kvantour

Reputation: 26481

It looks like you have all the correct information available; all you are missing is the print statements.

Is this what you are looking for?

accumulated sum of the columns:

 % cat foo
1 2 3 4
2 5 6 7
2 3 6 5
1 2 1 2
 % awk '{ for (i=1; i<=NF; ++i) {sum[i]+=$i; $i=sum[i] }; print $0}' foo
1 2 3 4
3 7 9 11
5 10 15 16
6 12 16 18

accumulated sum of the rows:

 % cat foo
1 2 3 4
2 5 6 7
2 3 6 5
1 2 1 2
 % awk '{ sum=0; for (i=1; i<=NF; ++i) {sum+=$i; $i=sum }; print $0}' foo
1 3 6 10
2 7 13 20
2 5 11 16
1 3 4 6

Both of these make use of the following:

  • each variable has the value 0 by default (if used numerically)
  • I replace the field $i with the running sum
  • I reprint the full line with print $0
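
One consequence of the points above: assigning to a field makes awk rebuild $0 using the output field separator OFS, so the separator of the printed result can be chosen freely. A small sketch, assuming a hypothetical wrapper col_cumsum that takes the separator as its first argument:

```shell
# col_cumsum SEP: running sum down each column, joined with SEP on output.
# The wrapper and its argument are illustrative only.
col_cumsum() {
  awk -v OFS="$1" '{
    for (i = 1; i <= NF; i++) {
      sum[i] += $i                 # sum[i] starts out as 0 when first used
      $i = sum[i]                  # assigning a field rebuilds $0 with OFS
    }
    print $0
  }'
}

printf '1 2\n3 4\n' | col_cumsum ','
# 1,2
# 4,6
```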

Upvotes: 4
