courier
courier

Reputation: 3

How do I sum values of a column cumulatively with awk?

I have a sample.csv and want to sum it cumulatively by column, as below:

Input csv:                        Output csv:
01/01/2020, 0, 0, 2, 1            01/01/2020, 0, 0, 2, 1
18/04/2022, 7, 5, 1, 3            18/04/2022, 7, 5, 3, 4
01/05/2022, 8,21, 9, 4            01/05/2022,15,26,12, 8

I've tried

awk '{ for (i=1; i<=NF; ++i) {sum[i]+=$i; $i=sum[i] }; print $0}' sample.csv

But it returns this instead:

Input csv:                        Output csv:
01/01/2020, 0, 0, 2, 1            01/01/2020, 0, 0, 2, 1, 0, 0, 0, 0, 0
18/04/2022, 7, 5, 1, 3            18/04/2022, 7, 5, 1, 3, 0, 0, 0, 0, 0
01/05/2022, 8,21, 9, 4            01/05/2022, 8,21, 9, 4, 0, 0, 0, 0, 0

I'm at a loss as to how to resolve this.

Note: I am writing this in a bash script, not the terminal. And I'm not allowed to use any tools other than awk for this

Upvotes: 0

Views: 38

Answers (2)

RARE Kpop Manifesto
RARE Kpop Manifesto

Reputation: 2925

Tested and confirmed working on

  • gawk 5.1.1,
  • mawk 1.3.4,
  • mawk 1.9.9.6, and
  • macos nawk

——————————————————————————————————————————————

    # gawk profile, created Thu May 19 15:59:38 2022
    
    function fmt(_) {
        return +_<=_^(_<_) \
        ? "" : $_ = sprintf("%5.f",___[_]+=$_)
    }
    BEGIN { split(sprintf("%0*.f",((__=++_)+(++_*_)\
                                  )^++_,!_),___,"""")
            OFS = ", "
             FS = "[,][ ]*"
    } {       _ = NF                                    
        while(_!=__) { fmt(_--) } }_'

——————————————————————————————————————————————

01/01/2020,     0,     0,     2,     1
18/04/2022,     7,     5,     3,     4
01/05/2022,    15,    26,    12,     8

Upvotes: 0

William Pursell
William Pursell

Reputation: 212674

I can't duplicate your output. Other than whitespace mangling, this seems to do what you want:

awk '{ for (i=2; i<=NF; i+=1) {
    sum[i]+=$i; $(i)=sum[i]; 
}; print $0 }' FS=, OFS=, sample.csv

To get the whitespace you want, you could do:

 awk '{ 
     for (i=2; i<=NF; i+=1) {
         sum[i]+=$i; $(i)=sum[i]; 
     }
     printf "%s,%2d,%2d,%2d,%2d\n", $1, $2, $3, $4, $5
 }' FS=, sample.csv

If you don't know the number of columns, you could write that final printf in a loop.

Upvotes: 1

Related Questions