Reputation: 3
I have a sample.csv and want to sum it cumulatively by column, as below:
Input csv: Output csv:
01/01/2020, 0, 0, 2, 1 01/01/2020, 0, 0, 2, 1
18/04/2022, 7, 5, 1, 3 18/04/2022, 7, 5, 3, 4
01/05/2022, 8,21, 9, 4 01/05/2022,15,26,12, 8
I've tried
awk '{ for (i=1; i<=NF; ++i) {sum[i]+=$i; $i=sum[i] }; print $0}' sample.csv
But it returns this instead:
Input csv: Output csv:
01/01/2020, 0, 0, 2, 1 01/01/2020, 0, 0, 2, 1, 0, 0, 0, 0, 0
18/04/2022, 7, 5, 1, 3 18/04/2022, 7, 5, 1, 3, 0, 0, 0, 0, 0
01/05/2022, 8,21, 9, 4 01/05/2022, 8,21, 9, 4, 0, 0, 0, 0, 0
I'm at a loss as to how to resolve this.
Note: I am writing this in a bash script, not the terminal. And I'm not allowed to use any tools other than awk for this
Upvotes: 0
Views: 38
Reputation: 2925
Tested and confirmed working on
gawk 5.1.1
,mawk 1.3.4
,mawk 1.9.9.6
, andmacos nawk
——————————————————————————————————————————————
# gawk profile, created Thu May 19 15:59:38 2022
function fmt(_) {
return +_<=_^(_<_) \
? "" : $_ = sprintf("%5.f",___[_]+=$_)
}
BEGIN { split(sprintf("%0*.f",((__=++_)+(++_*_)\
)^++_,!_),___,"""")
OFS = ", "
FS = "[,][ ]*"
} { _ = NF
while(_!=__) { fmt(_--) } }_'
——————————————————————————————————————————————
01/01/2020, 0, 0, 2, 1
18/04/2022, 7, 5, 3, 4
01/05/2022, 15, 26, 12, 8
Upvotes: 0
Reputation: 212674
I can't duplicate your output. Other than whitespace mangling, this seems to do what you want:
awk '{ for (i=2; i<=NF; i+=1) {
sum[i]+=$i; $(i)=sum[i];
}; print $0 }' FS=, OFS=, sample.csv
To get the whitespace you want, you could do:
awk '{
for (i=2; i<=NF; i+=1) {
sum[i]+=$i; $(i)=sum[i];
}
printf "%s,%2d,%2d,%2d,%2d\n", $1, $2, $3, $4, $5
}' FS=, sample.csv
If you don't know the number of columns, you could write that final printf in a loop.
Upvotes: 1