var
var

Reputation: 37

sum up the column and place beside the column

I have a file like

2291,   382718.00
2291,    19338.00
2291,     9073.00
2292,     4707.00
2293,   495847.00
2293,   157310.00
2293,    63582.00
2293,   100059.00
2293,     2843.00
2293,    58597.00
2293,    14836.00
2293,    24204.00

I want to sum the second column and place the result beside the last record whenever the first column value changes.

id,             amount,              total
2291,   382718.00 ,   
2291,    19338.00 ,   
2291,     9073.00 ,   411129.00 
2292,     4707.00 ,     4707.00
2293,   495847.00 ,   
2293,   157310.00 ,   
2293,    63582.00 ,   
2293,   100059.00 ,   
2293,     2843.00 ,   
2293,    58597.00 ,   
2293,    14836.00 ,   
2293,    24204.00 ,   917278.00

My program is adding all the second column values and giving the result at the end. How to break the totals depending on the first column values.

   awk -F, '{ 
      a[NR] = $0
      sum  += $2
    }
    END {
      for (x = 1; x <= NR-1; x++) {
        printf"%s\n", a[x]
      }
      printf"%s %s\n", a[NR],sum
    }'

Upvotes: 1

Views: 66

Answers (2)

RavinderSingh13
RavinderSingh13

Reputation: 133538

1st solution: Could you please try following.

awk '
BEGIN{
  FS=", +"
  OFS=", "
}
prev!="" && prev!=$1{
  print prev_line,sum
  sum=""
}
prev==$1{
  print prev_line
}
{
  sum+=$2
  prev=$1
  prev_line=$0
}
END{
  if(sum){
    print prev_line,sum
  }
}
'  Input_file

Explanation: Adding detailed explanation for above code here:

awk '                       ##Starting awk program from here.
BEGIN{                      ##Starting BEGIN section from here.
  FS=", +"                  ##Setting FS(field separator) as comma and space.
  OFS=", "                  ##Setting OFS(output field separator) as comma and space.
}                           ##Closing BLOCK for BEGIN section of this program.
prev!="" && prev!=$1{       ##Checking condition if prev is NOT NULL and prev is NOT equal to $1 of current line.
  print prev_line,sum       ##Printing prev_line and sum variable here.
  sum=""                    ##Nullifying variable sum here.
}                           ##Closing above condition BLOCK here.
prev==$1{                   ##Checking condition if variable prev is equal to $1 then do following.
  print prev_line           ##Printing variable prev_line here.
}                           ##Closing BLOCK for above condition.
{
  sum+=$2                   ##Creating variable sum and keep adding $2 value in its value.
  prev=$1                   ##Setting $1 to variable prev here.
  prev_line=$0              ##Setting current line value to variable prev_line here.
}
END{                        ##END BLOCK for this section is starting here.
  if(sum){                  ##Checking if variable sum is NOT NULL then do following.
    print prev_line,sum     ##Printing prev_line and sum variable here.
  }                         ##Closing BLOCK for above condition here.
}                           ##Closing END BLOCK of this program here.
'  Input_file               ##Mentioning Input_file name here.


2nd solution: In case you have any header value then do following.

awk '
BEGIN{
  FS=", +"
  OFS=", "
}
FNR==1{
  print
  next
}
prev!="" && prev!=$1{
  print prev_line,sum
  sum=""
}
prev==$1{
  print prev_line
}
{
  sum+=$2
  prev=$1
  prev_line=$0
}
END{
  if(sum){
    print prev_line,sum
  }
}
'  Input_file

Upvotes: 2

James Brown
James Brown

Reputation: 37404

Another awk, using printf for pretty-print:

$ awk '
NR>1 {                          # no output for the first record
    printf "%s,",q              # output the previous record
    if(p==$1)                   # if $1 remains the same
        print ""                # end line
    else {
        printf "%12.2f\n",s     # otherwise print the sum
        s=""                    # reset the sum
    }
}
{                               # storing vars for next round
    p=$1                        # first field of previous record
    q=$0                        # previous record
    s+=$2                       # the sum
}
END {
    printf "%s,%12.2f\n", p, s  # in the end, flush em all
}' file

Output:

2291,   382718.00,
2291,    19338.00,
2291,     9073.00,   411129.00
2292,     4707.00,     4707.00
2293,   495847.00,
...
2293,    24204.00,   917278.00

Upvotes: 1

Related Questions