Reputation: 37
I have a file like
2291, 382718.00
2291, 19338.00
2291, 9073.00
2292, 4707.00
2293, 495847.00
2293, 157310.00
2293, 63582.00
2293, 100059.00
2293, 2843.00
2293, 58597.00
2293, 14836.00
2293, 24204.00
I want to sum the second column and place the result beside the last record whenever the first column value changes.
id, amount, total
2291, 382718.00 ,
2291, 19338.00 ,
2291, 9073.00 , 411129.00
2292, 4707.00 , 4707.00
2293, 495847.00 ,
2293, 157310.00 ,
2293, 63582.00 ,
2293, 100059.00 ,
2293, 2843.00 ,
2293, 58597.00 ,
2293, 14836.00 ,
2293, 24204.00 , 917278.00
My program is adding all the second column values and giving the result at the end. How to break the totals depending on the first column values.
awk -F, '{
a[NR] = $0
sum += $2
}
END {
for (x = 1; x <= NR-1; x++) {
printf"%s\n", a[x]
}
printf"%s %s\n", a[NR],sum
}'
Upvotes: 1
Views: 66
Reputation: 133538
1st solution: Could you please try following.
awk '
BEGIN{
FS=", +"
OFS=", "
}
prev!="" && prev!=$1{
print prev_line,sum
sum=""
}
prev==$1{
print prev_line
}
{
sum+=$2
prev=$1
prev_line=$0
}
END{
if(sum){
print prev_line,sum
}
}
' Input_file
Explanation: Adding detailed explanation for above code here:
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here.
FS=", +" ##Setting FS(field separator) as comma and space.
OFS=", " ##Setting OFS(output field separator) as comma and space.
} ##Closing BLOCK for BEGIN section of this program.
prev!="" && prev!=$1{ ##Checking condition if prev is NOT NULL and prev is NOT equal to $1 of current line.
print prev_line,sum ##Printing prev_line and sum variable here.
sum="" ##Nullifying variable sum here.
} ##Closing above condition BLOCK here.
prev==$1{ ##Checking condition if variable prev is equal to $1 then do following.
print prev_line ##Printing variable prev_line here.
} ##Closing BLOCK for above condition.
{
sum+=$2 ##Creating variable sum and keep adding $2 value in its value.
prev=$1 ##Setting $1 to variable prev here.
prev_line=$0 ##Setting current line value to variable prev_line here.
}
END{ ##END BLOCK for this section is starting here.
if(sum){ ##Checking if variable sum is NOT NULL then do following.
print prev_line,sum ##Printing prev_line and sum variable here.
} ##Closing BLOCK for above condition here.
} ##Closing END BLOCK of this program here.
' Input_file ##Mentioning Input_file name here.
2nd solution: In case you have any header value then do following.
awk '
BEGIN{
FS=", +"
OFS=", "
}
FNR==1{
print
next
}
prev!="" && prev!=$1{
print prev_line,sum
sum=""
}
prev==$1{
print prev_line
}
{
sum+=$2
prev=$1
prev_line=$0
}
END{
if(sum){
print prev_line,sum
}
}
' Input_file
Upvotes: 2
Reputation: 37404
Another awk, using printf
for pretty-print:
$ awk '
NR>1 { # no output for the first record
printf "%s,",q # output the previous record
if(p==$1) # if $1 remains the same
print "" # end line
else {
printf "%12.2f\n",s # otherwise print the sum
s="" # reset the sum
}
}
{ # storing vars for next round
p=$1 # first field of previous record
q=$0 # previous record
s+=$2 # the sum
}
END {
printf "%s,%12.2f\n", p, s # in the end, flush em all
}' file
Output:
2291, 382718.00,
2291, 19338.00,
2291, 9073.00, 411129.00
2292, 4707.00, 4707.00
2293, 495847.00,
...
2293, 24204.00, 917278.00
Upvotes: 1