Reputation: 25
I have input like this:
A 3
C 1
A 4
B 2
B 2
The output should be:
A total=7 (3+4)
C total=1 (1)
B total=4 (2+2)
Can anyone please tell me how to do this in awk? The input is part of the output of an awk command, hence the request for a solution in awk. Thanks!
Upvotes: 2
Views: 1195
Reputation: 2615
I would like to suggest another way:
sort -s -k 1,1 your_file |
cat - <(echo "") |
gawk '
$1==key {
line=line " + " $2; sum+=$2
}
$1 != key {
if (NR>1){print key " total=" sum " (" line ")"}
key=$1
line=$2
sum=$2
}'
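How does this differ from the array-based answer?
1) This awk script does not use arrays, so awk itself holds only one group in memory at a time. This matters when working on large files.
2) It is closer to the idiomatic AWK style of pattern-action rules, while the array-based answer reads more like a general-purpose programming language.
3) If the original order matters, you can do something like this: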
gawk '{print $0 " " NR}' your_file |
sort -s -k 1,1 | cat - <(echo "") |
gawk '
$1==key {
line=line " + " $2; sum+=$2
}
$1 != key {
if (NR>1){print nr " " key " total=" sum " (" line ")"}
key=$1; line=$2; sum=$2; nr=$NF
}' |
sort -k 1,1n |
cut -d ' ' -f 2-
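A minimal sketch of the same sorted-input idea, assuming the data arrives on standard input and that sort supports -s (stable sort, available in GNU and BSD sort). It flushes the last group from an END block instead of appending an empty line, so no process substitution is needed and plain awk is enough. The function name sum_sorted is made up for this illustration.

```shell
# sum_sorted: read "key value" lines on stdin and print one summary line
# per key, with keys in sorted order. Hypothetical helper name.
sum_sorted() {
  sort -s -k 1,1 |
  awk '
    NR > 1 && $1 != key {      # key changed: print the finished group
      print key " total=" sum " (" line ")"
    }
    NR == 1 || $1 != key {     # start a new group
      key = $1; line = $2; sum = $2
      next
    }
    {                          # same key: extend the current group
      line = line "+" $2; sum += $2
    }
    END {                      # flush the last group
      if (NR > 0) print key " total=" sum " (" line ")"
    }'
}

# Usage with the sample data from the question:
printf 'A 3\nC 1\nA 4\nB 2\nB 2\n' | sum_sorted
```

As with the pipeline above, the keys come out in sorted order (A, B, C), not in the order of first appearance.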
Upvotes: 1
Reputation: 40758
You can try the following code:
awk '
{
a[$1]+=$2
b[$1]=(b[$1]=="")?$2:(b[$1]"+"$2)
}
END {
for (i in a)
print i" total="a[i]" ("b[i]")"
}' file
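with output (awk does not guarantee the iteration order of for (i in a), so the lines may come out in a different order):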
A total=7 (3+4)
B total=4 (2+2)
C total=1 (1)
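If the keys should appear in the order they first occur in the input (A, C, B in the question), one possible variation is to record that order in a third array. This is only a sketch: the function name sum_in_order and the array names order, terms and sum are chosen for illustration, and it assumes the input arrives on standard input.

```shell
# sum_in_order: read "key value" lines on stdin and print one summary line
# per key, in the order the keys first appear. Hypothetical helper name.
sum_in_order() {
  awk '
    !($1 in sum) { order[++n] = $1 }   # remember first-seen order of keys
    {
      # build the "3+4" list; the first value of a key has no leading "+"
      terms[$1] = ($1 in sum) ? (terms[$1] "+" $2) : $2
      sum[$1] += $2                    # running total per key
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        print k " total=" sum[k] " (" terms[k] ")"
      }
    }'
}

# Usage with the sample data from the question:
printf 'A 3\nC 1\nA 4\nB 2\nB 2\n' | sum_in_order
```

Because the END loop walks a numerically indexed array, the output order no longer depends on how awk iterates over for (i in a).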
Upvotes: 0