Yugendra

Reputation: 55

Calculate sum of column using reference of other column in awk

I have a file that contains two columns. The first column contains a keyword and the second contains its size. Keywords can be repeated, as below:

data1 5
data2 7
data3 4
data2 6
data1 3
data2 8

I want to calculate the sum of the sizes associated with each keyword.

For example, the output for the above data would be:

data1 8
data2 21
data3 4

Is this possible using awk?

If so, please guide me.

Upvotes: 0

Views: 114

Answers (2)

Jotne

Reputation: 41460

You can do this in awk with an array:

awk '{a[$1]+=$2} END {for (i in a) print i,a[i]}' file
data1 8
data2 21
data3 4

How it works:
a[$1] creates an array named a, using field #1 as the index.
a[$1]+=$2 is the same as a[$1]=a[$1]+$2; it adds the value of field #2 to the element a[$1].
for (i in a) loops through all the indexes of array a (awk does not guarantee the order).
print i,a[i] prints the index i and the value stored in a[i].
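
For reference, here is the same logic spread over several lines with comments. This is only a sketch: the function name sum_by_key is made up for the illustration, and the trailing sort is an addition so that the output order does not depend on how awk walks the array.

```shell
# sum_by_key is a hypothetical wrapper name, used only for this illustration.
sum_by_key() {
    awk '
    { a[$1] += $2 }          # add field 2 to the running total for the keyword in field 1
    END {
        for (i in a)         # i takes each keyword stored in a, in no guaranteed order
            print i, a[i]    # keyword, then its total
    }' "$@" | sort           # sort is an addition here, to make the order predictable
}

# Feed the sample data from the question on stdin.
sum_by_key <<'EOF'
data1 5
data2 7
data3 4
data2 6
data1 3
data2 8
EOF
```

With the sample data this should print data1 8, data2 21 and data3 4, one per line. Passing a file name instead (sum_by_key file) works too, since the arguments are handed straight to awk.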

Upvotes: 4

anubhava

Reputation: 786091

If you want to keep the output in the same order as the input, use this slightly longer awk:

awk '$1 in a{a[$1]+=$2; next} {b[++k]=$1; a[$1]=$2}
             END{for(i=1; i<=k; i++) print b[i], a[b[i]]}' file
data1 8
data2 21
data3 4
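
The sample data happens to be sorted already, so the effect on ordering is easier to see with input whose keywords first appear out of order. The following is a commented sketch of the same script; the function name sum_in_input_order and the reordered input are made up for the illustration.

```shell
# sum_in_input_order is a hypothetical wrapper name, used only for this illustration.
sum_in_input_order() {
    awk '
    $1 in a {                # keyword already seen: add to its total and move on
        a[$1] += $2
        next
    }
    {                        # first time this keyword appears
        b[++k] = $1          # remember it at position k, in order of first appearance
        a[$1] = $2           # start its total
    }
    END {
        for (i = 1; i <= k; i++)    # walk the keywords in the order they were first seen
            print b[i], a[b[i]]
    }' "$@"
}

# Made-up input in which data2 appears before data1.
sum_in_input_order <<'EOF'
data2 7
data1 5
data2 6
data3 4
EOF
```

This should print data2 13, then data1 5, then data3 4, matching the order in which each keyword first appears in the input.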

Upvotes: 1
