user3016638

Reputation: 87

Sum the second column for each distinct value in the first column

$ bpimagelist -l -d 11/01/2013 03:27:13 -e 11/01/2013 03:30:00 | awk '/^IMAGE/ {print $2, $19}'

XXclcnpde148-bak.XX 11808
XXclnXXcXXcde010-bak.XX 26400
XXcwcnpde148-bak.XX 1623072
XXcwcnpde207-bak.XX 672
XXcwcnpde207-bak.XX 672
XXcwcnpde209-bak.XX 672
XXcwcnpde209-bak.XX 672
XXcwcnpde209-bak.XX 672
... and continues

My output has two columns. I need an awk command (on Linux) that sums the values in the second column for each distinct value in the first column, then prints each unique value of column 1 with its corresponding sum in column 2.

Upvotes: 1

Views: 596

Answers (3)

Håkon Hægland

Reputation: 40758

In GNU Awk version 4 or later, you can set PROCINFO["sorted_in"] to control the order in which a for-in loop visits the array elements, and so sort the result. For example:

gawk -f a.awk file

where a.awk is:

{ a[$1]+=$2 }

END {
    print "Sorted on string value of first column:"
    print "---------------------------------------"
    PROCINFO["sorted_in"] = "@ind_str_asc" 
    for (i in a) {
        print i, a[i]
    }
    print ""
    print "Sorted on numerical value of second column:"
    print "-------------------------------------------"
    PROCINFO["sorted_in"] = "@val_num_asc" 
    for (i in a) {
        print i, a[i]
    }
}

gives output:

Sorted on string value of first column:
---------------------------------------
XXclcnpde148-bak.XX 11808
XXclnXXcXXcde010-bak.XX 26400
XXcwcnpde148-bak.XX 1623072
XXcwcnpde207-bak.XX 1344
XXcwcnpde209-bak.XX 2016

Sorted on numerical value of second column:
-------------------------------------------
XXcwcnpde207-bak.XX 1344
XXcwcnpde209-bak.XX 2016
XXclcnpde148-bak.XX 11808
XXclnXXcXXcde010-bak.XX 26400
XXcwcnpde148-bak.XX 1623072

Upvotes: 0

Gregosaurus

Reputation: 611

To sum column 2, using column 1 as the id:

awk '{sum2[$1] += $2}; END{ for (id in sum2) { print id, sum2[id] } }' < input

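Here $1 is the id field and $2 is the value in column 2. We build one array, sum2, indexed by id, to accumulate the column 2 values. Once all the lines (records) have been processed, we iterate over the array keys (the id strings) and print each key with the sum stored at that index. The traversal order of for (id in sum2) is unspecified, so the output is not sorted.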
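As a small self-contained illustration of the array approach described above, here is a sketch that feeds a few of the sample lines from the question through the same awk program. The function name sum_by_key is only a label chosen for this example, and the trailing sort is added just to make the output order predictable.

```shell
# sum_by_key: hypothetical helper name, not part of the original answer.
# Accumulates column 2 per distinct value of column 1, then prints the totals.
sum_by_key() {
    awk '{ sum[$1] += $2 } END { for (k in sum) print k, sum[k] }' "$@"
}

# Sample lines taken from the question; sort makes the order deterministic.
printf '%s\n' \
    'XXcwcnpde207-bak.XX 672' \
    'XXcwcnpde207-bak.XX 672' \
    'XXcwcnpde209-bak.XX 672' \
    'XXcwcnpde209-bak.XX 672' \
    'XXcwcnpde209-bak.XX 672' | sum_by_key | sort
# prints:
# XXcwcnpde207-bak.XX 1344
# XXcwcnpde209-bak.XX 2016
```

Because the helper passes "$@" to awk, it should work both on a pipe, as here, and on a file name given as an argument.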

Upvotes: 2

jkshah

Reputation: 11703

Try the following awk command on your result:

awk '{a[$1]+=$2} END {for (x in a) print x, a[x]}' file

Output:

XXclnXXcXXcde010-bak.XX 26400
XXcwcnpde207-bak.XX 1344
XXcwcnpde148-bak.XX 1623072
XXclcnpde148-bak.XX 11808
XXcwcnpde209-bak.XX 2016

In fact, you can do the whole task in a single awk command, as follows:

bpimagelist ... | awk '/^IMAGE/ {a[$2]+=$19} END {for (x in a) print x, a[x]}'
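To see how the /^IMAGE/ pattern and the summing action combine, here is a rough sketch on made-up input. The three-field record layout and the helper name sum_image_records are assumptions for illustration only; real bpimagelist records have many more fields, which is why the command above reads $19 rather than $3.

```shell
# sum_image_records: hypothetical helper working on a simplified, invented layout
# (record type, host, size). Only lines starting with IMAGE are summed.
sum_image_records() {
    awk '/^IMAGE/ { a[$2] += $3 } END { for (x in a) print x, a[x] }' "$@"
}

# Synthetic input: the FRAG line is skipped by the /^IMAGE/ pattern.
printf '%s\n' \
    'IMAGE hostA 100' \
    'FRAG hostA 999' \
    'IMAGE hostB 50' \
    'IMAGE hostA 25' | sum_image_records | sort
# prints:
# hostA 125
# hostB 50
```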

EDIT (as per OP's comment)

How to get sorted output: sort column 1 with the corresponding values of column 2, and also sort column 2 with the corresponding values of column 1?

The simplest approach is to pipe the result through sort:

  • Sorting on column 1

    awk '{a[$1]+=$2} END {for (x in a) print x, a[x]}' file | sort -k1

    -k1 is optional, since sorting from the first field is the default behaviour

  • Sorting on column 2

    awk '{a[$1]+=$2} END {for (x in a) print x, a[x]}' file | sort -n -k2

    -n requests a numerical sort, since the 2nd field consists of numbers
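The effect of -n is easiest to see side by side. The sketch below reuses three totals from the output above and sorts them on column 2 with and without the numeric flag. It uses -k2,2 to limit the key to the second field only, which is a slight variation on the -k2 form shown in the list.

```shell
# Three aggregated lines from the output above, deliberately out of order.
totals='XXcwcnpde209-bak.XX 2016
XXclcnpde148-bak.XX 11808
XXcwcnpde207-bak.XX 1344'

# Without -n, field 2 is compared as text, so 11808 sorts before 1344.
printf '%s\n' "$totals" | sort -k2,2

# With n, field 2 is compared as a number: 1344, then 2016, then 11808.
printf '%s\n' "$totals" | sort -k2,2n
```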

Upvotes: 1
