Bhargav Sutapalli

Reputation: 31

Summing rows in a file

I want to sum rows that share the same value in one column. Is this possible with awk, or is there another simple way?

Date    Hour  Requests   Success  Error
10-Apr  11      1           1       0
10-Apr  13      1           1       0
10-Apr  14      1           1       0
10-Apr  18      1           1       0
10-Apr  9       1           1       0
10-Apr  11      1           1       0
10-Apr  12      3           3       0
10-Apr  13      2           1       1
10-Apr  14      2           2       0
10-Apr  15      1           1       0
10-Apr  16      1           1       0
10-Apr  12      3           3       0
10-Apr  13      4           1       3
10-Apr  14      1           1       0
10-Apr  16      2           2       0
10-Apr  18      1           1       0
10-Apr  10      3           3       0
10-Apr  11      1           1       0
10-Apr  12      3           3       0
10-Apr  13      1           1       0
10-Apr  14      2           2       0
10-Apr  15      2           2       0
10-Apr  16      2           2       0
10-Apr  17      2           2       0

From the table above, I want to sum the rows (Requests, Success, Error) for each hour. The output should look like this:

Date   Hour  Requests Success Error
10-Apr  9       1       1       0
10-Apr  10      3       3       0
10-Apr  11      3       3       0
10-Apr  12      9       9       0
10-Apr  13      8       4       4
10-Apr  14      6       6       0
10-Apr  15      3       3       0
10-Apr  16      5       5       0
10-Apr  17      2       2       0
10-Apr  18      2       2       0

Upvotes: 0

Views: 67

Answers (1)

Ed Morton

Reputation: 204064

Using GNU awk for true multi-dimensional arrays and sorted_in:

$ cat tst.awk
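NR==1 { print; next }                           # pass the header line through unchanged
!seen[$1]++ { dates[++numDates] = $1 }          # remember each date in first-seen order
{ for (i=3;i<=NF;i++) sum[$1][$2][i] += $i }    # accumulate every column from 3 on, per date and hour
END {
    PROCINFO["sorted_in"] = "@ind_num_asc"      # visit hours in ascending numeric order
    for (dateNr=1; dateNr<=numDates; dateNr++) {
        date = dates[dateNr]
        for (hr in sum[date]) {
            printf "%s %s ", date, hr
            for (i=3;i<=NF;i++) {               # NF still holds the field count of the last input line
                printf "%s%s", sum[date][hr][i], (i<NF?OFS:ORS)
            }
        }
    }
}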
$ awk -f tst.awk file | column -t
Date    Hour  Requests  Success  Error
10-Apr  9     1         1        0
10-Apr  10    3         3        0
10-Apr  11    3         3        0
10-Apr  12    9         9        0
10-Apr  13    8         4        4
10-Apr  14    6         6        0
10-Apr  15    3         3        0
10-Apr  16    5         5        0
10-Apr  17    2         2        0
10-Apr  18    2         2        0

I wasn't sure if your fields were space or tab separated, so I made no attempt to format the output within awk.
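If GNU awk is not available, the same grouping can probably be done with any POSIX awk by using a plain string key and leaving the ordering to sort. The following is only a minimal sketch under some assumptions: the function name sum_by_hour is made up for illustration, the input has exactly one header line, fields are whitespace separated, and there are exactly three numeric columns (Requests, Success, Error). Unlike the script above, dates are ordered by a plain string sort rather than by first appearance, which is fine for a single date but will not be chronological across months.

```shell
# Hypothetical helper: sum columns 3-5 per (date, hour) pair in the file "$1".
sum_by_hour() {
    # Print the header line unchanged.
    head -n 1 "$1"
    # Accumulate the three counters under a "date hour" key, skipping the header,
    # then sort by date (as text) and by hour (numerically).
    awk 'NR > 1 {
             key = $1 " " $2
             req[key] += $3; ok[key] += $4; err[key] += $5
         }
         END {
             for (key in req) print key, req[key], ok[key], err[key]
         }' "$1" | sort -k1,1 -k2,2n
}
```

It can be called as sum_by_hour file | column -t to get aligned columns, as in the output above.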

Upvotes: 3
