Markus

Reputation: 69

Using awk to sum the values of a column, based on the values of another column, append the sum and percentage to original data

This question is a variant of https://unix.stackexchange.com/questions/242946/using-awk-to-sum-the-values-of-a-column-based-on-the-values-of-another-column

Same input:

smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10

I would like to have the following result:

smiths|Login|2|212
olivert|Login|10|10
denniss|Payroll|100|100
smiths|Time|200|212
smiths|Logout|10|212

That is, each line should get an extra column holding the sum of column 3 over all lines that share its column 1 value.

In addition, I would like another column with each line's column 3 value as a percentage of that sum, yielding the following result:

smiths|Login|2|212|0.94
olivert|Login|10|10|100
denniss|Payroll|100|100|100
smiths|Time|200|212|94.34
smiths|Logout|10|212|4.72

Upvotes: 1

Views: 384

Answers (2)

James Brown

Reputation: 37464

Here's one that doesn't round the percentages but avoids division by zero:

First, add a couple of records to the test data:

$ cat >> file
test|test|
test2|test2|0

Code:

$ awk '
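BEGIN { FS=OFS="|" }
NR==FNR { s[$1]+=$3; next }                   # first pass: sum column 3 per column 1 key
{ print $0,s[$1],$3/(s[$1]?s[$1]:1)*100 }     # second pass: append sum and percentage; a zero sum is divided as 1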
' file file

Output:

smiths|Login|2|212|0.943396
olivert|Login|10|10|100
denniss|Payroll|100|100|100
smiths|Time|200|212|94.3396
smiths|Logout|10|212|4.71698
test|test||0|0
test2|test2|0|0|0
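
If the two-decimal percentages from the question are wanted, one possible tweak to the two-pass script above is to set OFMT, which controls how print formats non-integer numbers. This is only a sketch: the scratch file and the shell variables `input` and `result` are illustrative names, and it runs on the question's original five records.

```shell
# Scratch file holding the sample input from the question (illustrative name).
input=$(mktemp)
cat > "$input" <<'EOF'
smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
EOF

# Same two-pass idea: the file is read twice.
# OFMT="%.2f" makes print show non-integer numbers with two decimals;
# values that are exact integers (212, 100) are printed as integers.
result=$(awk '
BEGIN { FS=OFS="|"; OFMT="%.2f" }
NR==FNR { s[$1]+=$3; next }
{ print $0, s[$1], $3/(s[$1]?s[$1]:1)*100 }
' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because integer values bypass OFMT, whole percentages come out as 100 rather than 100.00. If every value should carry two decimals, printf with an explicit format is the more predictable choice.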

Upvotes: 3

RomanPerekhrest

Reputation: 92894

gawk approach:

awk -F'|' '{ a[$1]+=$3; b[NR]=$0 }
     END { for (i=1; i<=NR; i++) {     # index loop keeps input order; the order of "for (i in b)" is unspecified
           split(b[i], data, FS);
           print b[i] FS a[data[1]] FS sprintf("%0.2f", data[3]/a[data[1]]*100) } }' file

The output:

smiths|Login|2|212|0.94
olivert|Login|10|10|100.00
denniss|Payroll|100|100|100.00
smiths|Time|200|212|94.34
smiths|Logout|10|212|4.72
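
The single-pass approach divides by the per-key sum directly, so a key whose values sum to zero (such as the extra test|test| record from the other answer) would cause a division by zero. Below is a minimal sketch of a guarded variant, assuming the whole file fits in memory. The array names `total`, `line`, `user` and `value` and the shell variables `input` and `result` are hypothetical, chosen only for illustration.

```shell
# Scratch file: the question's sample plus the two edge-case records (illustrative name).
input=$(mktemp)
cat > "$input" <<'EOF'
smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
test|test|
test2|test2|0
EOF

# Single pass: buffer every record, then print in input order once the sums are known.
result=$(awk -F'|' '
{ total[$1] += $3; line[NR] = $0; user[NR] = $1; value[NR] = $3 }
END {
    for (i = 1; i <= NR; i++) {
        t = total[user[i]]
        pct = (t ? value[i] / t * 100 : 0)   # report 0 when the sum is zero
        printf "%s|%s|%.2f\n", line[i], t, pct
    }
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Reporting 0.00 for a zero sum is a design choice, not the only option; printing an empty field or a marker such as NA would work equally well.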

Upvotes: 1
