Reputation: 1450
Wondering if someone could point me in the right direction using bash shell scripting and awk to add up several columns/fields and print out a summary.
I want to take stats output in the following format:
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 1.0, TOT_REQS: 2,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 10
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 2.0, TOT_REQS: 0,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 20
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 3.0, TOT_REQS: 2,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 30
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 4.0, TOT_REQS: 1,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 40
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 5.0, TOT_REQS: 0,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 50
and total them up for one line output like
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 15.0, TOT_REQS: 5,
REQ_RATE CACHE_HITS_PER_SEC: 2.5, TOTAL_CACHE_HITS: 150
thanks
Upvotes: 1
Views: 1400
Reputation: 25032
If we consider the structure of the data, there are a few conclusions to be drawn: REQ_RATE carries no information. So take a two-step approach, first processing the lines into cleaner key-value pairs:
sed -e 's/^REQ_RATE //' -e 's/,[[:space:]]*$//' |
awk -F ', ' -v OFS='\n' '{ $1=$1; print }'
This produces lines with single key-value pairs.
Now pipe the above through another awk stage, summing up the values for each of the keys using an associative array:
awk -F ': ' '
{
    sum[$1] += $2
}
END {
    for (k in sum) {
        # %g keeps fractional sums such as 2.5; %d would truncate them to 2
        printf("%s: %g, ", k, sum[k])
    }
    printf("\n")
}'
I've done nothing special with the formatting of the output, instead just printing the keys in whatever arbitrary order they are iterated through. Modify the END action if you need something more specific.
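As a rough sketch of one such modification, the two stages can be wrapped in a single function that remembers the order in which keys first appear and prints the totals in that order. The function name sum_stats and the extra order array are assumptions made for illustration, not part of the answer above. It prints with %g so that fractional totals such as 2.5 are kept.

```shell
#!/bin/sh
# sum_stats is a hypothetical helper name; it reads the stats on stdin.
sum_stats() {
    # Stage 1: drop the REQ_RATE prefix and any trailing comma,
    # then split each line into one "KEY: value" pair per line.
    sed -e 's/^REQ_RATE //' -e 's/,[[:space:]]*$//' |
    awk -F ', ' -v OFS='\n' '{ $1 = $1; print }' |
    # Stage 2: sum per key, remembering first-seen key order.
    awk -F ': ' '
    !($1 in sum) { order[++n] = $1 }
    { sum[$1] += $2 }
    END {
        for (i = 1; i <= n; i++)
            printf("%s: %g\n", order[i], sum[order[i]])
    }'
}
```

Called as sum_stats < file, this prints one total per line, starting with REQ_PROCESSING and ending with TOTAL_CACHE_HITS for the sample data in the question.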
Upvotes: 1
Reputation: 77075
Will this work for you?
Your File:
[jaypal:~/Temp] cat file
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 1.0, TOT_REQS: 2,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 10
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 2.0, TOT_REQS: 0,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 20
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 3.0, TOT_REQS: 2,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 30
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 4.0, TOT_REQS: 1,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 40
REQ_RATE REQ_PROCESSING: 0, REQ_PER_SEC: 5.0, TOT_REQS: 0,
REQ_RATE CACHE_HITS_PER_SEC: 0.5, TOTAL_CACHE_HITS: 50
Test:
[jaypal:~/Temp] sed 'N;s/\n/ /' file |
awk -F"[:,]" '{a=a+$2;b=b+$4;c=c+$6;d=d+$8;e=e+$10}
END{printf ("%s: %.1f,%s: %.1f,%s: %.1f,\n%s: %.1f,%s: %.1f\n", $1,a,$3,b,$5,c,$7,d,$9,e)}'
REQ_RATE REQ_PROCESSING: 0.0, REQ_PER_SEC: 15.0, TOT_REQS: 5.0,
REQ_RATE CACHE_HITS_PER_SEC: 2.5, TOTAL_CACHE_HITS: 150.0
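The sed step in the test above only joins each pair of lines into one record, so that all five values sit at fixed field positions for awk. As an alternative sketch of that pairing step alone (the function name join_pairs is an assumption for illustration), paste can do the same job:

```shell
#!/bin/sh
# join_pairs is a hypothetical helper name; it reads stdin.
# With two "-" operands, paste takes lines from stdin alternately,
# so each output line is an odd line, a space, then the next even line.
join_pairs() {
    paste -d ' ' - -
}
```

Used as join_pairs < file, its output could be piped into the same awk command in place of the sed stage.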
Upvotes: 1
Reputation: 161614
awk is really easy to use.
$ awk '/REQ_PROCESSING/{x+=$3; y+=$5; z+=$7}; END{print x, y, z}' input.txt
0 15 5
I think you can do the rest. Happy coding!
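For reference, one possible way to finish it off is sketched below, assuming the cache lines always contain CACHE_HITS_PER_SEC and have the same whitespace-separated layout as in the question. The function name summarize and the variable names are made up for illustration.

```shell
#!/bin/sh
# summarize is a hypothetical helper name; it reads the named files,
# or stdin when called without arguments.
summarize() {
    awk '
    # Fields such as "1.0," keep their trailing comma; awk uses only the
    # leading numeric part when converting them to numbers.
    /REQ_PROCESSING/     { proc += $3; rps += $5; reqs += $7 }
    /CACHE_HITS_PER_SEC/ { hps += $3; hits += $5 }
    END {
        printf("REQ_RATE REQ_PROCESSING: %d, REQ_PER_SEC: %.1f, TOT_REQS: %d,\n", proc, rps, reqs)
        printf("REQ_RATE CACHE_HITS_PER_SEC: %.1f, TOTAL_CACHE_HITS: %d\n", hps, hits)
    }' "$@"
}
```

Running summarize input.txt on the sample data should reproduce the two summary lines asked for in the question.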
Upvotes: 3