Reputation: 135
I have a log file that I essentially cat out in full, cut down until only two fields remain, sort uniquely on field one, and then sum the field to the right whenever the first fields match. Example:
80 128
443 40
80 100
25 20
443 44
80 128
The results would be
80 356
443 84
25 20
The issue I am having is an inconsistency in the first field that I cut out: sometimes the output will look like
80 128
and sometimes it is
80(LOCAL\randomuser) 128
So my output ends up looking like
80 356
80(LOCAL\randomuser) 128
443 84
25 20
This is because 80(LOCAL\randomuser) is treated as a unique key.
How can I first normalize the first field so that (LOCAL\randomuser) is removed from the lines where it exists, while lines that do not have (LOCAL\randomuser) remain the same?
Upvotes: 0
Views: 151
Reputation: 9936
Or force $1 into numerical context:
awk '{A[$1+0]+=$NF} END{for (i in A) print i, A[i]}' file
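For illustration, a sketch assuming a file named file with the mixed first-field format from the question; adding 0 makes awk keep only the leading numeric prefix of $1, so both variants of the port land in the same bucket:

```shell
# Build a sample file with both clean and decorated first fields
printf '%s\n' '80 128' '443 40' '80 100' '25 20' '443 44' '80(LOCAL\randomuser) 128' > file

# $1+0 coerces "80(LOCAL\randomuser)" to 80, so it sums into A[80];
# for-in order is unspecified, so sort for a stable display
awk '{A[$1+0]+=$NF} END{for (i in A) print i, A[i]}' file | sort -n
```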
Upvotes: 1
Reputation: 45670
Use
awk '{a[$1]+=$NF} END{for (i in a) print i, a[i]}' input
I.e. use the first field as the key and add the last field to its sum.
If there isn't a space between the first number and the (, as it looks like in your example, tell awk to split on ( as well:
awk -F"[ (]+" '{a[$1]+=$NF} END{for (i in a) print i, a[i]}' input
output:
$ awk '{a[$1]+=$NF} END{for (i in a) print i, a[i]}' input
25 20
80 356
443 84
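And a sketch of the split-on-( variant against mixed input (sample data assumed from the question); splitting on runs of spaces and ( makes $1 just the port in both cases:

```shell
# Sample input mixing plain and "(LOCAL\user)"-decorated first fields
printf '%s\n' '80 128' '443 40' '80 100' '25 20' '443 44' '80(LOCAL\randomuser) 128' > input

# With -F"[ (]+" the decorated line splits into
# "80" / "LOCAL\randomuser)" / "128", so $1 is clean and $NF is the count;
# sort only for a stable display order
awk -F"[ (]+" '{a[$1]+=$NF} END{for (i in a) print i, a[i]}' input | sort -n
```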
Another approach: to remove just the (LOCAL\randomuser) if present, you can use sed:
sed 's/(.*)//' input
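For example, piping that sed normalization into the awk sum (sample data assumed; note that (.*) is greedy, which is fine as long as each line has at most one parenthesized group):

```shell
# Two lines that should collapse to the same key
printf '%s\n' '80 128' '80(LOCAL\randomuser) 128' > input

# Strip the "(...)" part first, then sum as before
sed 's/(.*)//' input | awk '{a[$1]+=$NF} END{for (i in a) print i, a[i]}'
```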
Upvotes: 1
Reputation: 77089
grep -v prints the lines that do not match a pattern. Pipe the output through grep before it reaches cut.
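A minimal sketch of that approach (sample data assumed); note that it discards the decorated lines rather than normalizing them, so their counts are excluded from the sums:

```shell
# Sample log with one decorated line
printf '%s\n' '80 128' '80(LOCAL\randomuser) 128' '443 40' > log

# grep -v '(' drops any line containing "(", so only clean lines
# continue down the pipeline
grep -v '(' log
```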
Upvotes: 0