Reputation: 43
How do I find the most common error code in a web server's access.log?
Upvotes: 0
Views: 471
Reputation: 875
You can try something like this:
sed 's/\[.*\]//' /var/log/apache2/access_log | sort | uniq -c | awk '//{if($1>=5) print $0}' | sort -nr
The idea is to strip out the parts that change from line to line, such as timestamps or IP addresses, so that otherwise identical entries can be aggregated. In this case I've only stripped timestamps using sed, assuming they are enclosed in square brackets. So sed 's/\[.*\]//' replaces whatever matches \[.*\] with nothing.
So as an example, this line:
127.0.0.1 - - [03/Oct/2016:23:45:27 +0300] "GET /favicon.ico HTTP/1.1" 200 1406
will become this:
127.0.0.1 - - "GET /favicon.ico HTTP/1.1" 200 1406
Then sort puts identical lines next to each other, and uniq -c collapses each run of adjacent identical lines into a single line prefixed with the number of occurrences.
So it will look something like this:
22 127.0.0.1 - - "GET /favicon.ico HTTP/1.1" 200 1406
This means that this line (minus the stripped timestamp) appeared 22 times in the log.
Then awk '//{if($1>=5) print $0}' keeps only the lines that appeared 5 or more times (the threshold of 5 is arbitrary). The final sort -nr orders the result by count, highest first.
This was tested on OSX and Ubuntu.
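If you only care about the status codes themselves rather than whole lines, the same sort | uniq -c idea can be applied to just that field. Below is a minimal sketch, not something from the pipeline above: top_error_codes is a hypothetical helper name, and it assumes the common/combined log format shown in the example line, where the status code is the first word after the quoted request. Requests that themselves contain a double quote would throw it off.

```shell
# Hypothetical helper: count 4xx/5xx status codes in an access log.
# Assumption: common/combined log format, i.e. the status code is the
# first word after the closing quote of the request string.
top_error_codes() {
    # Split each line on double quotes; field 3 then looks like " 404 1406".
    # Take its first word and keep it only if it is an error status (>= 400).
    awk -F'"' '{ split($3, rest, " "); if (rest[1] + 0 >= 400) print rest[1] }' "$1" |
        sort | uniq -c | sort -nr
}
```

Calling top_error_codes /var/log/apache2/access_log should print one line per error code, prefixed by its count, with the most frequent code first. Piping that through head -n 1 would leave only the most common one.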
Upvotes: 1