begginer

Reputation: 83

How to filter Apache access logs by IP, domain, and URL?

I have to group the entries in my Apache access logs by IP, domain, and URL pattern, and print the count for each group along with its IP, domain, and URL. Currently I am using an awk command, but it shows only the count and the IPs, not the domain and URL patterns.

My input is

Feb  2 03:15:01 lb2 haproxy2[30529]: "www.abc.com" 207.46.13.4 02/Feb/2020:03:15:01.668 GET /detail.php?id=166108259 HTTP/1.1 GET 404 123481 "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "" ci-https-in~ ciapache ci-web1 0 0 1 71 303 762 263 1 1 -- "" "" "" ""

Feb  2 03:15:02 lb2 haproxy2[30530]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:01.987 GET /listing.php?id=1009 HTTP/1.1 GET 200 182 "Mozilla/5.0 (Linux; Android 5.1.1; LG-K420 Build/LMY47V) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36" "https://wap.abc.com/s.php?q=land+buyers" ci-https-in~ ciapache ci-web2 0 0 0 18 18 17813 219 0 0 -- "" "" "" ""

Feb  2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "" ci-https-in~ varnish ci-van 0 0 0 1 3 470 1001 0 0 -- "" "" "" ""

Feb  2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "" ci-https-in~ varnish ci-van 0 0 0 1 3 470 1001 0 0 -- "" "" "" ""

Expected output

count ip             domain      url
2     106.76.245.226 wap.abc.com /listing.php?id=6397
1     106.76.245.226 wap.abc.com /listing.php?id=1009
1     207.46.13.4    www.abc.com /detail.php?id=166108259

Currently I am using this command, but it does not give the expected output:

cat /var/log/httpd/access_log | grep www.abc.com* | awk '{print $7}' |  sort -n | uniq -c | sort -rn | head -n 50

Upvotes: 2

Views: 4164

Answers (1)

grep 'www\.abc\.com' /var/log/httpd/access_log | awk '{print $7, $6, $10}' | sort | uniq -c | sort -rn | head -n 50

Print the other columns in awk as well. In your sample lines, $7 is the IP, $6 is the domain (still in quotes), and $10 is the URL.
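The counting can also be done in a single awk pass, which makes it easy to strip the quotes around the domain. This is only a minimal sketch, assuming the field positions from the sample lines ($6 domain, $7 IP, $10 URL). The temporary file holds shortened copies of the question's sample lines, and the `abc\.com` filter is my assumption so that both www.abc.com and wap.abc.com are counted, as in the expected output. Point it at /var/log/httpd/access_log for real use.

```shell
# Hypothetical sample file built from shortened copies of the question's log lines.
log=$(mktemp)
cat > "$log" <<'EOF'
Feb  2 03:15:01 lb2 haproxy2[30529]: "www.abc.com" 207.46.13.4 02/Feb/2020:03:15:01.668 GET /detail.php?id=166108259 HTTP/1.1 GET 404 123481
Feb  2 03:15:02 lb2 haproxy2[30530]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:01.987 GET /listing.php?id=1009 HTTP/1.1 GET 200 182
Feb  2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116
Feb  2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116
EOF

# Whitespace-split fields: $6 = quoted domain, $7 = client IP, $10 = request URL.
result=$(awk '$6 ~ /abc\.com/ {
    gsub(/"/, "", $6)               # strip the quotes around the domain
    count[$7 " " $6 " " $10]++      # tally each ip/domain/url combination
}
END {
    for (key in count) print count[key], key
}' "$log" | LC_ALL=C sort -k1,1nr -k2 | head -n 50)

echo "count ip domain url"
echo "$result"
rm -f "$log"
```

The final sort orders by count, highest first, and breaks ties on the rest of the line, so the top 50 combinations come out in a stable order.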

Upvotes: 3
