Reputation: 55
I'm trying to grep the lines for a specific domain from the Apache2 access.log. My access.log contains all my virtual hosts and their different domains.
cat /var/log/access.log:
www.something-else-domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image.jpg HTTP/1.1" 304 - "www.something-else-domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"
www.domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image. jpg HTTP/1.1" 304 - "www.domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"
domain.si:80 193.77.xxx. xxx - - [06/Nov/2013:12:21:45 +0100] "GET /path/to/dir/image. jpg HTTP/1.1" 304 - "www.domain.si/index.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0"
I want to match only domain.si, www.domain.si, and whatever.domain.si, but not something-else-domain.si. How can I do that? Thanks for the help.
Upvotes: 0
Views: 1553
Reputation: 182063
egrep '^([^ ]*\.)?domain\.si' /var/log/access.log
Taking this apart:
^ is the beginning of the line.
(xxx)? is "match xxx or nothing"; in this case, match either nothing (leaving just domain.si), or [^ ]*\. which is any string of characters that are not spaces, followed by a dot. This matches the optional www. or whatever. part.
domain\.si simply matches the domain.si part.
The anchoring with ^, along with the "no spaces" bit, ensures that you only match things at the beginning of the line (not requests like GET /domain.si).
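To check the pattern against lines like the ones in the question, here is a minimal sketch. It assumes a POSIX shell and a grep that supports -E (the modern spelling of egrep). The temporary file, the shortened log lines, and the other.example host are made up for illustration and are not from the original log.

```shell
# Hypothetical sample file; the lines are abbreviated stand-ins for access.log entries.
log=$(mktemp)
cat > "$log" <<'EOF'
www.something-else-domain.si:80 193.77.xxx.xxx - - "GET /path/to/dir/image.jpg HTTP/1.1" 304
www.domain.si:80 193.77.xxx.xxx - - "GET /path/to/dir/image.jpg HTTP/1.1" 304
domain.si:80 193.77.xxx.xxx - - "GET /path/to/dir/image.jpg HTTP/1.1" 304
whatever.domain.si:80 193.77.xxx.xxx - - "GET / HTTP/1.1" 304
other.example:80 193.77.xxx.xxx - - "GET /domain.si HTTP/1.1" 304
EOF

# Same pattern as in the answer; grep -E is equivalent to egrep.
matches=$(grep -E '^([^ ]*\.)?domain\.si' "$log")
printf '%s\n' "$matches"

# Clean up the temporary file.
rm -f "$log"
```

With this sample, only the lines whose first field is domain.si or one of its subdomains should be printed; the something-else-domain.si line and the request for /domain.si on another host are skipped.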
Upvotes: 2
Reputation: 41460
A GNU awk solution:
awk '/www.domain$|domanin$/ {print $NF RS}' RS=".si"
www.domain.si
"www.domain.si
"www.domain.si
There is a problem in your example: spaces are not allowed in a URL.
Upvotes: 0