chettyharish

Reputation: 1734

Perl Parsing Apache Log

I am trying to parse an Apache log, but I can't figure out the exact regexes for it.

use strict;
use warnings;

my $log_line =
'178.255.215.79 - - [14/Jul/2013:03:27:51 -0400] 
"GET /~hines/ringworld_config/lilo.conf HTTP/1.1" 304 - "-" 
"Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)';
#to find out IP address
print( $log_line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ );
#to find out Timestamp
print( $log_line =~ /\[[\d]{2}\/.*\/[\d]{4}\:[\d]{2}\:[\d]{2}\]*/ );

#Third regex for getting the complete link here :/~hines/ringworld_config/lilo.conf

What am I doing wrong in the second regex? It keeps printing only 1. And how do I write a regex for the third requirement?

Finally, I want to convert the timestamp, once extracted, into a value that I can compare and subtract, for example seconds since the epoch.

Upvotes: 0

Views: 830

Answers (1)

user557597

Your second regex has no capture groups, so in list context a successful match just returns 1, which is what gets printed. The timestamp regex could look something like this:

m~\[\d{2}/[^/]*/\d{4}:\d{2}:\d{2}:\d{2}\s*[+-]\d+\]~

expanded:

m~\[ \d{2} / [^/]* / \d{4} : \d{2} : \d{2} : \d{2} \s* [+-] \d+ \]~x

with capture groups:

m~\[ (\d{2}) / ([^/]*) / (\d{4}) : (\d{2}) : (\d{2}) : (\d{2}) \s* ([+-]\d+) \]~x
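For the epoch conversion you asked about, here is a rough sketch of one way to do it with the captured fields and the core Time::Local module. The helper name log_time_to_epoch and the %MONTH lookup table are made up for illustration, and it assumes English three-letter month names and a numeric +hhmm / -hhmm offset. Treat it as a starting point rather than a finished parser.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation -> zero-based month number, as timegm() expects.
my %MONTH = (
    Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
    Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11,
);

# Hypothetical helper: returns seconds since the epoch (UTC) for the
# first "[dd/Mon/yyyy:hh:mm:ss +zzzz]" timestamp found in $text,
# or nothing if no usable timestamp is present.
sub log_time_to_epoch {
    my ($text) = @_;

    my @f = $text =~ m~\[ (\d{2}) / ([^/]*) / (\d{4})
                          : (\d{2}) : (\d{2}) : (\d{2})
                          \s* ([+-]) (\d{2}) (\d{2}) \]~x;
    return unless @f;

    my ($day, $mon, $year, $hour, $min, $sec, $sign, $tz_h, $tz_m) = @f;
    return unless exists $MONTH{$mon};

    # Offset of local time from UTC, in seconds (negative west of UTC).
    my $offset = ($tz_h * 3600 + $tz_m * 60) * ($sign eq '-' ? -1 : 1);

    # Treat the fields as if they were UTC, then remove the offset.
    return timegm($sec, $min, $hour, $day, $MONTH{$mon}, $year) - $offset;
}

my $t1 = log_time_to_epoch('[14/Jul/2013:03:27:51 -0400]');
my $t2 = log_time_to_epoch('[14/Jul/2013:03:28:51 -0400]');

print "$t1\n";                         # 1373786871
print $t2 - $t1, " seconds apart\n";   # 60 seconds apart
```

Because the result is a plain integer, two timestamps can be compared with the usual numeric operators or subtracted to get the gap in seconds.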


The third regex (link) might be something like this:

modified link regex

m/"GET\s+([^"\s]+)[^"]*"/ where capture group 1 contains the link. The [^"]* skips the trailing protocol token (HTTP/1.1) before the closing quote.
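To show how a capture like that gets used, here is a small sketch run against the sample line from the question. The helper name request_path is made up for illustration. The pattern is a slightly stricter variant that assumes the request has the form METHOD path HTTP/x.y and only accepts GET, POST, or HEAD, so adjust it if your logs contain other methods.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the requested path from one log line,
# or undef if the line has no recognizable request field.
sub request_path {
    my ($line) = @_;

    # Match in list context so the capture group is returned.
    my ($path) = $line =~ m/"(?:GET|POST|HEAD)\s+(\S+)\s+HTTP\/[\d.]+"/;
    return $path;
}

# Sample line from the question, joined into a single line.
my $log_line = '178.255.215.79 - - [14/Jul/2013:03:27:51 -0400] '
             . '"GET /~hines/ringworld_config/lilo.conf HTTP/1.1" 304 - "-" '
             . '"Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"';

print request_path($log_line), "\n";   # /~hines/ringworld_config/lilo.conf
```

Assigning to a parenthesized list, my ($path) = ..., is what forces list context. Without the parentheses you would get the match's success flag instead of the captured text.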

Upvotes: 1
