shivangi bohra

Reputation: 21

Perl script to extract data from a log file between two dates, where the input dates need not exist in the file

I have a log file in which each line has a timestamp, as shown below. I need to get the lines between two dates. For example, fetch the lines between Aug 9 16:24:23 and Aug 9 16:28:00, even though those exact timestamps don't appear in the file.

Aug  9 16:24:21 linux-447z dbus-daemon[685]: 
Aug  9 16:24:21 linux-447z dbus[685]: [system] Activating service 
Aug  9 16:24:21 linux-447z dbus-daemon[685]: 
Aug  9 16:24:21 linux-447z dbus-daemon[685]: dbus[685]: [system] 
Aug  9 16:24:21 linux-447z dbus[685]: [system] Successfully activated 
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: leased 192.168.37.128 for 1800 seconds
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding IP address 192.168.37.128/24
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding 
Aug  9 16:27:47 linux-447z dhcpcd[3422]: eth0: Failed to lookup
Aug  9 16:27:47 linux-447z ifup:     eth0      
Aug  9 16:27:48 linux-447z SuSEfirewall2:   
Aug  9 16:29:03 linux-447z dbus[685]: [system] Activating service 

Upvotes: 2

Views: 1818

Answers (3)

Steffen Ullrich

Reputation: 123270

It should be simple if you have log files where the time stamps have a known or clearly marked time zone: use Time::Piece (see the other answer to this question for an example) or, for simpler requirements (e.g. the time stamps in the file and the search bounds are in the same time zone), Time::Local.
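For the simpler case, a minimal sketch with Time::Local might look like the following. It assumes the log and the search bounds share the machine's local time zone and that the year is known; the value 2013 is purely an assumption, since the log does not record it. The helper name syslog_epoch is hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;

# Hypothetical helper: convert a leading 'Aug  9 16:24:21' stamp to epoch
# seconds in the local time zone. The year is supplied by the caller
# because the log line does not contain one.
sub syslog_epoch {
    my ($line, $year) = @_;
    my ($mon, $day, $h, $m, $s) =
        $line =~ /^(\w{3})\s+(\d{1,2})\s+(\d\d):(\d\d):(\d\d)/
        or return;
    return unless exists $month{$mon};
    return timelocal($s, $m, $h, $day, $month{$mon}, $year);
}

my $year  = 2013;    # assumption: not recorded in the log
my $start = syslog_epoch('Aug  9 16:24:23', $year);
my $end   = syslog_epoch('Aug  9 16:28:00', $year);

my @lines = (
    "Aug  9 16:24:21 linux-447z dbus[685]: [system] Successfully activated\n",
    "Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128\n",
    "Aug  9 16:29:03 linux-447z dbus[685]: [system] Activating service\n",
);

# Keep lines whose stamp parses and falls inside the inclusive range
my @hits = grep {
    my $t = syslog_epoch($_, $year);
    defined $t && $t >= $start && $t <= $end;
} @lines;
print @hits;    # prints only the 16:27:46 line
```

Comparing plain epoch integers keeps the range check trivial, but it is only as trustworthy as the assumed year and time zone.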

But with your example file (a typical syslog file) it gets hairy, because you have neither time zone information nor the year. Syslog files usually use (but do not log) the current local time zone, which can change while logging, especially if you have daylight saving time.

I once worked on a project where the old syslog format (i.e. local time with neither time zone nor year) was a requirement. To determine the time zone and year at the start of the log, we put a special log entry at the top of the new file after each rotation. We also made sure that a log entry was written at least once an hour, so that we could detect the switch to daylight saving time and back. Lots of workarounds for broken time stamping :(

Upvotes: 0

ThisSuitIsBlackNot

Reputation: 24063

You can use Time::Piece (a core module since Perl 5.10) to do date parsing and comparisons:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

use Time::Piece;

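# The format has no year, so the bounds and the log lines are all parsed
# into the same default year and remain comparable with each other
my $format = '%b %e %T';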
my $start = Time::Piece->strptime('Aug  9 16:24:23', $format);
my $end   = Time::Piece->strptime('Aug  9 16:28:00', $format);

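while (<DATA>) {
    # Capture the leading "Mon DD HH:MM:SS" stamp; skip lines without one
    my ($timestamp) = /^(\w+\s+\d+\s+\d\d:\d\d:\d\d)/ or next;
    my $t = Time::Piece->strptime($timestamp, $format);

    # Inclusive range check; the bounds need not occur in the log themselves
    print if $t >= $start && $t <= $end;
}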

__DATA__
Aug  9 16:24:21 linux-447z dbus-daemon[685]:
Aug  9 16:24:21 linux-447z dbus[685]: [system] Activating service
Aug  9 16:24:21 linux-447z dbus-daemon[685]:
Aug  9 16:24:21 linux-447z dbus-daemon[685]: dbus[685]: [system]
Aug  9 16:24:21 linux-447z dbus[685]: [system] Successfully activated
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: leased 192.168.37.128 for 1800 seconds
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding IP address 192.168.37.128/24
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding
Aug  9 16:27:47 linux-447z dhcpcd[3422]: eth0: Failed to lookup
Aug  9 16:27:47 linux-447z ifup:     eth0
Aug  9 16:27:48 linux-447z SuSEfirewall2:
Aug  9 16:29:03 linux-447z dbus[685]: [system] Activating service

Output:

Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: leased 192.168.37.128 for 1800 seconds
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding IP address 192.168.37.128/24
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: adding 
Aug  9 16:27:47 linux-447z dhcpcd[3422]: eth0: Failed to lookup
Aug  9 16:27:47 linux-447z ifup:     eth0      
Aug  9 16:27:48 linux-447z SuSEfirewall2:  
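
For a real log you would read from a file handle rather than DATA. As a sketch only: if the log is in chronological order within a single year (an assumption, not something the format guarantees), you can also stop reading at the first line past the end of the range. The helper name lines_between is hypothetical, and the in-memory file handle below just stands in for an opened log file.

```perl
use strict;
use warnings;
use Time::Piece;

my $format = '%b %e %T';

# Hypothetical helper: return the lines from $fh whose time stamp lies in
# the inclusive range [$start, $end]. Assumes chronological order.
sub lines_between {
    my ($fh, $start, $end) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # Skip lines that do not begin with a "Mon DD HH:MM:SS" stamp
        my ($timestamp) = $line =~ /^(\w+\s+\d+\s+\d\d:\d\d:\d\d)/
            or next;
        my $t = Time::Piece->strptime($timestamp, $format);
        last if $t > $end;                 # sorted log: nothing later can match
        push @hits, $line if $t >= $start;
    }
    return @hits;
}

my $log = <<'END';
Aug  9 16:24:21 linux-447z dbus[685]: [system] Successfully activated
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128
Aug  9 16:27:48 linux-447z SuSEfirewall2:
Aug  9 16:29:03 linux-447z dbus[685]: [system] Activating service
END

# In-memory file handle used here in place of a real log file
open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my @hits = lines_between(
    $fh,
    Time::Piece->strptime('Aug  9 16:24:23', $format),
    Time::Piece->strptime('Aug  9 16:28:00', $format),
);
close $fh;
print @hits;
```

The early exit only pays off on large files, and it gives wrong results if the log wraps around a year boundary or is not sorted.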

Upvotes: 5

amaslenn

Reputation: 805

You can parse each line and store the data in a hash: convert the time to an integer (for your sample data, just remove the ':' characters) and use it as the key. The value should be an array, since you have multiple lines with the same timestamp. Converting the boundary values to numbers in the same way lets you limit the output to the keys that fall between them.
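
A rough sketch of that idea, assuming all lines fall on the same day so that the date part can be ignored (as in the sample data). The variable names are illustrative only.

```perl
use strict;
use warnings;

# Sample lines taken from the question, split so each keeps its newline
my @lines = split /^/m, <<'END';
Aug  9 16:24:21 linux-447z dbus[685]: [system] Successfully activated
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: renewing lease of 192.168.37.128
Aug  9 16:27:46 linux-447z dhcpcd[3422]: eth0: leased 192.168.37.128 for 1800 seconds
Aug  9 16:27:47 linux-447z dhcpcd[3422]: eth0: Failed to lookup
Aug  9 16:29:03 linux-447z dbus[685]: [system] Activating service
END

# Key: HHMMSS as an integer; value: array ref of every line with that time
my %by_time;
for my $line (@lines) {
    my ($h, $m, $s) = $line =~ /^\w{3}\s+\d+\s+(\d\d):(\d\d):(\d\d)/
        or next;
    push @{ $by_time{ "$h$m$s" + 0 } }, $line;
}

# Boundary values converted the same way: 16:24:23 and 16:28:00
my ($from, $to) = (162423, 162800);

# Select the keys inside the range, in time order, and flatten their lines
my @hits = map  { @{ $by_time{$_} } }
           sort { $a <=> $b }
           grep { $_ >= $from && $_ <= $to } keys %by_time;
print @hits;    # prints the three lines stamped 16:27:46 and 16:27:47
```

This holds the whole log in memory, so it suits small files or repeated queries against the same data better than a single pass over a large log.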

Upvotes: 0
