Reputation: 27
I am trying to open a log file, search it against a list of keywords, print every line that contains one of those keywords, and then compress the results file into a .gz.
I have come up with the code below, which starts running with no compilation errors. It writes to the results file, but the script never completes and never finds any results. Any help?
#!/usr/bin/perl
use IO::Uncompress::Gunzip qw($GunzipError);
use IO::Compress::Gzip qw(gzip $GzipError) ;
use diagnostics;
use strict;
use warnings;
my %LOGLINES = ();
my %count = ();
open(FILE, "</data/bro/scripts/Keywords.txt");
my %keywords = map { chomp $_; $_, 1 } <FILE>;
close(FILE);
my $logfile = IO::Uncompress::Gunzip->new( "/data/bro/logs/2016-05-05/http.00:00:00-06:00:00.log.gz" )
or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
open(FILE, "+>Results.txt");
my @results = <FILE>;
foreach my $line ($logfile) {
while (<>) {
my @F=split("\t");
next unless ($F[2] =~ /^(199|168|151|162|166|150)/);
$count{ $F[2] }++;
if ($count{ $F[2] } == 10) {
print @{ $LOGLINES{$F[2]} }; # print all the log lines we've seen so far
print $_; # print the current line
} elsif ($count{ $F[2] } > 10) {
print $_; # print the current line
} else {
push @{ $LOGLINES{$F[2]} }, $_; # store the log line for later use
}
my $flag_found = grep {exists $keywords{$_} } split /\s+/, $line;
print $line if $flag_found;
}
}
IO::Compress::Gzip("results.gz")
or die "IO::Compress::Gunzip failed: $GzipError\n";
close(FILE);
Upvotes: 0
Views: 124
Reputation: 5055
There is probably no need for the while (<>) loop in your script: <> reads from STDIN (or from files named on the command line), so the script just sits waiting for keyboard input.
The $logfile object returned by the IO::Uncompress::Gunzip->new constructor can be used like a normal filehandle, so you can simply loop with while (<$logfile>), like this:
use IO::Uncompress::Gunzip qw($GunzipError);
use IO::Compress::Gzip qw(gzip $GzipError) ;
use strict;
use warnings;
use feature 'say';
#...
my @loglines;
open my $fh, '<', '/data/bro/scripts/Keywords.txt' or die "Cannot open keywords file: $!";
my %keywords = map { chomp; $_ => 0 } <$fh>;
close $fh;
my $logfile = IO::Uncompress::Gunzip->new( "..." )
or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
while (<$logfile>) {
my @line = split /\t/;
next if ! $line[2];
for my $key (keys %keywords) {
if ($line[2] =~ /^$key/) { $keywords{$key}++; push @loglines, $_; say; last }
}
}
# ... pack using gzip
So the @loglines array contains every log line whose third tab-separated field ($line[2]) begins with one of your keywords. The %keywords hash has the keywords as keys and the number of lines each one matched as values.
NOTES (Edit): You can also store the log lines in a hash, where each key is a keyword and each value is an array (or hash) of the matched lines, substrings, or both. I push the matched lines into a single array only as an example. Structure the data however you need, then pack it with gzip in whatever way is convenient.
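The hash-of-arrays idea from the note above could look roughly like this. It is only a sketch: the keyword list, the sample log lines, and the %lines_for name are made up for illustration, and \Q...\E is used on the assumption that the keywords should match literally rather than as regular expressions.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the keyword file and the log.
my @keywords = ('199', '168');
my @log = (
    "ts1\tuid1\t199.10.0.1\tGET\n",
    "ts2\tuid2\t10.0.0.5\tGET\n",
    "ts3\tuid3\t168.1.2.3\tPOST\n",
    "ts4\tuid4\t199.20.0.9\tGET\n",
);

# keyword => array reference of the lines whose third field starts with it
my %lines_for;

for my $line (@log) {
    my @fields = split /\t/, $line;
    next unless defined $fields[2] && length $fields[2];
    for my $key (@keywords) {
        # \Q...\E quotes regex metacharacters so the keyword matches literally.
        if ($fields[2] =~ /^\Q$key\E/) {
            push @{ $lines_for{$key} }, $line;
            last;    # count each line for the first matching keyword only
        }
    }
}

# Report the matches grouped by keyword.
for my $key (sort keys %lines_for) {
    printf "%s: %d line(s)\n", $key, scalar @{ $lines_for{$key} };
    print "  $_" for @{ $lines_for{$key} };
}
```

With this layout the per-keyword count is simply the length of each array, so a separate counter hash is not needed.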
It is also better not to use global bareword filehandles like FILE, because other code may end up using the same name by accident. In addition, check that each filehandle was opened successfully, e.g. with or die as in the example.
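As a small illustration of the filehandle advice, here is a sketch that reads a keyword list through a lexical handle with a checked three-argument open. The temporary file and its contents are invented so the snippet runs on its own; in the real script the path would be the Keywords.txt file from the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical keyword file created just for this example.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "199\n168\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Lexical handle, three-argument open, and an error check that reports $!.
open my $fh, '<', $tmp_name or die "Cannot open $tmp_name: $!";
chomp(my @keywords = <$fh>);    # read all lines and strip the newlines
close $fh;

print "Loaded ", scalar @keywords, " keywords\n";
```

Because $fh is lexical, it is visible only in the enclosing scope and cannot clash with a handle of the same name elsewhere in the program.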
Upvotes: 3
Reputation: 446
IO::Uncompress::Gunzip->new returns an IO::Uncompress::Gunzip object.
foreach my $line ($logfile) {
while (<>) {
...
}
}
makes no sense: the foreach runs exactly once, setting $line to the IO::Uncompress::Gunzip object itself, and the inner while (<>) then waits for keyboard input.
Instead try:
while (my $line = <$logfile>) {
...
}
You are also not using IO::Compress::Gzip correctly. You can create the IO::Compress::Gzip object before you process the logfile and then print to it directly. Something like the following should work:
...
my $z = IO::Compress::Gzip->new("results.gz")
or die "IO::Compress::Gzip failed: $GzipError\n";
while (my $line = <$logfile>) {
my @F=split("\t", $line);
next unless ($F[2] =~ /^(199|168|151|162|166|150)/);
$count{ $F[2] }++;
if ($count{ $F[2] } == 10) {
print $z @{ $LOGLINES{$F[2]} }; # print all the log lines we've seen so far
print $z $line; # print the current line
} elsif ($count{ $F[2] } > 10) {
print $z $line; # print the current line
} else {
push @{ $LOGLINES{$F[2]} }, $line; # store the log line for later use
}
my $flag_found = grep {exists $keywords{$_} } split /\s+/, $line;
print $z $line if $flag_found;
}
$z->close; # finish writing the gzip stream
You should look at the documentation for IO::Uncompress::Gunzip and IO::Compress::Gzip (using perldoc or at cpan.org). Both show examples of the correct usage of these modules.
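To check the write side in isolation, a small round trip along these lines may help. It is a sketch, not the finished script: the temporary directory and the two sample lines are made up, and the real output path would be results.gz as in the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw($GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical scratch location; the real script would write results.gz.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/results.gz";

my @matched = ("first matching line\n", "second matching line\n");

# Write: the compressor object is used like an output filehandle.
my $z = IO::Compress::Gzip->new($path)
    or die "IO::Compress::Gzip failed: $GzipError\n";
print $z $_ for @matched;
$z->close;    # finish the gzip stream before reading it back

# Read back: the decompressor object is used like an input filehandle.
my $in = IO::Uncompress::Gunzip->new($path)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";
my @read_back;
while (my $line = <$in>) {
    push @read_back, $line;
}
$in->close;

print "round trip ok\n" if "@read_back" eq "@matched";
```

Writing matches straight into the compressed stream avoids the intermediate Results.txt file and the separate compression step at the end.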
Upvotes: 1