Reputation: 111
I have to parse multiple log files that look like dmesg output.
Example log file:
....
1399424400 4 abcd 2604 starting job (jobid=1325) for client abc.xyz.com, requesting resources now
RESOURCE_GRANTED 1399424400 DiskVolume=/vol;DiskPool=pool1;Path=/mypath;Server=qwer.poil.com;
....
I need to print the jobid, client, disk volume, disk pool, etc. to an output file, one line per log file, so the output file will look like:
1325 abc.xyz.com /vol pool1 /mypath qwer.poil.com
<file2 info>
<file3 info>
.....
I tried doing this to get the jobid:
if (@grepres = grep { /jobid/ } <TRY>) {
    @splitres = split(' ', $grepres[0]);
    $jobid = $splitres[1];
    $jobid =~ s/\D//g;
}
Here TRY is the filehandle.
But it only returns the first number in the line, i.e. the timestamp.
How do I get the client name or the Server name?
Is Perl appropriate for this?
Upvotes: 0
Views: 329
Reputation: 65
A Perl regex is a good fit for this. Since it is a log file, the format should stay the same, so a single regex can capture all of the fields. The script below should help.
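#!/usr/bin/perl
use strict;
use warnings;

open(my $in,  '<', 'test')  or die "cannot open test file: $!";
open(my $out, '>', 'test1') or die "cannot open test1 file: $!";

# The job line and the RESOURCE_GRANTED line are separate lines in the log,
# so read the whole file into one string and let . match newlines (/s).
my $log = do { local $/; <$in> };

while ($log =~ /jobid=(\d+).*?client\s+(\w+\.\w+\.\w+).*?DiskVolume=(\/\w+).*?DiskPool=(\w+).*?Path=(\/\w+).*?Server=(\w+\.\w+\.\w+)/sg)
{
    print $out "$1 $2 $3 $4 $5 $6\n";
}
close($in);
close($out);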
The output I obtained is:
[root@server perl]# cat test1
1325 abc.xyz.com /vol pool1 /mypath qwer.poil.com
Upvotes: 0
Reputation: 126772
You should pull all of the data you need from each file into a hash before reformatting it.
This program starts with a list of the field names that you want to appear in the output, and builds a regex that matches those fields followed by their values.
Then all that is necessary is to find all occurrences of that pattern in all of the lines of the file and add them to the hash.
There is a final check to make sure that all of the required fields are in the hash, and then the contents are printed as a simple hash slice.
Please ask if any of this is unclear to you.
use strict;
use warnings;

my @names = qw/ jobid client DiskVolume DiskPool Path Server /;
my @files = qw/ dmesg1.txt dmesg2.txt dmesg3.txt /;

# Match any field name, then whitespace or '=', then the value
my $re = join '|', @names;
$re = qr{ \b($re)\b [\s=]+ ([\w./]+) }x;

for my $filename ( @files ) {

    open my $fh, '<', $filename or do {
        warn "Can't open '$filename' for reading: $!";
        next;
    };

    # Collect every name/value pair found anywhere in the file
    my %data;
    while ( my $line = <$fh> ) {
        $data{$1} = $2 while $line =~ /$re/g;
    }

    # Skip this file if any required field was not found
    if ( my @missing = grep { not exists $data{$_} } @names ) {
        warn sprintf 'Missing %s "%s" from file "%s"',
            @missing == 1 ? 'field' : 'fields',
            join(', ', @missing),
            $filename;
        next;
    }

    # Hash slice: values in the order given by @names
    print "@data{@names}\n";
}
output
1325 abc.xyz.com /vol pool1 /mypath qwer.poil.com
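To see what the pattern picks up without needing any files, here is a minimal sketch that runs the same regex over the two sample lines from the question, held in an array. The sub name extract_fields is made up for this illustration and is not part of the program above.

```perl
use strict;
use warnings;

my @names = qw/ jobid client DiskVolume DiskPool Path Server /;
my $alt   = join '|', @names;
my $re    = qr{ \b($alt)\b [\s=]+ ([\w./]+) }x;

# extract_fields is a hypothetical helper: it returns a hash of
# field name => value for every match found in the given lines.
sub extract_fields {
    my @lines = @_;
    my %data;
    for my $line (@lines) {
        # /g in scalar context walks through every match on the line
        $data{$1} = $2 while $line =~ /$re/g;
    }
    return %data;
}

# The two sample lines from the question
my @sample = (
    '1399424400 4 abcd 2604 starting job (jobid=1325) for client abc.xyz.com, requesting resources now',
    'RESOURCE_GRANTED 1399424400 DiskVolume=/vol;DiskPool=pool1;Path=/mypath;Server=qwer.poil.com;',
);

my %data = extract_fields(@sample);
print "$_ => $data{$_}\n" for @names;
```

The value part of the pattern, [\w./]+, stops at the comma after the client name and at each semicolon, which is why no trimming is needed afterwards.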
Upvotes: 1
Reputation: 8323
If the lines always have the same format, you can use a foreach loop and split each line as you did, then use the resulting array to access each of the fields you want. Try this.
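my @logfile = <TRY>;
close TRY;
my ($jobid, $client);
foreach my $line (@logfile) {
    chomp $line;                     # remove trailing newline
    next if $line =~ /^\s*$/;        # skip blank lines
    next unless $line =~ /jobid=/;   # only the job line carries these fields
    my @splitres = split(' ', $line);
    # In the sample line, index 6 is "(jobid=1325)" and index 9 is "abc.xyz.com,"
    $jobid = $splitres[6];
    $jobid =~ s/\D//g;               # keep the digits only
    $client = $splitres[9];
    $client =~ s/,$//;               # drop the trailing comma
    # and so on with the remaining fields...
}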
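For the remaining fields, the RESOURCE_GRANTED line is a list of Key=Value pairs separated by semicolons, so split can handle that too. Here is a rough sketch, assuming the line always has the shape shown in the question and that values contain no spaces; parse_resources is a name made up for this example.

```perl
use strict;
use warnings;

# parse_resources is a hypothetical helper. It assumes the line looks like
# "RESOURCE_GRANTED <timestamp> Key=Value;Key=Value;..." as in the question.
sub parse_resources {
    my ($line) = @_;

    # Limit of 3 keeps the whole Key=Value list together in $pairs
    my ($tag, $timestamp, $pairs) = split ' ', $line, 3;
    return unless defined($pairs) && $tag eq 'RESOURCE_GRANTED';

    my %res;
    for my $pair (split /;/, $pairs) {
        my ($key, $value) = split /=/, $pair, 2;
        $res{$key} = $value if defined $value;
    }
    return %res;
}

my %res = parse_resources(
    'RESOURCE_GRANTED 1399424400 DiskVolume=/vol;DiskPool=pool1;Path=/mypath;Server=qwer.poil.com;'
);
print join(' ', @res{qw/ DiskVolume DiskPool Path Server /}), "\n";
```

Inside the loop above you would chomp the line first and call this only for lines that start with RESOURCE_GRANTED; any other line gives back an empty hash.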
Upvotes: 1