rahuL

Reputation: 3420

Remove certain lines from file

I saved the output of an RKHunter run to a file, which looks like this:

[04:59:24] 55808 Trojan - Variant A [ Not found ]
[04:59:24]
[04:59:24] ADM Worm [ Not found ]
[04:59:24]
[04:59:25] AjaKit Rootkit [ Not found ]
[04:59:25]

I've tried to filter out the unwanted output and this is my code thus far:

open(my $fh, '<', $this_log) or die "Cannot open $this_log: $!";
while (my $line = <$fh>)
{
        chomp $line;
        $line_ctr++;
        next if $line_ctr < 55;                    # skip the first 54 lines
        next if index($line, "Checking") != -1;    # skip "Checking" lines
        next if index($line, "Info") != -1;        # skip "Info" lines
        print "$line<br>";
}
close $fh;

Notice the alternate lines which contain nothing. How do I go about removing them?

Upvotes: 0

Views: 81

Answers (4)

Kenosis

Reputation: 6204

You can keep lines with alpha characters (messages) using the following:

use strict;
use warnings;

while (<>) {
    print if /[a-z]+/i;
}

Command-line usage: perl script.pl inFile [>outFile]

The last, optional parameter directs output to a file.

Output on your dataset:

[04:59:24] 55808 Trojan - Variant A [ Not found ]
[04:59:24] ADM Worm [ Not found ]
[04:59:25] AjaKit Rootkit [ Not found ]

Or you can use the following one-liner, which edits the file in place and creates a backup (inFile.bak) of the original:

perl -i.bak -ne 'print if /[a-z]+/i;' inFile

Hope this helps!

Upvotes: 1

chooban

Reputation: 9256

It's a bit crude, but you could incorporate a split into the processing:

#! /usr/bin/perl

use strict;
use warnings;
use 5.010;    # enables say

my @input = ( '[04:59:24] 55808 Trojan - Variant A [ Not found ]', '[04:59:24]', '[04:59:24] ADM Worm [ Not found ]', );

foreach my $line ( @input ) {
  my @splits = split( / /, $line );

  if ( scalar( @splits ) == 1 ) {
    say "That must have been an empty line ($line)";
  }
}
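
The loop above only reports the timestamp-only lines. To actually drop them, the same field count can be used as a filter. This is a minimal sketch, not a definitive implementation: the sample data is copied from the question, the @kept array is a name made up for illustration, and split ' ' (the special whitespace mode) is swapped in for / / so that runs of spaces do not produce empty fields.

```perl
use strict;
use warnings;

# Sample lines copied from the question; the last one has a trailing space.
my @input = (
    '[04:59:24] 55808 Trojan - Variant A [ Not found ]',
    '[04:59:24]',
    '[04:59:24] ADM Worm [ Not found ]',
    '[04:59:24] ',
);

my @kept;    # hypothetical name: lines that carry a message
foreach my $line (@input) {
    # split ' ' splits on any run of whitespace and ignores leading whitespace
    my @fields = split ' ', $line;

    # A timestamp-only line yields exactly one field, so keep the rest
    push @kept, $line if @fields > 1;
}

print "$_\n" for @kept;
```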

Upvotes: 0

slayedbylucifer

Reputation: 23512

filter out the unwanted output

If that is all you want and you are OK with sed, here is a not-so-polished but working solution:

sed -r -i '/^\[[0-9]{2}:[0-9]{2}:[0-9]{2}\]$/d' <your_file_name>

And here is the Perl way:

perl -i -n -e 'print unless /^\[\d{2}:\d{2}:\d{2}\]$/' <your_file_name>
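
The same pattern could also be used inside a script rather than as a one-liner. A rough sketch under a few assumptions: the helper name is_timestamp_only is hypothetical, the sample lines come from the question, and the \s* is an addition that tolerates trailing whitespace, which the one-liner above does not.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a line holds nothing but a [HH:MM:SS] stamp.
# The \s* allows trailing spaces or a leftover newline (an assumption about the log).
sub is_timestamp_only {
    my ($line) = @_;
    return $line =~ /^\[\d{2}:\d{2}:\d{2}\]\s*$/ ? 1 : 0;
}

# Sample lines copied from the question; the last one has a trailing space.
my @lines = (
    '[04:59:24] 55808 Trojan - Variant A [ Not found ]',
    '[04:59:24]',
    '[04:59:25] AjaKit Rootkit [ Not found ]',
    '[04:59:25] ',
);

# Keep only the lines that carry a message.
my @kept = grep { !is_timestamp_only($_) } @lines;
print "$_\n" for @kept;
```

In the loop from the question, this would amount to one more "next if is_timestamp_only($line);" before the print.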

Upvotes: 1

Guntram Blohm

Reputation: 9819

They don't contain nothing; they contain the [time] part. Try something like if (length($line) == 10) { next; }. You might need to change the 10 to 11, depending on whether there's an invisible trailing space after the time.
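
One way to avoid guessing between 10 and 11 is to strip trailing whitespace before measuring. A minimal sketch, assuming the timestamp always has the form [HH:MM:SS] (10 characters); the sample lines are from the question and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample lines copied from the question; the last one has a trailing space.
my @lines = (
    '[04:59:24] 55808 Trojan - Variant A [ Not found ]',
    '[04:59:24]',
    '[04:59:24] ADM Worm [ Not found ]',
    '[04:59:24] ',
);

my @kept;    # hypothetical name: lines that survive the filter
for my $line (@lines) {
    # Work on a copy with trailing whitespace removed
    (my $trimmed = $line) =~ s/\s+$//;

    # "[HH:MM:SS]" is exactly 10 characters, so such a line has no message
    next if length($trimmed) == 10;

    push @kept, $trimmed;
}

print "$_\n" for @kept;
```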

Upvotes: 1
