Andy

Reputation: 27

Extract the IP from a log file

I am writing code that extracts all the IP addresses from a log file. (The log file contains a list of domain names, IP addresses and MAC addresses.) Here's my code:

open(CONF, '<', 'dhcpd.conf') or die "\n";
my @ip;

while(my $line = <CONF> ) {
    if ( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ ) {
        @ip = $1;
    }

    print "@ip,\n";
}

close CONF;

The problem is that each IP address is printing 5 times. The output looks like:

10.0.0.158
10.0.0.158
10.0.0.158
10.0.0.158
10.0.0.158
10.0.0.159
10.0.0.159
10.0.0.159
10.0.0.159
10.0.0.159
...

Is the problem at @ip = $1, or is it somewhere else?

Upvotes: 0

Views: 1524

Answers (4)

KBSR

Reputation: 1

Use a hash instead of an array. Try the code below:

my $ips;
open(CONF, '<', 'dhcpd.conf') or die "Error: $!";
while(my $line = <CONF> ) {
    if ( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ ) {
        $ips->{$1} = 1;
    }
}
close CONF;
my $all_ips = join("\n", keys %{$ips});
print $all_ips;

Upvotes: -1

Matt Jacob

Reputation: 6553

You've got several problems, but the main one seems to be that you're printing the contents of @ip regardless of whether the line matches. Because @ip keeps its previous value, every non-matching line prints the last address found again. If you just want to use your script as a filter and print IP addresses as you find them, this is a better way to express that:

perl -ne 'print "$1\n" if /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/' dhcpd.conf

Or the equivalent code that's not a one-liner:

use strict;
use warnings;

while (<>) {
    next unless /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
    print "$1\n";
}

Which you would run like this:

$ perl script.pl dhcpd.conf

If you want to save every IP address you find and do something with them later, you'd push onto an array:

my @ips;

while (<>) {
    next unless /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
    push(@ips, $1);
}

# doing something else...

for (@ips) {
    print "$_\n";
}

If you only want unique IP addresses throughout the file, you'd use a hash:

my %ips;

while (<>) {
    next unless /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
    $ips{$1} = 1;
}

for (keys(%ips)) {
    print "$_\n";
}

Upvotes: 2

Andy

Reputation: 316

It's probably because those IPs appear more than once in the log, so this is expected.

If you were doing this from the command line for, say, a standard Apache log, you would get similar output with:

cat log | awk '{print $1}' | sort | sort -nr -k 1 | head

That's not exactly tidy, but it will do for demo purposes; you could then use uniq between the two sorts to remove duplicates. You'll need to do something similar in your script.

The List::MoreUtils module (https://metacpan.org/pod/List::MoreUtils) would do this easily:

use List::MoreUtils qw(uniq);

my @ip = qw(ip1 ip2 ip3);
@ip = uniq @ip;    # reassign; a second "my" would redeclare @ip

If you don't want to use that module, you could create a sub like:

sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

my @ip = qw(ip1 ip2 ip3);
@ip = uniq(@ip);    # reassign; a second "my" would redeclare @ip

See perlfaq4 for more information on both these methods.

Upvotes: 0

jcaron

Reputation: 17710

I'm not quite sure why you are using an array (@ip) to store a scalar, and the output you show (without the trailing comma) does not match your script, but the reason each address shows up multiple times is most likely that it appears multiple times in the log file.

If you want to skip consecutive appearances of the same address, you'll need to remember the last one seen, and not display any IP address that matches the last one seen.
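A minimal sketch of that "remember the last one seen" idea might look like the following. The helper name ips_without_consecutive_repeats and the sample lines are made up for illustration; they are not taken from your file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the addresses found in the given lines,
# dropping an address when it equals the one matched immediately before it.
sub ips_without_consecutive_repeats {
    my @lines = @_;
    my @found;
    my $last = '';    # last address seen; empty before the first match

    for my $line (@lines) {
        next unless $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
        my $ip = $1;
        next if $ip eq $last;    # same as the previous match: skip it
        $last = $ip;
        push @found, $ip;
    }
    return @found;
}

# Made-up sample input (an assumption, not the asker's data).
my @consecutive_sample = (
    "fixed-address 10.0.0.158;\n",
    "fixed-address 10.0.0.158;\n",
    "fixed-address 10.0.0.159;\n",
    "fixed-address 10.0.0.158;\n",
);
print "$_\n" for ips_without_consecutive_repeats(@consecutive_sample);
```

With the sample lines above, this prints 10.0.0.158, then 10.0.0.159, then 10.0.0.158 again, because only back-to-back repeats are dropped. Lines without an address are ignored and do not reset the remembered value.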

If you want to only show any single IP address once, you'll need to store all the addresses in a hash (as keys), and then enumerate the keys of the hash. Or just use the hash to remember IPs you've already seen (and printed).
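The second variant (using the hash only to remember what has already been seen) could be sketched roughly like this. The helper name unique_ips_in_order and the sample lines are invented for the example; unlike enumerating the hash keys, this keeps the addresses in first-seen order.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each address found in the given lines once,
# in the order of its first appearance.
sub unique_ips_in_order {
    my @lines = @_;
    my (%seen, @unique);

    for my $line (@lines) {
        next unless $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
        my $ip = $1;
        # $seen{$ip}++ is false only the first time this address turns up.
        push @unique, $ip unless $seen{$ip}++;
    }
    return @unique;
}

# Made-up sample input (an assumption, not the asker's data).
my @unique_sample = (
    "fixed-address 10.0.0.158;\n",
    "fixed-address 10.0.0.159;\n",
    "fixed-address 10.0.0.158;\n",
);
print "$_\n" for unique_ips_in_order(@unique_sample);
```

For the sample lines this prints 10.0.0.158 and 10.0.0.159, each once.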

Upvotes: 0
