user3185596

Reputation: 79

Match an IP address that has whitespace or a carriage return in the middle

I'm working on a Perl script where users run tracert from the command line, copy the output, and submit it to a Perl CGI script, which saves it as a file.

Then I read the file one line at a time and extract the IP address with a Perl regex.

The problem is that the tracert output for some hops is too long, so the line wraps and the IP address is split into two parts (with CR and LF in between), as in the example below. My regex match then fails.

4 2 ms 1 ms 1 ms routers.static-ABC.XYZ.net.in [165

.112.109.61]

I'm looking for a solution: either remove all the whitespace from the file, or find a regex that matches an IP address with CR and LF in the middle, split across two lines.

If my program could check each line for "]" and, when it is missing and the line ends in whitespace (CR and LF), remove that whitespace and join the line with the next one, that would solve my problem.

Below is part of the script:

my @array;
open(my $fh, "<", "trace.txt")
or die "Failed to open file: $!\n";


while (my $line = <$fh>) {
    match_ip($line);
}
close $fh;

# Print the first dotted-quad IPv4 address found in the given line.
sub match_ip {
    my ($line) = @_;

    if ($line =~ m/([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/) {
        my $ip = "$1.$2.$3.$4";
        print "$ip\n";
    }
}

Upvotes: 0

Views: 945

Answers (5)

Miller

Reputation: 35198

It appears that the tracert output for each hop always starts with spacing, followed by the hop number, followed by more spacing. Even when a line wraps, this can be used to detect the start of a hop, because no wrapped continuation line will contain a number surrounded by spacing at its start.

At the other end of the line, the IP address is the final entry, and it may or may not be enclosed in brackets []. It's also possible for the final entry to simply say 'Request timed out.' instead of an IP.

Given this info, I constructed this script, which works with both IPv4 and IPv6 data:

use strict;
use warnings;

my $hop = '';

while (<DATA>) {
    chomp;

    # Start or End of new hop
    if ((my $startHop = $_ =~ /^\s+\d+\s+/) || /Trace complete\./) {
        if ($hop =~ /^\s+(\d+).*\s\[?(\S+?)\]?$/) {
            print "IP = $1 - $2\n";
        }

        $hop = $startHop ? $_ : '';
    } elsif ($hop) {
        $hop .= $_;
    }
}

if ($hop =~ /^\s+(\d+).*\s\[?(\S+?)\]?$/) {
    print "IP = $1 - $2\n";
}

__DATA__
Tracing route to ds-any-fp3-real.wa1.b.yahoo.com [98.138.253.109]
over a maximum of 30 hops:

  1     3 ms     1 ms     1 ms  READYSHARE [192.168.1.1]
  2    33 ms    27 ms    29 ms  c-67-164-32-1.hsd1.ca.comcast.net [67.164.32.1]

  3    16 ms    11 ms    10 ms  te-7-6-ur02.oakland.ca.sfba.comcast.net [68.85.2
17.169]
  4    13 ms    12 ms    11 ms  te-0-2-0-6-ar01.oakland.ca.sfba.comcast.net [68.
87.194.230]
  5    13 ms    15 ms    21 ms  be-100-ar01.sfsutro.ca.sfba.comcast.net [68.85.1
55.18]
  6    22 ms    31 ms    24 ms  he-3-8-0-0-cr01.sanjose.ca.ibone.comcast.net [68
.86.94.85]
  7    25 ms    23 ms    23 ms  50.242.148.34
  8    86 ms    87 ms    86 ms  vlan90.csw4.SanJose1.Level3.net [4.69.152.254]
  9   111 ms    92 ms    84 ms  ae-92-92.ebr2.SanJose1.Level3.net [4.69.153.29]

 10   118 ms    85 ms    86 ms  ae-3-3.ebr1.Denver1.Level3.net [4.69.132.58]
 11    89 ms   262 ms    85 ms  ae-1-100.ebr2.Denver1.Level3.net [4.69.151.182]

 12    88 ms    92 ms    86 ms  ae-3-3.ebr1.Chicago2.Level3.net [4.69.132.62]
 13    87 ms    91 ms    88 ms  ae-1-51.edge3.Chicago3.Level3.net [4.69.138.136]

 14    92 ms    87 ms   115 ms  YAHOO-INC.edge3.Chicago3.Level3.net [4.53.96.158
]
 15   105 ms   102 ms   104 ms  ae-7.pat1.nez.yahoo.com [216.115.104.124]
 16   117 ms   120 ms   101 ms  ae-1.msr1.ne1.yahoo.com [216.115.100.5]
 17   177 ms   103 ms   108 ms  xe-7-0-0.clr2-a-gdc.ne1.yahoo.com [98.138.0.27]

 18   104 ms   104 ms   103 ms  et-18-25.fab7-1-gdc.ne1.yahoo.com [98.138.93.11]

 19   112 ms   104 ms   104 ms  po-16.bas2-7-prd.ne1.yahoo.com [98.138.240.34]
 20   104 ms   185 ms   114 ms  ir1.fp.vip.ne1.yahoo.com [98.138.253.109]

Trace complete.
##########
Tracing route to www.google.com [2607:f8b0:4003:c06::68]
over a maximum of 30 hops:

  1   466 ms   311 ms   132 ms  dsldevice6.att.net [2602:301:7766:d3a0:5ccc:b9ff
:fedb:9ea0]
  2   275 ms     *      240 ms  2602:300:c533:1510::6
  3   101 ms   469 ms   254 ms  sj2ca405me3.ipv6.att.net [2001:1890:ff:ffff:12:1
22:119:193]
  4     *        *        *     Request timed out.
  5    93 ms    45 ms    59 ms  2001:4860::1:0:7ea
  6   363 ms    72 ms    53 ms  2001:4860::8:0:6117
  7   292 ms   224 ms   189 ms  2001:4860::8:0:3427
  8    99 ms    77 ms     *     2001:4860::8:0:2c9d
  9   251 ms   167 ms   246 ms  2001:4860::8:0:64c7
 10   196 ms   274 ms   248 ms  2001:4860::2:0:5bab
 11     *        *        *     Request timed out.
 12   270 ms   202 ms   362 ms  2607:f8b0:4003:c06::68

Outputs

IP = 1 - 192.168.1.1
IP = 2 - 67.164.32.1
IP = 3 - 68.85.217.169
IP = 4 - 68.87.194.230
IP = 5 - 68.85.155.18
IP = 6 - 68.86.94.85
IP = 7 - 50.242.148.34
IP = 8 - 4.69.152.254
IP = 9 - 4.69.153.29
IP = 10 - 4.69.132.58
IP = 11 - 4.69.151.182
IP = 12 - 4.69.132.62
IP = 13 - 4.69.138.136
IP = 14 - 4.53.96.158
IP = 15 - 216.115.104.124
IP = 16 - 216.115.100.5
IP = 17 - 98.138.0.27
IP = 18 - 98.138.93.11
IP = 19 - 98.138.240.34
IP = 20 - 98.138.253.109
IP = 1 - 2602:301:7766:d3a0:5ccc:b9ff:fedb:9ea0
IP = 2 - 2602:300:c533:1510::6
IP = 3 - 2001:1890:ff:ffff:12:122:119:193
IP = 4 - out.
IP = 5 - 2001:4860::1:0:7ea
IP = 6 - 2001:4860::8:0:6117
IP = 7 - 2001:4860::8:0:3427
IP = 8 - 2001:4860::8:0:2c9d
IP = 9 - 2001:4860::8:0:64c7
IP = 10 - 2001:4860::2:0:5bab
IP = 11 - out.
IP = 12 - 2607:f8b0:4003:c06::68

Note that the IP occasionally shows as "out."; this happens when the request timed out for that hop.
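
If the "out." entries are unwanted, one option is to check for the timeout text before extracting the address. Below is a minimal sketch of that idea; `parse_hop` is a hypothetical helper name that is not part of the script above, and it assumes the hop has already been joined into a single string, as the script does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): returns (hop, ip)
# for a joined hop line, or an empty list when the hop timed out or the
# line does not look like a hop.
sub parse_hop {
    my ($hop) = @_;
    return if $hop =~ /Request timed out\.\s*$/;
    return $hop =~ /^\s+(\d+).*\s\[?(\S+?)\]?$/ ? ($1, $2) : ();
}

for my $line (
    '  7    25 ms    23 ms    23 ms  50.242.148.34',
    '  4     *        *        *     Request timed out.',
) {
    if (my ($n, $ip) = parse_hop($line)) {
        print "IP = $n - $ip\n";
    }
}
```

With these two sample lines, only hop 7 is printed; the timed-out hop is skipped.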

Upvotes: 0

Kenosis

Reputation: 6204

Since it's a small file, consider slurping in the entire file, globally capturing all the IPs between the [ and ], and then removing any whitespace in those captures. The subroutine getIPsFromFile below does this:

use strict;
use warnings;

my @IPs = getIPsFromFile('trace.txt');

print "$_\n" for @IPs;

sub getIPsFromFile {
    my ($file) = @_;

    my $contents = do {
        local $/;
        open my $fh, '<', $file or die $!;
        <$fh>;
    };

    return map { s/\s+//g; $_ } $contents =~ /\[(.+?)\]/gs;
}

Hope this helps!

Upvotes: 0

Casimir et Hippolyte

Reputation: 89547

A line-by-line solution:

#!/usr/bin/perl

use strict;
use warnings;

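while (<DATA>) {
    # Whole IP on one line; tolerate trailing whitespace after the bracket
    if (/(\d+(?:\.\d+){3})\]\s*$/) {
        print $1 . "\n";
    } else {
        my $tmp;
        # Line ends with the first part of a wrapped IP
        if (/\[([\d.]+)\s*$/) {
            $tmp = $1;
        } else { next; }
        # Collect the remaining parts until the closing bracket
        while (<DATA>) {
            if (/([\d.]+)\]\s*$/) {
                $tmp .= $1;
                print $tmp . "\n";
                last;
            } elsif (/([\d.]+)\s*$/) {
                $tmp .= $1;
            }
        }
    }
}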


__DATA__
 1    24 ms    22 ms    22 ms  1.32.202.62.cust.bluewin.ch [62.202.32.1]
 2    22 ms    24 ms    22 ms  1.32.202.62.cust.bluewin.ch [62.202.32.1]
 3    24 ms    23 ms    22 ms  net481.bwrt2zhb.bluewin.ch [195.186

 .121.1]
 4   314 ms   162 ms    22 ms  net125.bwrt1inb.bluewin.ch [195.186.125.71]
 5    34 ms    23 ms    24 ms  if114.ip-plus.bluewin.ch [195.186.0.114]
 6    27 ms    29 ms    29 ms  i68geb-005-gig4-2.bb.ip-plus.net [138.1

 87.1

 30.158]
 7    39 ms    39 ms    38 ms  i00par-005-pos4-0.bb.ip-plus.net [138.187.129.34]
 8    38 ms   320 ms    39 ms  feth2-kara-ielo.freeix.net [213.228.3.203]
 9   284 ms    39 ms    39 ms  feth0-bestelle.tlcy.fr.core.ielo.net [212.85.144.6]
10    90 ms   158 ms    83 ms  chloe.wikimedia.org [212.85.150.132] 

Upvotes: 0

Ron Bergin

Reputation: 1068

If you're looping over the file line-by-line and the IP is split across 2 lines, none of the proposed regex solutions will work. You could read the data in paragraph mode, in which case all hops will be in a single string.

Personally, I'd probably do this a little differently: I'd split each line into an array. The Windows tracert command uses 8 to 9 fields for each hop. If the array has fewer than 8 fields, then the line was "wrapped" and should be appended to the previous line before parsing out the IP. This approach requires you to process 2 lines at a time in order to detect the line wrapping and join the line back together.
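
A minimal sketch of the field-counting idea follows. The names `join_wrapped_lines` and `hop_ip` are hypothetical, and the sketch assumes all three probes of a hop answered, so a complete hop line has at least 8 fields; hops containing `*` timeouts have fewer fields and would need an extra check. As an added safeguard that is not part of the description above, a short line is only appended when the previous hop still has an unclosed "[".

```perl
use strict;
use warnings;

# Join wrapped tracert lines back onto the hop they belong to.
# Assumption: a complete hop line has at least 8 whitespace-separated fields.
sub join_wrapped_lines {
    my @lines = @_;
    my @hops;
    for my $line (@lines) {
        $line =~ s/\s+\z//;                 # drop trailing CR/LF and spaces
        next unless $line =~ /\S/;          # skip blank lines
        my @fields = split ' ', $line;
        if (@fields >= 8) {
            push @hops, $line;              # a hop line (possibly cut short)
        } elsif (@hops && $hops[-1] =~ /\[[^\]]*\z/) {
            $line =~ s/^\s+//;
            $hops[-1] .= $line;             # continuation of a wrapped IP
        }
    }
    return @hops;
}

# The IP is the last field of a joined hop; strip the optional brackets.
sub hop_ip {
    my ($hop) = @_;
    my $last = (split ' ', $hop)[-1];
    $last =~ tr/[]//d;
    return $last;
}

my @sample = (
    "  3    16 ms    11 ms    10 ms  te-7-6-ur02.oakland.ca.sfba.comcast.net [68.85.2\r\n",
    "17.169]\r\n",
);
print hop_ip($_), "\n" for join_wrapped_lines(@sample);
```

Lines such as the "Tracing route to" header and "Trace complete." have fewer than 8 fields and do not follow an unclosed bracket, so the sketch ignores them.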

Upvotes: 0

Hajjat

Reputation: 300

Does it only happen at dots? In such a case you can do:

if ($_ =~ m/([0-9]{1,3})\s*\.\s*([0-9]{1,3})\s*\.\s*([0-9]{1,3})\s*\.\s*([0-9]{1,3})/)
{
    my $ip = "$1.$2.$3.$4";
    print "$ip\n";
}

Or you can rely on the fact that IPs appear between square brackets (if that is always the case, as in the example you show; if not, you can still do it, but filter out non-IPs at the end):

$_ =~ /\[(.*?\s*?.*?)\]/;  # extract everything between square brackets
my $ip = $1;     # take that IP
$ip =~ s/\s//g; # remove possible white spaces from the IP
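
For the "filter out non-IPs" step, a minimal sketch could look like the following. It assumes the whole trace is available as one string and handles IPv4 only; `bracketed_ipv4` is a hypothetical helper name, and the range check on each octet is an addition for illustration.

```perl
use strict;
use warnings;

# Return the bracketed captures from $text that look like IPv4 addresses.
# Whitespace inside the brackets (including CR/LF from wrapping) is removed first.
sub bracketed_ipv4 {
    my ($text) = @_;
    my @ips;
    while ($text =~ /\[(.*?)\]/gs) {
        (my $candidate = $1) =~ s/\s+//g;
        my @octets = $candidate =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
            or next;                          # not a dotted quad
        next if grep { $_ > 255 } @octets;    # octet out of range
        push @ips, $candidate;
    }
    return @ips;
}

my $trace = "4 2 ms 1 ms 1 ms routers.static-ABC.XYZ.net.in [165\r\n\r\n.112.109.61]\r\n";
print "$_\n" for bracketed_ipv4($trace);
```

The `/s` modifier lets `.` match the line breaks inside the brackets, so the split address from the question is captured in one piece before the whitespace is stripped.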

Upvotes: 1
