Hedley Phillips

Reputation: 389

Perl: pattern match a string and then print next line/lines

I am using Net::Whois::Raw to query a list of domains from a text file and then parse through this to output relevant information for each domain.

It was all going well until I hit Nominet results, where the information I need is never on the same line as the text I am pattern matching on.

For instance:

Name servers:
ns.mistral.co.uk 195.184.229.229

So what I need to do is pattern match for "Name servers:" and then display the next line or lines, but I just can't manage it.

I have read through all of the answers on here, but they either don't seem to work in my case or confuse me even further, as I am a simple bear.

The code I am using is as follows:

    while ($record = <DOMAINS>) {
        $domaininfo = whois($record);

        if ($domaininfo =~ m/Name servers:(.*?)\n/) {
            print "Nameserver: $1\n";
        }
    }

I have tried an example from Stack Overflow where

<DOMAINS>;

will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.

EDIT: Forgot to say thanks! how rude.

Upvotes: 0

Views: 3782

Answers (2)

canavanin

Reputation: 2719

This is half a question and perhaps half an answer (the question's in here as I am not yet allowed to write comments...). Okay, here we go:

Name servers:
ns.mistral.co.uk 195.184.229.229

Is this what an entry in the file you're parsing looks like? What will follow immediately afterwards - more domain names and IP addresses? And will there be blank lines in between?

Anyway, I think your problem may (in part?) be related to reading the input line by line. Once you get to the IP address line, the information that 'Name servers:' was present is gone, and a multi-line pattern cannot help when each record holds only one line. Thus I'd recommend switching to paragraph mode:

{
   local $/ = ''; # one paragraph instead of one line constitutes a record
   while ($record = <DOMAINS>) {
      # $record will now contain all consecutive lines that were NOT separated
      # by blank lines; once there are >= 1 blank lines $record will have a
      # new value

      # do stuff, e.g. pattern matching
   }
}

But then you said

I have tried an example from Stack Overflow where <DOMAINS>; will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.

so maybe you've already tried what I have just suggested? An alternative would be to just add another variable ($indicator or whatever) which you'll set to 1 once 'Name servers:' has been read, and as long as it's equal to 1 all following lines will be treated as containing the data you need. Whether this is feasible, however, depends on you always knowing what else your data file contains.
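The indicator idea could be sketched roughly as follows. This assumes the whois text is already held in one string and that the name server list ends at the first blank line; both are assumptions about your data. $sample, the second server entry and the name_servers sub are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the text returned by whois().
my $sample = <<'END';
Registrar:
    Example Registrar Ltd

Name servers:
    ns.mistral.co.uk 195.184.229.229
    ns2.example.net 192.0.2.1

WHOIS lookup made at 12:00
END

# Return every non-blank line that follows "Name servers:",
# stopping at the first blank line (assumed to end the section).
sub name_servers {
    my ($text) = @_;
    my $in_section = 0;    # the "indicator" flag
    my @servers;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*Name servers:/) {
            $in_section = 1;
            next;
        }
        next unless $in_section;
        last if $line !~ /\S/;       # blank line: section is over
        $line =~ s/^\s+|\s+$//g;     # trim indentation and trailing space
        push @servers, $line;
    }
    return @servers;
}

print "Nameserver: $_\n" for name_servers($sample);
```

If your records do not end the list with a blank line, the `last if` test is the part you would need to change.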

I hope something in here has been helpful to you. If there are any questions, please ask :)

Upvotes: 1

David W.

Reputation: 107090

So, the $domaininfo string contains your domain?

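You don't need to read another line for this. $domaininfo is a single multi-line string, so you can match the \n character directly in your regular expression. (The m modifier at the end of the pattern only changes ^ and $ so that they match at the start and end of each line; it is optional for a pattern like this one.) This works for me: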

my $domaininfo =<<DATA;
Name servers:
ns.mistral.co.uk 195.184.229.229
DATA

# Only print if the pattern matched; otherwise $1 and $2 are not set by this match.
if ($domaininfo =~ m/Name servers:\n(\S+)\s+(\S+)/m) {
    print "Server name = $1\n";
    print "IP Address = $2\n";
}

Now, I can match the \n at the end of the Name servers: line and capture the name and IP address, which are on the next line.

This might have to be munged a bit to get it to work in your situation.
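Since you asked about the next line or lines, one possible adaptation is to capture the whole block of non-blank lines after the heading and then pick each line apart. This is only a sketch: it assumes the list is indented, ends at a blank line, and has a host and an IP address on each line. The sample text (including the second server) and the parse_name_servers sub are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical whois text; the second server line is made up.
my $domaininfo = <<'END';
Name servers:
    ns.mistral.co.uk 195.184.229.229
    ns2.example.net 192.0.2.1

Some other section:
END

# Return [host, ip] pairs from the lines that follow "Name servers:".
sub parse_name_servers {
    my ($text) = @_;
    my @pairs;
    # Capture one or more consecutive non-blank lines after the heading.
    if ($text =~ /Name servers:[ \t]*\n((?:[ \t]*\S.*\n)+)/) {
        my $block = $1;
        # [ \t] rather than \s keeps each match on a single line.
        while ($block =~ /^[ \t]*(\S+)[ \t]+(\S+)/mg) {
            push @pairs, [$1, $2];
        }
    }
    return @pairs;
}

for my $pair (parse_name_servers($domaininfo)) {
    print "Server name = $pair->[0], IP Address = $pair->[1]\n";
}
```

In the inner loop the m modifier does matter, because ^ has to match at the start of every line in the captured block. Lines that hold only a host name are skipped by this pattern.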

Upvotes: 2
