user1039417

Reputation: 109

Perl regex multi-line match into hash

I'm successfully parsing a Cisco config file and grabbing the section of config between a search string and the next marker (Cisco uses the ! symbol) with a multi-line match built from the range operator:

/(search string)/i .. /^!/ 

My code looks like:

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

my (@results, @data) ;

#Test data to simulate a while loop on a file-handle running through a config file.
@data =  (
    "vlan 81" ,
    " name Vlan 81 test1" ,
    "!" ,
    "vlan 82" ,
    " name Vlan 82 test2" ,
    "!" ,
    "vlan 83" ,
    " name Vlan 83 test3" ,
    "!"
);

foreach ( @data ) {
    if ( /vlan/i .. /^!/ ) {
         push  (@results , $_) ;                
    }
}

print Dumper ( @results ) . "\n" ;

exit;

It works really well, but I want to push the results into a hash, with each section of config stored as an anonymous array, so the results would look something like:

%Vlan -> [Vlan 81, name Vlan 81 test1] , [Vlan 82, name Vlan 82 test2] , [Vlan 83, name Vlan 83 test3]

But I can't work out how to do it. My code matches line by line between the search string and the marker, so I just end up rebuilding the results into another array, one line at a time.

Any help is much appreciated.

Cheers,

Andy

Upvotes: 4

Views: 1617

Answers (3)

Greg Bacon

Reputation: 139711

Change the end of your program to:

my %Vlan;

for (@data) {
  if (my $inside = /vlan/i .. /^!/) {
    if ($inside =~ /E0$/) {
      s/^\s+//, s/\s+$// for @results;  # trim whitespace
      $Vlan{ $results[0] } = join ", ", @results;
      @results = ();
    }
    else {
      push @results, $_;
    }
  }
}

print Dumper \%Vlan;

The .. range operator returns a value that ends with "E0" when the right-hand condition is true, so we can use it as a cue for when to drop a new entry into %Vlan.

From the perlop documentation on the range operator: The value returned is either the empty string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string "E0" appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint.
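To see those return values directly, here is a minimal sketch that records what the range operator yields for each line. The sub name range_values and the trailing interface line are made up for illustration; they are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: return the raw flip-flop value for each input line.
sub range_values {
    my @lines = @_;
    my @values;
    for (@lines) {
        # Scalar context makes ".." act as a flip-flop rather than a list range.
        my $state = /vlan/i .. /^!/;
        push @values, $state;
    }
    return @values;
}

my @lines  = ("vlan 81", " name Vlan 81 test1", "!", "interface Gi0/1");
my @values = range_values(@lines);

for my $i (0 .. $#lines) {
    printf "%-22s => '%s'\n", $lines[$i], $values[$i];
}
# vlan 81                => '1'
#  name Vlan 81 test1    => '2'
# !                      => '3E0'
# interface Gi0/1        => ''
```

The last line falls outside any range, so the operator returns the empty string for it.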

Your end goal isn’t clear, but it seems that you want the hash values to be strings rather than arrays. Perl’s join creates a string by intercalating some separator between elements from a list of values. The code above removes leading and trailing whitespace from each value in @results before using them to populate %Vlan.

Output:

$VAR1 = {
          'vlan 81' => 'vlan 81, name Vlan 81 test1',
          'vlan 83' => 'vlan 83, name Vlan 83 test3',
          'vlan 82' => 'vlan 82, name Vlan 82 test2'
        };
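If you would rather have the anonymous arrays described in the question than joined strings, the same "E0" cue can be used to store an array reference per section. This is only a sketch under a few assumptions: the sub name sections_to_hash is hypothetical, the start pattern is anchored with ^ so that only lines beginning with "vlan" open a section, and every section is expected to end with a "!" line.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: collect each "vlan ... !" section as an array ref,
# keyed by the first line of the section.
sub sections_to_hash {
    my @lines = @_;
    my (%vlan, @section);
    for (@lines) {
        my $inside = /^vlan/i .. /^!/;    # assumption: sections start in column 0
        next unless $inside;
        if ($inside =~ /E0$/) {
            # End marker reached: store a copy of the collected lines.
            $vlan{ $section[0] } = [@section] if @section;
            @section = ();
        }
        else {
            (my $line = $_) =~ s/^\s+|\s+$//g;    # trim a copy of the line
            push @section, $line;
        }
    }
    return \%vlan;
}

my @data = (
    "vlan 81", " name Vlan 81 test1", "!",
    "vlan 82", " name Vlan 82 test2", "!",
);

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper(sections_to_hash(@data));
```

Each value is then an array reference such as ['vlan 81', 'name Vlan 81 test1'], so individual lines can be reached with $vlan->{'vlan 81'}[1].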

Upvotes: 3

Borodin

Reputation: 126772

I'm not sure what you mean about a hash, as the contents you describe are just a list of anonymous arrays. There are no keys so you can only produce an array. If you can explain which part of the data is to be the key then we can go for a hash.

The use warnings pragma is preferable to the -w shebang modifier because it is lexically scoped: it is more flexible and can be switched off for a single block with no warnings.
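As a small illustration of that flexibility, the sketch below silences one warning category inside a single sub while leaving warnings on everywhere else. The helper names (count_warnings, loud_join, quiet_join) are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code ref and count the warnings it triggers.
sub count_warnings {
    my ($code) = @_;
    my $count = 0;
    local $SIG{__WARN__} = sub { $count++ };
    $code->();
    return $count;
}

sub loud_join {
    my @parts = @_;
    return join ",", @parts;          # warns when an element is undef
}

sub quiet_join {
    my @parts = @_;
    no warnings 'uninitialized';      # silenced for this sub body only
    return join ",", @parts;
}

print "loud warned:  ", (count_warnings(sub { loud_join("a", undef) })  ? "yes" : "no"), "\n";
print "quiet warned: ", (count_warnings(sub { quiet_join("a", undef) }) ? "yes" : "no"), "\n";
```

With -w on the shebang line, warnings are enabled globally instead, which makes this kind of per-block control harder to express.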

The range operator .. may be cute, but you shouldn't squeeze it in wherever possible.

Setting the input record separator $/ to "!\n" lets you read each group of related lines in one go, and the group can then be pushed onto your array.

The code looks like this:

use strict;
use warnings;

use Data::Dumper;

my @Vlan;

$/ = "!\n";

while  (<DATA>) {
  chomp;
  push @Vlan, [split /[\r\n]+/];
}

print Data::Dumper->Dump([\@Vlan], ['*Vlan']);

__DATA__
vlan 81
name Vlan 81 test1
!
vlan 82
name Vlan 82 test2
!
vlan 83
name Vlan 83 test3
!

Output:

@Vlan = (
          [
            'vlan 81',
            'name Vlan 81 test1'
          ],
          [
            'vlan 82',
            'name Vlan 82 test2'
          ],
          [
            'vlan 83',
            'name Vlan 83 test3'
          ]
        );

EDIT

If the key of the hash is always the first line of the record set, then this program produces a hash as you requested (note that it keeps only the first two lines of each record):

use strict;
use warnings;

use Data::Dumper;

my %Vlan;

$/ = "!\n";

while  (<DATA>) {
  chomp;
  my ($k, $v) = split /[\r\n]+/;
  $Vlan{$k} = $v;
}

print Data::Dumper->Dump([\%Vlan], ['*Vlan']);

__DATA__
vlan 81
name Vlan 81 test1
!
vlan 82
name Vlan 82 test2
!
vlan 83
name Vlan 83 test3
!

Output:

%Vlan = (
          'vlan 81' => 'name Vlan 81 test1',
          'vlan 83' => 'name Vlan 83 test3',
          'vlan 82' => 'name Vlan 82 test2'
        );
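For records with more than two lines, or to get array values as the question describes, the record-separator approach can be extended to keep every remaining line. The following is a sketch only: read_records is a hypothetical name, the "mtu 9000" line is invented sample data, and an in-memory filehandle stands in for the real config file. It also assumes Unix line endings, since the separator "!\n" would not match "!\r\n".

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: read "!"-terminated records from a filehandle and
# return a hash of array refs keyed by the first line of each record.
sub read_records {
    my ($fh) = @_;
    local $/ = "!\n";    # restored automatically when the sub returns
    my %vlan;
    while (my $record = <$fh>) {
        chomp $record;   # strips the trailing "!\n" separator
        my @fields = split /[\r\n]+/, $record;
        s/^\s+//, s/\s+$// for @fields;         # trim each line
        @fields = grep { length } @fields;      # drop empty lines
        next unless @fields;
        my ($key, @rest) = @fields;
        $vlan{$key} = \@rest;
    }
    return \%vlan;
}

# Sample data; the "mtu 9000" line is made up for illustration.
my $config = "vlan 81\n name Vlan 81 test1\n!\nvlan 82\n name Vlan 82 test2\n mtu 9000\n!\n";
open my $fh, '<', \$config or die "Cannot open in-memory file: $!";

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper(read_records($fh));
```

Using local keeps the change to $/ from leaking into other code that reads files. One caveat is that any config line that happens to end in "!" would also end a record.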

Upvotes: 5

perreal

Reputation: 98118

This version keeps state from line to line instead of matching across multiple lines:

use strict;
use warnings;
use Data::Dumper;

my %Vlan;

# Test data to simulate a while loop on a file-handle running through a config file.
my @data = (
    "vlan 81" ,
    " name Vlan 81 test1" ,
    "!" ,
    "vlan 82" ,
    " name Vlan 82 test2" ,
    "!" ,
    "vlan 83" ,
    " name Vlan 83 test3" ,
    "!"
);

foreach ( @data ) {
    if (/ name (\w+ \d+) /) {
      my $name = lc $1;
      die("undef $name") if (not defined $Vlan{$name});
      $Vlan{$name} = [$name, $_];
    } elsif ( /^(\w+ \d+)$/ ) {
      my $name = lc $1;
      $Vlan{$name}++;
    }
}

print Dumper( \%Vlan ) . "\n";    # pass a reference so the hash is dumped as one structure

exit;
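The same state-keeping idea can be written so that it collects every line of a section, not just the name line. This is a rough sketch rather than a drop-in replacement: parse_with_state is a hypothetical name, and it assumes a section header looks like "vlan" followed by a number at the start of a line.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: remember which section we are in and append
# every following line to it until the "!" marker is seen.
sub parse_with_state {
    my @lines = @_;
    my (%vlan, $current);
    for my $line (@lines) {
        if ($line =~ /^(vlan\s+\d+)\s*$/i) {
            $current = lc $1;                  # new section starts here
            $vlan{$current} = [$current];
        }
        elsif ($line =~ /^!/) {
            undef $current;                    # marker closes the section
        }
        elsif (defined $current) {
            (my $trimmed = $line) =~ s/^\s+|\s+$//g;
            push @{ $vlan{$current} }, $trimmed;
        }
    }
    return \%vlan;
}

my @data = (
    "vlan 81", " name Vlan 81 test1", "!",
    "vlan 82", " name Vlan 82 test2", "!",
);

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper(parse_with_state(@data));
```

Because the current section is held in an ordinary variable, lines outside any section are simply skipped, and a missing "!" at the end of the input does not lose the last section.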

Upvotes: 2
