Reputation: 53
I have a file with a consistent format, but in a form that I haven't tried to read into a Perl program before.
The format of the file is this:
Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  30
Type                 IP
Status               InUse
Description          mpirpd-cjdn
Notes                mgmt
Entry-Id             000000000026450
Submitter            John Doe
Create-date          2009-07-01-13:55:24
Contact-Data         INTERNAL/555-555-5555
Contact-Id           CON-000028508

Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  32
Type                 IP
Status               InUse
Description          switch Lo0 -- switch unnamed
Notes                Reverved for Lan Management Loop Backs and links
Entry-Id             000000000032710
Submitter            John Doe
Create-date          2015-11-25-10:59:27
Last-modified-by     John Doe
Modified-date        2015-11-25-11:30:06
Contact-Data         INTERNAL/555-555-5555
Contact-Id           CON-000028508

Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  32
Type                 IP
Status               InUse
Description          mplsfe9-hub
Area                 mpls
Entry-Id             000000000024150
Submitter            Russ Reilly
Create-date          2007-05-02-18:26:20
Last-modified-by     John Doe
Modified-date        2013-05-06-19:09:37
Contact Name         ITG INTERNAL
Contact Phone        555-555-5555
Contact E-mail       me@home.com
Not all of the fields are always used (example: Contact Name and Contact Phone could be missing in the next record).
I don't necessarily need the field headings as they are consistently in the same location for each record.
I am sure this has been done before and probably has a simple solution, so I am asking before I reinvent the wheel.
Upvotes: 0
Views: 97
Reputation: 6553
I would recommend an array of hashes as the ideal data structure for the file you've presented.
We set the input record separator ($/) to '' to enable paragraph mode: each read returns one block of lines, and two or more consecutive empty lines are treated as a single empty line. Then, within each record, we just split each line by two or more spaces, which preserves your keys that contain spaces. The split is limited to 2 fields total to prevent additional fields from being created for values that contain two or more consecutive spaces (e.g., ITG INTERNAL).
use strict;
use warnings;

use Data::Dump;

local $/ = '';

my @data;

while (<DATA>) {
    chomp;
    next if $_ eq 'Net Number Assignments';

    my %record;

    for my $line (split(/\n/)) {
        my ($key, $value) = split(/\s\s+/, $line, 2);
        $record{$key} = $value;
    }

    push(@data, \%record);
}

dd(\@data);

__DATA__
Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  30
Type                 IP
Status               InUse
Description          mpirpd-cjdn
Notes                mgmt
Entry-Id             000000000026450
Submitter            John Doe
Create-date          2009-07-01-13:55:24
Contact-Data         INTERNAL/555-555-5555
Contact-Id           CON-000028508

Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  32
Type                 IP
Status               InUse
Description          switch Lo0 -- switch unnamed
Notes                Reverved for Lan Management Loop Backs and links
Entry-Id             000000000032710
Submitter            John Doe
Create-date          2015-11-25-10:59:27
Last-modified-by     John Doe
Modified-date        2015-11-25-11:30:06
Contact-Data         INTERNAL/555-555-5555
Contact-Id           CON-000028508

Net Number Assignments

Number               xxx.xxx.xxx.xxx
Netmask in /## Form  32
Type                 IP
Status               InUse
Description          mplsfe9-hub
Area                 mpls
Entry-Id             000000000024150
Submitter            Russ Reilly
Create-date          2007-05-02-18:26:20
Last-modified-by     John Doe
Modified-date        2013-05-06-19:09:37
Contact Name         ITG INTERNAL
Contact Phone        555-555-5555
Contact E-mail       me@home.com
Output:
[
  {
    "Contact-Data" => "INTERNAL/555-555-5555",
    "Contact-Id" => "CON-000028508",
    "Create-date" => "2009-07-01-13:55:24",
    "Description" => "mpirpd-cjdn",
    "Entry-Id" => "000000000026450",
    "Netmask in /## Form" => 30,
    "Notes" => "mgmt",
    "Number" => "xxx.xxx.xxx.xxx",
    "Status" => "InUse",
    "Submitter" => "John Doe",
    "Type" => "IP",
  },
  {
    "Contact-Data" => "INTERNAL/555-555-5555",
    "Contact-Id" => "CON-000028508",
    "Create-date" => "2015-11-25-10:59:27",
    "Description" => "switch Lo0 -- switch unnamed",
    "Entry-Id" => "000000000032710",
    "Last-modified-by" => "John Doe",
    "Modified-date" => "2015-11-25-11:30:06",
    "Netmask in /## Form" => 32,
    "Notes" => "Reverved for Lan Management Loop Backs and links",
    "Number" => "xxx.xxx.xxx.xxx",
    "Status" => "InUse",
    "Submitter" => "John Doe",
    "Type" => "IP",
  },
  {
    "Area" => "mpls",
    "Contact E-mail" => "me\@home.com",
    "Contact Name" => "ITG INTERNAL",
    "Contact Phone" => "555-555-5555",
    "Create-date" => "2007-05-02-18:26:20",
    "Description" => "mplsfe9-hub",
    "Entry-Id" => "000000000024150",
    "Last-modified-by" => "John Doe",
    "Modified-date" => "2013-05-06-19:09:37",
    "Netmask in /## Form" => 32,
    "Number" => "xxx.xxx.xxx.xxx",
    "Status" => "InUse",
    "Submitter" => "Russ Reilly",
    "Type" => "IP",
  },
]
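Once the records are in an array of hashes, pulling fields out is mostly a matter of grep and map. Here is a minimal sketch of that, assuming Perl 5.10 or later for the // operator. The two records are hand-written stand-ins shaped like the dump above, and the variable names @hosts and @contacts are mine, for illustration only.

```perl
use strict;
use warnings;

# Hand-written stand-ins shaped like the dumped records (illustration only).
my @data = (
    {
        "Number"              => "xxx.xxx.xxx.xxx",
        "Netmask in /## Form" => 30,
        "Description"         => "mpirpd-cjdn",
    },
    {
        "Number"              => "xxx.xxx.xxx.xxx",
        "Netmask in /## Form" => 32,
        "Description"         => "mplsfe9-hub",
        "Contact Name"        => "ITG INTERNAL",
    },
);

# Keep only the /32 records and collect their descriptions.
my @hosts = map  { $_->{"Description"} }
            grep { $_->{"Netmask in /## Form"} == 32 } @data;

# Optional fields may be missing, so fall back to a default with //.
my @contacts = map { $_->{"Contact Name"} // 'n/a' } @data;

print "Host routes: @hosts\n";
print "Contacts: @contacts\n";
```

Because missing fields simply do not exist as keys, a check like exists $record->{"Contact Name"} or a // default is usually enough to cope with records that omit them.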
Upvotes: 2
Reputation: 86774
This is conceptually simple but somewhat tedious. The canonical version of this type of parsing solution looks like this:
#!/usr/bin/perl
use strict;
use warnings;

my $all = {};   # A hash to hold all number entries indexed by IP
my $cur = {};   # A hash to hold the current entry we are parsing

while (<>)
{
    chomp;
    if (my ($ip) = /^Number\s+(.*)/)
    {
        # If we have a current entry, save it in the $all hash
        $all->{$cur->{number}} = $cur if ($cur->{number});
        $cur = { number => $ip };
    }
    elsif (my ($mask) = /^Netmask in \/## Form\s+(\d+)/)
    {
        $cur->{mask} = $mask;
    }
    # Add further elsif branches here to handle the remaining input
    # line types, saving what you want in $cur
}

# This is to save the last entry
$all->{$cur->{number}} = $cur if ($cur->{number});

# Your code to process the accumulated entries goes here
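If writing one elsif per field gets tedious, the same idea can be driven from a lookup table. This is only a sketch of a variation, not the code above: the %key_for mapping, the short key names, the parse_lines helper, and the 10.0.0.x addresses are all made up for illustration. It also registers each entry in the result hash as soon as its Number line is seen, rather than when the next one starts, so no "save the last entry" step is needed.

```perl
use strict;
use warnings;

# Hypothetical mapping from input labels to the short keys we want to keep.
my %key_for = (
    'Number'              => 'number',
    'Netmask in /## Form' => 'mask',
    'Description'         => 'desc',
    'Status'              => 'status',
);

# Build one alternation of the known labels, longest first, so a longer
# label is tried before any shorter label that shares its prefix.
my $labels = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %key_for;

sub parse_lines {
    my @lines = @_;
    my (%all, $cur);
    for my $line (@lines) {
        # Skip headings and any label we did not list in %key_for.
        next unless $line =~ /^($labels)\s+(.*\S)/;
        my ($key, $value) = ($key_for{$1}, $2);
        if ($key eq 'number') {
            # A Number line starts a new entry, indexed by its IP.
            $cur = $all{$value} = {};
        }
        $cur->{$key} = $value if $cur;
    }
    return \%all;
}

# Made-up input lines; the real file would be read with while (<>).
my $all = parse_lines(
    'Net Number Assignments',
    'Number 10.0.0.1',
    'Netmask in /## Form 30',
    'Description mpirpd-cjdn',
    'Number 10.0.0.2',
    'Netmask in /## Form 32',
    'Status InUse',
);

print "$_ => /$all->{$_}{mask}\n" for sort keys %$all;
```

Adding a field then means adding one line to %key_for instead of another branch.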
Upvotes: 1