vkk05

Reputation: 3222

Parsing a file in Perl and storing the data in a hash

I am reading an input file and storing the data in a hash. Later I want to print the hash contents to a CSV file.

Here is the script:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my %hash;

while(<DATA>){
    chomp;
    
    my ($e_id, $start, $end, $priority, $node);
    
    next unless /\S/;
    
    my ($key, $val) = split /\s*:\s*/;
    
    if($key =~ /eventId/)  { $e_id     = $val; }
    if($key =~ /startTime/){ $start    = $val; }
    if($key =~ /endTime/)  { $end      = $val; }
    if($key =~ /Node/)     { $node     = $val; }
    if($key =~ /Priority/) { $priority = $val; }
    
    $hash{$e_id}{'node'}     = $node;
    $hash{$e_id}{'start'}    = $start;
    $hash{$e_id}{'end'}      = $end;
    $hash{$e_id}{'priority'} = $priority;
}

print Dumper(\%hash);

__DATA__
Priority : High
Node : Node1
startTime : 2020-08-18T03:40:00
endTime : 2020-08-18T03:45:00
eventId : 150
Text : This is for Node1 text
eventPlace : Router1

Priority : Medium
Node : Node2
startTime : 2020-08-19T00:00:10
endTime : 2020-08-19T00:00:40
eventId : 170
Text : This is for Node2 text
eventPlace : Router2

But the hash is not printed as expected. The hash's primary key should be $e_id, and the secondary keys should be node, start, end and priority, with values fetched from the file for the respective eventId.

I want to print hash like this:

$VAR1 = { '150' => {
                     'end' => 2020-08-18T03:45:00,
                     'priority' => High,
                     'start' => 2020-08-18T03:40:00,
                     'node' => Node1
                   },
          '170' => {
                     'end' => 2020-08-19T00:00:40,
                     'priority' => Medium,
                     'start' => 2020-08-19T00:00:10,
                     'node' => Node2
                   }
};

How can I do that? Also, please suggest a suitable approach to reading the file (I suspect I am doing something wrong), because the script throws this warning: Use of uninitialized value $e_id in hash element at a.pl line .., <DATA> line ..

Upvotes: 0

Views: 145

Answers (4)

ikegami

Reputation: 385565

You are creating these variables anew for each line of the file:

$e_id, $start, $end, $priority, $node

They can't be scoped to a loop that repeats for every line of the file if you want to access the values when processing later lines.

Furthermore, you assign to the fields of the record for each line of the file, even before $e_id has been populated. You don't want to assign to every field for each line of the file; you need to wait until you've read an entire record before assigning to $hash{$e_id}.
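The scoping point can be seen in isolation with a small sketch. The variable names and the two-element input list here are made up for illustration; only the placement of my matters.

```perl
use strict;
use warnings;

my (@inside_state, @outside_state);

# Declared outside the loop: the value set in one iteration is still
# there at the start of the next one.
my $outside;

for my $line ('Node : Node1', 'eventId : 150') {
    # Declared inside the loop: a fresh, undefined variable every iteration.
    my $inside;

    push @inside_state,  defined $inside  ? 'defined' : 'undef';
    push @outside_state, defined $outside ? 'defined' : 'undef';

    $inside  = $line;
    $outside = $line;
}

print "inside:  @inside_state\n";
print "outside: @outside_state\n";
```

The inner variable is undefined at the top of both iterations, while the outer one is defined on the second pass. This is why $node and the other fields are gone by the time the eventId line is read in the original script.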

My solution:

my %field_map = (
   'startTime' => 'start',
   'endTime'   => 'end',
   'Node'      => 'node',
   'Priority'  => 'priority',
);

my %recs;
my $id;
my $rec = { };
while (1) {
    $_ = <DATA>;

    # If end of file or end of record.
    if (!defined($_) || $_ =~ /^$/) {
        $recs{$id} = $rec if defined($id);

        # If end of file.
        last if !defined($_);

        # Start a new record.
        $id = undef;
        $rec = { };
        next;
    }

    chomp;
    my ($key, $val) = split(/\s*:\s*/, $_, 2);

    if ( $key eq 'eventId' ) {
       $id = $val;
    }
    elsif ( $field_map{$key} ) {
       $rec->{ $field_map{$key} } = $val;
    }
}
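The question also mentions writing the hash to a CSV file later. A minimal sketch of that step is given here, assuming a hash shaped like %recs above and assuming that no value contains a comma, quote or newline (otherwise a CSV module such as Text::CSV from CPAN would be the safer choice). The sample data is hard-coded for illustration, and the column order is an arbitrary choice.

```perl
use strict;
use warnings;

# Sample data in the shape produced by the parsing loop (hard-coded here
# so the sketch runs on its own).
my %recs = (
    150 => { node => 'Node1', start => '2020-08-18T03:40:00',
             end  => '2020-08-18T03:45:00', priority => 'High' },
    170 => { node => 'Node2', start => '2020-08-19T00:00:10',
             end  => '2020-08-19T00:00:40', priority => 'Medium' },
);

# Column order for the CSV (an assumption, pick whatever order is needed).
my @columns = qw(node start end priority);

# Header row first, then one row per event, sorted numerically by event ID.
my @lines = ( join(',', 'eventId', @columns) );
for my $id (sort { $a <=> $b } keys %recs) {
    # Hash slice: fetch the values for all columns in one go.
    push @lines, join(',', $id, @{ $recs{$id} }{@columns});
}

# Prints to STDOUT; redirect it or print to an opened filehandle instead.
print "$_\n" for @lines;
```

Sorting the keys keeps the output order stable, since Perl hashes have no defined order.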

Upvotes: 2

Polar Bear

Reputation: 6798

Perl code algorithm

  • fill @records with input data by redefining $/ = "\n\n"
  • for each record split it into a %hash
  • remap %hash fields into %data hash to match desired output
  • fill %events hash

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my @records = do { local $/ = "\n\n"; <DATA> };   # local restores $/ after the block
my %events;

for ( @records ) {
    my(%hash,%data);
    %hash = split " : |\n";
    @data{qw/node priority start end/} = @hash{qw/Node Priority startTime endTime/};
    $events{$hash{eventId}} = \%data;
}

say Dumper(\%events);

__DATA__
Priority : High
Node : Node1
startTime : 2020-08-18T03:40:00
endTime : 2020-08-18T03:45:00
eventId : 150
Text : This is for Node1 text
eventPlace : Router1

Priority : Medium
Node : Node2
startTime : 2020-08-19T00:00:10
endTime : 2020-08-19T00:00:40
eventId : 170
Text : This is for Node2 text
eventPlace : Router2

Output

$VAR1 = {
          '170' => {
                     'start' => '2020-08-19T00:00:10',
                     'end' => '2020-08-19T00:00:40',
                     'node' => 'Node2',
                     'priority' => 'Medium'
                   },
          '150' => {
                     'node' => 'Node1',
                     'priority' => 'High',
                     'end' => '2020-08-18T03:45:00',
                     'start' => '2020-08-18T03:40:00'
                   }
        };

Upvotes: 1

TLP

Reputation: 67900

Hard coding the names of the entries in your file is not necessary. You can read the entire entry into a hash right away as you read the file, using a very simple loop. This is assuming each record is separated by a blank line.

use strict;
use warnings;
use Data::Dumper;

$/ = "";
my %data;

while(<DATA>) {
    my $rec = { split /\n| : /, $_ };
    $data{$rec->{eventId}} = $rec;
}
print Dumper \%data;


__DATA__
Priority : High
Node : Node1
startTime : 2020-08-18T03:40:00
endTime : 2020-08-18T03:45:00
eventId : 150
Text : This is for Node1 text
eventPlace : Router1

Priority : Medium
Node : Node2
startTime : 2020-08-19T00:00:10
endTime : 2020-08-19T00:00:40
eventId : 170
Text : This is for Node2 text
eventPlace : Router2

This will print:

$VAR1 = {
          '170' => {
                     'endTime' => '2020-08-19T00:00:40',
                     'eventPlace' => 'Router2',
                     'startTime' => '2020-08-19T00:00:10',
                     'Node' => 'Node2',
                     'Priority' => 'Medium',
                     'eventId' => '170',
                     'Text' => 'This is for Node2 text'
                   },
          '150' => {
                     'endTime' => '2020-08-18T03:45:00',
                     'eventPlace' => 'Router1',
                     'startTime' => '2020-08-18T03:40:00',
                     'Node' => 'Node1',
                     'Priority' => 'High',
                     'eventId' => '150',
                     'Text' => 'This is for Node1 text'
                   }
        };

Upvotes: 3

choroba

Reputation: 241758

If you want to use the variables like $node when reading a different line, you need to declare them outside the while loop. Otherwise, the my declaration clears the values from the previous lines. Just move the my line before the while one.

Also, you only want to populate the hash once the information is complete. Wrap the assignments to $hash{$e_id} in

if ($key eq 'eventPlace') {
     ...
}
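Put together, the two changes might look like the sketch below. It assumes that eventPlace is always the last field of a record, as in the sample data. The input is inlined as a string and read through an in-memory filehandle only to keep the example self-contained. The limit of 2 passed to split is an addition to what is described above: without it, the colons inside the timestamps would cut the value short.

```perl
use strict;
use warnings;

# Sample input from the question, inlined so the sketch runs on its own.
my $input = <<'END';
Priority : High
Node : Node1
startTime : 2020-08-18T03:40:00
endTime : 2020-08-18T03:45:00
eventId : 150
Text : This is for Node1 text
eventPlace : Router1

Priority : Medium
Node : Node2
startTime : 2020-08-19T00:00:10
endTime : 2020-08-19T00:00:40
eventId : 170
Text : This is for Node2 text
eventPlace : Router2
END

open my $fh, '<', \$input or die "Cannot open in-memory input: $!";

my %hash;

# Declared before the loop so the values survive from one line to the next.
my ($e_id, $start, $end, $priority, $node);

while (my $line = <$fh>) {
    chomp $line;
    next unless $line =~ /\S/;

    # Limit of 2: split on the first colon only, keeping timestamps whole.
    my ($key, $val) = split /\s*:\s*/, $line, 2;

    if    ($key eq 'eventId')   { $e_id     = $val }
    elsif ($key eq 'startTime') { $start    = $val }
    elsif ($key eq 'endTime')   { $end      = $val }
    elsif ($key eq 'Node')      { $node     = $val }
    elsif ($key eq 'Priority')  { $priority = $val }
    elsif ($key eq 'eventPlace') {
        # Assumption: eventPlace closes the record, so everything is known now.
        $hash{$e_id} = {
            node     => $node,
            start    => $start,
            end      => $end,
            priority => $priority,
        } if defined $e_id;

        # Reset so values cannot leak into the next record.
        ($e_id, $start, $end, $priority, $node) = ();
    }
}
close $fh;
```

If the order of fields within a record is not guaranteed, detecting the blank line between records is the more robust signal that a record is complete.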

Upvotes: 2
