programminglearner

Reputation: 542

Perl Parsing Without Package

I have a CSV file containing data that I would like to parse and store in some data structure so that I can print it to the screen. I can't install any packages or modules that aren't pre-installed. I am familiar with the Text::CSV module but cannot use it, so I have to do this manually.

The data looks like this:

Name,Age,Weight,Target  
April,     23,    134,    90  
Jenna,     45,    156,    90  
Matt,      12,    90,     90  
Aaron,     34,    190,    90  
Daniel,    22,    188,    90  

Here is what I have so far, but it simply stores all the data into an array and prints it out.

use strict;
use warnings;
use Data::Dumper;

my $file = "file.csv";

my %people;
my @data;

open my $fh, '<', $file or die "Could not open $file: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @fields = split(/,/, $line);
    push @data, @fields;
}
close $fh;

print join(", ", @data);

This gives an output like:

Name, Age, Weight, Target, April        ,          23,       134,     90, 

The spacing is due to the CSV columns being spaced out. The header line has no spaces. I would like a more organized way of storing each column's values and then printing them out on the screen.

Upvotes: 1

Views: 154

Answers (3)

zdim

Reputation: 66964

my work is very strict about using anything that is not pre-installed.

Ah well. There's a lot that can be said about that, some of it mentioned in comments. But I'll leave it there, since the question is quite clear and articulate on that point.

If your data always looks as shown then things are easy. But I suggest also adding code that checks for gremlins in your data, things that would throw off manual parsing; a pre-processing check of sorts, so that you get warned when that happens.
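As a rough idea of what such a check could look like, here is a minimal sketch. The helper name check_line, the expected field count of 4, and the particular things it looks for (quote characters, wrong field count, empty fields) are all my assumptions; adjust them to whatever can actually go wrong in your files.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems found in one CSV line.
# Assumes clean data never contains quote characters.
sub check_line {
    my ($line, $expected) = @_;
    my @problems;

    # A quote usually means a quoted field, which a plain split can't handle
    push @problems, "contains a quote character" if $line =~ /"/;

    # The -1 limit keeps trailing empty fields so they get counted
    my @fields = split /\s*,\s*/, $line, -1;
    push @problems, "expected $expected fields, got " . scalar(@fields)
        if @fields != $expected;

    push @problems, "has an empty field" if grep { $_ eq '' } @fields;

    return @problems;
}

for my $line ('April,     23,    134,    90',
              'Jenna, "45, 46", 156, 90',
              'Matt,,90,90') {
    my @problems = check_line($line, 4);
    print @problems ? "BAD  $line: @problems\n" : "ok   $line\n";
}
```

Running every line through something like this before the real parsing lets you warn or die early instead of silently printing a mangled table.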

Having said that, and with a nice use of formats in another answer, I'd like to comment on the code.

The problem is that the line

push @data, @fields;

evaluates @fields into a list of its elements and then adds those elements to the array -- it does not somehow "add the array" @fields as a single entity, which I presume is what you expected. So as it keeps going through lines it keeps building that loooong array, with all data in one long flat list.

Instead, add a reference to the @fields array

while (my $line = <$fh>) {
    chomp $line;
    my @fields = split /\s*,\s*/, $line;
    push @data, \@fields;
}

where I've also pruned spaces, once we're at it. (The CSV shouldn't have them at all, actually.)
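To see the difference concretely, here is a small self-contained sketch with two lines hard-coded in place of the file (the names @lines, @flat and @rows are just for illustration):

```perl
use strict;
use warnings;

my @lines = ('April,     23,    134,    90',
             'Jenna,     45,    156,    90');

my (@flat, @rows);
for my $line (@lines) {
    my @fields = split /\s*,\s*/, $line;
    push @flat, @fields;     # appends the elements: one long flat array
    push @rows, \@fields;    # appends a single array reference per line
}

print "flat has ", scalar(@flat), " elements\n";   # flat has 8 elements
print "rows has ", scalar(@rows), " elements\n";   # rows has 2 elements
print "Jenna's weight: $rows[1][2]\n";             # Jenna's weight: 156
```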

Here we can nicely just take a reference to @fields because it's declared anew for each iteration. If it were declared elsewhere and merely overwritten in each iteration, then you'd have to copy it (into an anonymous array) instead

while (my $line = <$fh>) {
    chomp $line;
    @fields = split /\s*,\s*/, $line;   # if @fields is declared outside
    push @data,  [ @fields ];
}

or you'd end up with the same reference for all elements of @data.
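A quick sketch of that pitfall, with made-up two-field lines so the effect is easy to see (@shared and @copied are illustrative names):

```perl
use strict;
use warnings;

my @lines = ('April,23', 'Jenna,45');
my (@shared, @copied, @fields);       # @fields declared outside the loop

for my $line (@lines) {
    @fields = split /,/, $line;       # overwrites the same array each time
    push @shared, \@fields;           # always the same reference
    push @copied, [ @fields ];        # fresh anonymous array per line
}

print "shared: $shared[0][0], $shared[1][0]\n";   # shared: Jenna, Jenna
print "copied: $copied[0][0], $copied[1][0]\n";   # copied: April, Jenna
```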

Now elements of @data are references to rows and can be processed individually. For example

use List::Util qw(max);  

my $max_name_wt = max map { length $_->[0] } @data;

printf "%${max_name_wt}s %6s %6s %6s\n", @{ shift @data };  # headers

foreach my $row (@data) {
    printf "%${max_name_wt}s %6d %6d %6d\n", @$row;
}

This assumes that the numbers are all integers with at most 6 digits. It also assumes that no fields are missing; otherwise the resulting undef values would draw warnings in printf. List::Util is a core module.
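If fields can be missing, one option is to pad each row before printing. This is only a sketch: pad_row and the '-' placeholder are made up for illustration, and it prints with %s rather than %d because the placeholder isn't a number.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the row with exactly $n fields,
# replacing missing or empty ones with '-'
sub pad_row {
    my ($row, $n) = @_;
    my @out;
    for my $i (0 .. $n - 1) {
        my $v = $row->[$i];
        push @out, (defined $v && length $v) ? $v : '-';
    }
    return \@out;
}

my $row = pad_row([ 'Matt', 12 ], 4);   # weight and target are missing
printf "%-8s %6s %6s %6s\n", @$row;
```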

There are simpler ways to print complex data structures; see core Data::Dumper.
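For instance, a quick dump of an array of row references might look like this (two rows hard-coded here instead of reading the file):

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 1;   # more compact indentation than the default

my @data = (
    [ qw(Name Age Weight Target) ],
    [ qw(April 23 134 90) ],
);

print Dumper(\@data);        # pass a reference to get one nested structure
```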

Upvotes: 3

Polar Bear

Reputation: 6818

The OP does not seem to have a full understanding of complex data structures.

Please see the code below, which fills a hash with the data. Once stored this way, the data can be manipulated in any way you like.

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $debug = 1;                          # debug flag

my %people;                             # store people's data

while(<DATA>){
    next if /^\s*$/;                    # skip empty lines
    next if /Name\,Age/;                # skip header
    s/\s+//g;                           # remove spaces
    my @data = split ',';               # obtain data
    my %param;                          # temp hash 
    @param{qw/age weight target/} = @data[1..3];
    $people{$data[0]} = \%param;        # store param hash reference
}

say Dumper(\%people) if $debug;

$~ = 'STDOUT_HEADER';
write;
$~ = 'STDOUT';

my($person,$data);

while( ($person,$data) = each %people ) {
    write;
}

$~ = 'STDOUT_FOOTER';
write;

format STDOUT_HEADER =
+--------------+-----+--------+--------+
| Name         | Age | Weight | Target |
+--------------+-----+--------+--------+
.

format STDOUT =
| @<<<<<<<<<<< | @>> |   @>>> |    @>> |
$person, $data->{age}, $data->{weight}, $data->{target}
.

format STDOUT_FOOTER =
+--------------+-----+--------+--------+
.

__DATA__
Name,Age,Weight,Target  
April,     23,    134,    90  
Jenna,     45,    156,    90  
Matt,      12,    90,     90  
Aaron,     34,    190,    90  
Daniel,    22,    188,    90

Output

$VAR1 = {
          'Daniel' => {
                        'weight' => '188',
                        'age' => '22',
                        'target' => '90'
                      },
          'April' => {
                       'target' => '90',
                       'age' => '23',
                       'weight' => '134'
                     },
          'Aaron' => {
                       'target' => '90',
                       'age' => '34',
                       'weight' => '190'
                     },
          'Matt' => {
                      'weight' => '90',
                      'age' => '12',
                      'target' => '90'
                    },
          'Jenna' => {
                       'target' => '90',
                       'age' => '45',
                       'weight' => '156'
                     }
        };


+--------------+-----+--------+--------+
| Name         | Age | Weight | Target |
+--------------+-----+--------+--------+
| Aaron        |  34 |    190 |     90 |
| Jenna        |  45 |    156 |     90 |
| Daniel       |  22 |    188 |     90 |
| Matt         |  12 |     90 |     90 |
| April        |  23 |    134 |     90 |
+--------------+-----+--------+--------+
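Note that the rows in the table come out in whatever order each returns them, which is not the order of the file. If you want a predictable order, sort the keys first. Here is a small sketch with three people hard-coded and printf used in place of the formats to keep it short; the choice to sort by age, oldest first, is just an example.

```perl
use strict;
use warnings;

# Same shape as %people above, with three rows hard-coded
my %people = (
    April => { age => 23, weight => 134, target => 90 },
    Aaron => { age => 34, weight => 190, target => 90 },
    Matt  => { age => 12, weight => 90,  target => 90 },
);

# Sort the names by age, oldest first
my @by_age = sort { $people{$b}{age} <=> $people{$a}{age} } keys %people;

for my $name (@by_age) {
    # Hash slice through the reference pulls the three values in a fixed order
    printf "| %-12s | %3d | %6d | %6d |\n",
        $name, @{ $people{$name} }{qw(age weight target)};
}
```

Using sort keys %people instead would give plain alphabetical order by name.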

Upvotes: 0

Izya Budman

Reputation: 107

If nicely formatted output on screen is all you need and every row in your file has the same fields, try this:

#!/usr/bin/perl

use strict;
use warnings;

# Three-argument open with a lexical filehandle; $! reports why it failed
open(my $csv, '<', 'file.csv') or die "Can't open input file: $!\n";
my ($name, $age, $weight, $target);
format STDOUT =
@<<<<<<<<<@<<<<<<<<<@<<<<<<<<<@<<<<<<<<<
$name,    $age,     $weight,  $target
.
while ( my $line = <$csv> ) {
    chomp($line);
    ($name, $age, $weight, $target) = split(/,\s*/, $line);
    write;    # prints one row using the STDOUT format above
}
close($csv);

output:

$ ./parse_csv.pl 
Name      Age       Weight    Target
April     23        134       90
Jenna     45        156       90
Matt      12        90        90
Aaron     34        190       90
Daniel    22        188       90

Upvotes: 1
