Eugen Konkov

Reputation: 25207

How to use pseudo hash in modern perl?

I have data loaded in the following format (probably loaded from a .csv file):

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

I want to access data like:

$data->[$n]{ name }

I know that in old Perl I could use a pseudo-hash (phash), but that feature was deprecated and replaced by the fields pragma.

As far as I can see, fields is used for objects. In my case I do not create objects and do not use classes.

How should I use fields in my use case? Please provide an example.

Upvotes: 2

Views: 294

Answers (3)

brian d foy

Reputation: 132858

If you can use Text::CSV_XS as Shawn shows, do that.

We have an exercise in Intermediate Perl that does this. Use the first row to map the header names to position, then use that hash to translate the other way. Here it is with heavy use of postfix dereferencing:

use v5.24;

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

# ikegami's suggestion
my %name_to_index = map { $data->[0][$_] => $_ } 0 .. $data->[0]->$#*;

foreach my $i ( 1 .. $data->$#* ) {
    say $data->[$i][ $name_to_index{name} ]
    }

Here's the circumfix version that works with versions before v5.24, but I think it's uglier (and you have to use a six-year-old, unsupported Perl):

use v5.10;

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

# ikegami's suggestion
my %name_to_index = map { $data->[0][$_] => $_ } 0 .. $#{ $data->[0] };

foreach my $i ( 1 .. $#{ $data } ) {
    say $data->[$i][ $name_to_index{name} ]
    }

Since the real code is probably much more complex, I think it's often easier to understand when you don't drill down into the data structure everywhere. If you don't mind the extra work (as you might in a hot loop), you can turn each row into a hash whose keys are the headers (similar to what Text::CSV_XS does), then work with that hash without thinking about the entire chain of dereferences. This example uses a hash slice to populate everything at once. After that you work with %hash instead of $data->[$i][...]:

use v5.24;

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

my @headers = $data->[0]->@*;
foreach my $i ( 1 .. $data->$#* ) {
    my %hash;
    @hash{ @headers } = $data->[$i]->@*;

    say $hash{name};
    }

Curiously, right after the pseudo-hash section in perlref, the docs show an example with function templates. Instead of a hash to do the mapping, you can define subroutines. Some people like that the name of the header index looks a little cleaner, but I don't think this is worth the extra explanation of the strict 'refs' violation and of typeglobs:

use v5.24;

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

foreach my $name ( $data->[0]->@* ) {
    state $n = 0;
    my $m = $n++;      # don't reference $n
    no strict 'refs';  # Hey there!
    *{uc $name} = sub () { $m }; # runtime sub definition
    }

foreach my $i ( 1 .. $data->$#* ) {
    say $data->[$i][ NAME() ]
    }

Upvotes: 4

Polar Bear

Reputation: 6798

The $data structure requires conversion into an array of hashes.

The first row of $data holds the field names; they will be used as hash keys.

Next we need to:

  • walk through the other rows of the array

  • create a hash from the keys and the row's data

  • push the hash into a new array

When all data is processed, return $new, a reference to the final array.

At this point elements of the $new array reference can be accessed as $new->[0]{id}.

The following code demonstrates how the desired data storage can be achieved.

NOTE: this code does not rely on new Perl features and should produce the desired result even on systems 20 years old (where a Perl upgrade is impossible for a valid reason).

The last for loop demonstrates printing all elements of the $new array reference.

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $debug = 0;                             # debug = 1 - Debug mode ON

my $data =  [
   [ 'id', 'name', 'value' ],
   [   23,  'foo',      77 ],
   [   44,  'bar',   'dfd' ],
];

my $new = convert($data);

say Dumper($new) if $debug;

for ( 0..$#{$new} ) {                      # walk through result array
    say '-' x 35;
    say 'Id:    ' . $new->[$_]{id};
    say 'Name:  ' . $new->[$_]{name};
    say 'Value: ' . $new->[$_]{value};
    say '-' x 35;
}

sub convert {
    my $data = shift;
    
    my @fields = @{$data->[0]};            # used as hash key
    
    my @data;

    for ( 1..$#{$data} ) {                # walk through $data starting from index=1
        my %hash;
        @hash{@fields} = @{$data->[$_]};  # store data in hash
        push @data, \%hash;               # store hash into array
    }
    
    return \@data;                        # return reference to array of hashes
}

Output

-----------------------------------
Id:    23
Name:  foo
Value: 77
-----------------------------------
-----------------------------------
Id:    44
Name:  bar
Value: dfd
-----------------------------------

Upvotes: 0

Shawn

Reputation: 52529

Using the Text::CSV_XS module to read your CSV data, and telling it what the column names are based on the first line:

#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({binary => 1, auto_diag => 1});
$csv->column_names($csv->getline(\*DATA));
my $data = $csv->getline_hr_all(\*DATA);
say $data->[0]->{'name'}; # prints foo

__DATA__
id,name,value
23,foo,77
44,bar,dfd

Consider looping over the records instead of reading the entire file at once, though. See the documentation for getline_hr for a couple of ways to do it.
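As a minimal sketch of that row-by-row approach (assuming Text::CSV_XS is installed; an in-memory filehandle stands in here for the __DATA__ section, and the printed format is just for illustration):

```perl
use strict;
use warnings;
use feature 'say';
use Text::CSV_XS;

# The same CSV content as above, held in a string so we can open
# an in-memory filehandle on it.
my $csv_text = "id,name,value\n23,foo,77\n44,bar,dfd\n";
open my $fh, '<', \$csv_text or die $!;

my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
$csv->column_names($csv->getline($fh));    # first line holds the headers

my @names;
while (my $row = $csv->getline_hr($fh)) {  # one hashref per record
    push @names, $row->{name};
    say "$row->{id}: $row->{name} = $row->{value}";
}
# prints "23: foo = 77" then "44: bar = dfd"
```

This way only one record is in memory at a time, which matters for large files.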

Upvotes: 6
