drvtiny
drvtiny

Reputation: 715

Is there a way to implement statically-defined associative arrays in Perl 5?

If I have some hash with the (absolutely) static set of keys, can I avoid calculating of hash functions on each access to elements of this hash by key at run time?

Say, I have a static associative array %saa with the keys:

my %saa = (
 A => "aaa",
 B => "bbb",
 C => "ccc"
);

How to pre-calculate hash function values for these keys in compile time and use it in run time in such effective manner as I did when using indexed access to simple lists elements.

As I know for now, in perl5 it may be implemented only as emulation of hashes using lists (use constant A=>0, B=>1...; $saa[A]="ddd").

Maybe there is more straightforward way to implement static associative arrays (also called "named tuples", not hashes) in Perl ?

For example, that's how the desired feature was implemented in Crystal language.

Upvotes: 1

Views: 448

Answers (3)

amon
amon

Reputation: 57640

Perl already precomputes the hash of constant keys. The hash code is then retrieved at runtime. Additionally, hash keys are constant-pooled so that even if the hash collides, the string equality is simplified to a pointer comparison.

Therefore, the extra runtime cost of of a constant hash key lookup compared to a constant array index lookup is very low, especially compared to all the overhead that is incurred anyway due to the Perl data model.

How does this work?

During the optimization phase, the S_maybe_multideref() function in op.c folds consecutive dereference operations like $x{foo}[0]{bar} into a single opcode. Scalars for constant keys are set aside so that they are just a pointer lookup away. These scalars are prepared by the S_check_hash_fields_and_hekify() function, which creates the constant-pooled scalar using the newSVpvn_share() function which (after a few levels of macros) uses the S_share_hek_flags() function from hv.c which creates or fetches a hash entry key (HEK) structure in a global constant pool (which itself is a hash table).

During runtime, the hash lookup function notices that the key scalar is in fact a shared HEK, and directly retrieves the hash. This short-circuits most of the hash table lookup code in the Perl_hv_common() function.

This only works when using constant strings for hash keys. Otherwise, you still have to do the complete hash lookup.

Using constant array keys like $struct[KEY_FOO] in place of constant hash keys $hash{foo} is therefore not going to give you a noticeable performance benefit. Instead, the main benefit of the array approach is that you have to declare all constants up front, thus giving you some amount of static checking. This can e.g. detect typos. Note that hashes also support some checks:

  • Using the fields pragma, you can statically restrict the hash keys that can be accessed through a variable. E.g. my SomeClass $object only allows $object->{foo} if the SomeClass class has declared a foo field. However, this is a very arcane, misleading, and discouraged feature of Perl. Please do not use it.

  • Using “restricted hashes”, you can freeze the keys of a hash table (see lock_keys() in Hash::Util). This is a runtime check. This is the best approach if you want to get the typo-detection benefits. E.g. we could declare a kind of struct like:

    use Hash::Util qw(lock_keys);
    my %struct;
    lock_keys(%struct, qw(foo bar));
    
    $struct{foo};  # OK
    $struct{bar};  # OK
    $struct{qux};  # runtime error
    

Upvotes: 3

Sinan Ünür
Sinan Ünür

Reputation: 118148

The closest thing to the functionality described in struct NamedTuple(**T) is Const::Fast.

#!/usr/bin/env perl

use strict;
use warnings;

use Const::Fast qw( const );

my %language;

BEGIN {
    const %language => (
        name =>'Crystal',
        year => '2011',
    );
}

print "$language{$_}\n" for qw( name year other );
$ perl t.pl
Crystal
2011
Attempt to access disallowed key 'other' in a restricted hash at t.pl line 17.

BUT the check happens at run time. See Neil Bowers' excellent review for performance information. The stats may have changed since then, but it should give you an idea.

An alternative is not to use hashes at all, but use Class::XSAccessor to define a simple class:

#!/usr/bin/env perl

use strict;
use warnings;

package My::Language;
use Class::XSAccessor (
    constructor => 'new',
    getters => {
        name => 'name',
        year => 'year',
    },
);

package main;

my $language;

BEGIN {
    $language = My::Language->new(
        name =>'Crystal',
        year => '2011',
    );
}

print $language->$_, "\n" for qw( name year other );

Class::XSAccessor::Array should be about 10%-15% faster:

#!/usr/bin/env perl

use strict;
use warnings;

package My::Language;
use Class::XSAccessor::Array (
    getters => {
        name => 0,
        year => 1,
    },
);

package main;

my $language;

BEGIN {
    $language = bless ['Crystal', '2011'] => 'My::Language';
}

print $language->$_, "\n" for qw( name year other );

In all cases, the checks happen at run time.

Upvotes: 4

mob
mob

Reputation: 118635

Since this is Perl, there are a number of ways to at least sort of do what you want. Here is a tied hash implementation that does not do any hash lookups:

sub DrvtinyStaticHash::TIEHASH {
    my ($pkg, $A, $B, $C) = @_;
    bless [ $A, $B, $C ], $pkg;
}

sub DrvtinyStaticHash::STORE {
    my ($self,$key,$value) = @_;
    if ($key eq 'A')      { $self->[0] = $value }
    elsif ($key eq 'B') { $self->[1] = $value }
    elsif ($key eq 'C') { $self->[2] = $value; }
    else { die "DrvtinyStaticHash: Bad key: $key" }
}

sub DrvtinyStaticHash::FETCH {
    my ($self,$key) = @_;
    return $self->[0] if $key eq 'A';
    return $self->[1] if $key eq 'B';
    return $self->[2] if $key eq 'C';
    die "DrvtinyStaticHash: Bad key: $key";
}

tie my %saa, 'DrvtinyStaticHash', 'aaa', 'bbb', 'ccc';

print $saa{"B"};     # "bbb"
$saa{"C"} = 42;
print $saa{"D"};     # error: Bad key

You could generalize this to any package name and any set of keys with an unhealthy dose of eval calls.

Upvotes: 2

Related Questions