Does merging 2 or more Perl hash references consume more or less twice the memory?

Question

Given the following code, does the hash referenced by $z consume the same memory as that used by ( %$x, %$y), more or less?

If so, is there a way to use a single reference to call data from the hashes referenced by either $x or $y like $z->{$somekeytoXorY} without affecting performance and memory?

use strict;
use warnings;

my $x = {
    1 => 'a',
    2 => 'b',
};

my $y = {
    3 => 'c',
    4 => 'd',
};

my $z = {
    %$x, %$y
};

Update

The hash references actually point to large hashes created using tie and DB_File.

I was wondering whether I could use a single hash to access all of these, so that I don't need to load everything into memory. I may also be using more than two of them at once.

ikegami · Accepted Answer

Tied hashes aren't hashes at all. They are interfaces to subroutines. Since they're code rather than data, talking about the memory and performance of tied hashes in general makes no sense.
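To illustrate the point that a tied hash is really a set of subroutine calls, here is a minimal sketch. The class name CountingHash and its fetch_count helper are made up for this example; Tie::StdHash is the core base class that ships in Tie::Hash and stores its data in an ordinary hash.

```perl
use strict;
use warnings;

package CountingHash;
require Tie::Hash;
our @ISA = ('Tie::StdHash');   # core class backed by a real hash

my $fetches = 0;

# Every read of an element of the tied hash ends up in this method.
sub FETCH {
    my ($self, $key) = @_;
    $fetches++;
    return $self->SUPER::FETCH($key);
}

# Hypothetical helper so the caller can see how often FETCH ran.
sub fetch_count { return $fetches }

package main;

tie my %h, 'CountingHash';
$h{a} = 1;              # dispatched to STORE (inherited from Tie::StdHash)
my $first  = $h{a};     # dispatched to FETCH
my $second = $h{a};     # dispatched to FETCH again
print "fetches: ", CountingHash::fetch_count(), "\n";
```

What a read costs in time and memory therefore depends entirely on what FETCH does, which for DB_File means a lookup in the database file rather than in an in-memory structure.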

Let's talk about ordinary hashes first.

$z = { %$x, %$y }; will copy the scalars of %$x and %$y into %$z, so yes, it will take twice the memory (assuming no duplicate keys).
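A small sketch of what that copying means in practice, using plain hashes only. The overlapping key 2 in $y is added here purely for illustration and is not part of the question's data.

```perl
use strict;
use warnings;

my $x = { 1 => 'a', 2 => 'b' };
my $y = { 2 => 'B', 3 => 'c' };   # key 2 deliberately overlaps with $x

# Flatten both hashes into one list and build a new anonymous hash from it.
my $z = { %$x, %$y };

# The values in %$z are copies: changing one leaves the source untouched.
$z->{1} = 'changed';
print "x{1} is still $x->{1}\n";

# On duplicate keys the right-most hash wins, so only one copy is kept.
print "z{2} is $z->{2}\n";
print "z has ", scalar(keys %$z), " keys\n";
```

With tied hashes as sources, the same expression would read every key and value through the tie interface and store them all in an ordinary in-memory hash.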

You could share the scalars:

use Data::Alias qw( alias );
my $z = {};
alias $z->{$_} = $x->{$_} for keys(%$x);
alias $z->{$_} = $y->{$_} for keys(%$y);

You'd still use memory proportional to the number of elements in both hashes, but it would be far less than before if %$x and %$y are ordinary (untied) hashes. This might not save any memory for tied hashes.

The alternative is not to actually merge the data at all. You could use a tied hash yourself...

package Tie::MergedHashes;
use Carp qw( croak );
sub new     { my $pkg = shift; $pkg->TIEHASH(@_); }
sub TIEHASH { my $class = shift; bless [ @_ ], $class }
sub STORE   { croak("Not allowed"); }
sub FETCH   { for (@{$_[0]}) { return $_->{$_[1]} if exists($_->{$_[1]}); } return undef; }
...

my $z = {};
tie %$z, 'Tie::MergedHashes', $y, $x;
$z->{$key}

...but there's no reason to make the code look like a hash. You could simply use an object.

package MergedHashes;
use Carp qw( croak );
sub new   { my $class = shift; bless [ @_ ], $class }
sub fetch { for (@{$_[0]}) { return $_->{$_[1]} if exists($_->{$_[1]}); } return undef; }
...

my $z = MergedHashes->new($y, $x);
$z->fetch($key)
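For reference, one way the elided parts of the object approach could be filled in. This is only a sketch: the has_key method is an assumption added for illustration, and plain hashes stand in for the DB_File-tied hashes from the question. Because the object only holds references and reads through them, tied hashes should work the same way, as long as their tie class implements EXISTS and FETCH.

```perl
use strict;
use warnings;

package MergedHashes;

# Keep references to the underlying hashes; no keys or values are copied.
sub new {
    my ($class, @hashes) = @_;
    return bless [ @hashes ], $class;
}

# Return the value from the first hash that contains the key.
sub fetch {
    my ($self, $key) = @_;
    for my $h (@$self) {
        return $h->{$key} if exists $h->{$key};
    }
    return undef;
}

# Hypothetical helper: true if any underlying hash contains the key.
sub has_key {
    my ($self, $key) = @_;
    for my $h (@$self) {
        return 1 if exists $h->{$key};
    }
    return 0;
}

package main;

my $x = { 1 => 'a', 2 => 'b' };
my $y = { 3 => 'c', 4 => 'd' };

# Hashes listed first take precedence when keys overlap.
my $z = MergedHashes->new($y, $x);

my $from_x = $z->fetch(1);
my $from_y = $z->fetch(3);
print "1 => $from_x, 3 => $from_y\n";
print "9 is ", ($z->has_key(9) ? "present" : "missing"), "\n";
```

The object costs one small array no matter how large the underlying hashes are, and any number of hashes can be passed to new.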
