vetinari
vetinari

Reputation: 1

Adding each element in an array to a complex hash in Perl

I have an array with n number of elments. I want to add each of the elements to a complex hash, each as a key/value pair. If the number of elements were fixed, say three, I would do:

my %hash;
my @array = ("first", "second", "third");
$hash{$array[0]}{$array[1]}{$array[2]}++;

The structure I want to end up with, is this (printed with Data::Dumper):

$VAR1 = 'first';
$VAR2 = {
        'second' => {
                   'third' => 1
};

But I am at loss at achieving the same structure when the number of elements in the array isn't fixed. Something with anonymous variables and iterating through the array, yes, but something like

foreach @array{$hash{$_}++};
will obviously only make one entry per element, and not the desired structure. Help?

Upvotes: 0

Views: 120

Answers (2)

Borodin
Borodin

Reputation: 126762

This is relatively simple if you maintain a current hash reference. This short program demonstrates

The first few steps make sure that each hash element exists and its value is a hash reference. $href is moved to the next level of hash at each stage. For the final element of the array, the latest hash level's element is incremented instead of being set to a hash reference.

Whether or not this data structure is the correct choice depends on what else you need to do with it once you have built it

use strict;
use warnings;

my %hash;
my @array = qw/ first second third fourth fifth /;

drill_hash(\%hash, @array);

use Data::Dump;
dd \%hash;

sub drill_hash {
  my ($href, @list) = @_;
  my $final = pop @list;
  $href = $href->{$_} //= {} for @list;
  ++$href->{$final};
}

output

{
  first => { second => { third => { fourth => { fifth => 1 } } } },
}

Update

Having understood your purpose, the simplest way to keep a count of occurrences of ngrams like that is to have a speficic hash key that is used to keep the count of the sequence of words so far.

This program uses the value _COUNT for that key, and you can see that, for example, {under}{a}{_COUNT} and {under}{a}{rock}{_COUNT} are both 1

use strict;
use warnings;

my %counts;

count_ngram(\%counts, qw/ under a /);
count_ngram(\%counts, qw/ a rock /);
count_ngram(\%counts, qw/ under a rock /);
count_ngram(\%counts, qw/ a tree /);
count_ngram(\%counts, qw/ under a tree /);

use Data::Dump;
dd \%counts;

sub count_ngram {
  my ($href, @ngram) = @_;
  my $final = pop @ngram;
  $href = $href->{$_} //= {} for @ngram;
  ++$href->{$final}{_COUNT};
}

output

{
  a => { rock => { _COUNT => 1 }, tree => { _COUNT => 1 } },
  under => {
    a => { _COUNT => 1, rock => { _COUNT => 1 }, tree => { _COUNT => 1 } },
  },
}

Upvotes: 3

Hunter McMillen
Hunter McMillen

Reputation: 61550

Something like this could build the structure you desire for N elements:

use strict;
use warnings; 

use Data::Dumper;
my @array = qw(first second third four five six seven);

my $hash;
foreach my $key ( reverse @array ) {
   $hash = { $key => $hash };
}
print Dumper $hash;

__END__
$VAR1 = {
      'first' => {
                   'second' => {
                                 'third' => {
                                              'fourth' => {
                                                          'fifth' => {
                                                                      'sixth' => {
                                                                                 'seventh' => undef
                                                                               }
                                                                    }
                                                        }
                                            }
                               }
                 }
    };

It is not clear what you really need this for. There may be a better solution if you explain your use-case a little more. Incrementing this structure doesn't appear to be very easy.

After playing around a little, you can increment by traversing the hash reference to the bottom then incrementing the value of the last element. It is not very pretty though :|

# incrementing
my $elem = $hash; # copy the reference 
foreach my $key ( @array ) {
   # found the bottom of the hash
   unless ( $elem->{$key} && ref($elem->{$key}) ) { 
      $elem->{$key}++;
      last; 
   }

   # not at the bottom, move to the next level
   $elem = $elem->{$key};
}
print Dumper $hash;

__END__
$VAR1 = {
      'first' => {
                   'second' => {
                                 'third' => {
                                              'fourth' => {
                                                            'fifth' => {
                                                                         'sixth' => {
                                                                                      'seventh' => 1
                                                                                    }
                                                                       }
                                                          }
                                            }
                               }
                 }
    };

Upvotes: 4

Related Questions