
Reputation: 47

Convert hash values with same key to hash of arrays in Perl

I need to convert a hash into a hash of arrays in Perl.

I have:

%hash = (
    tinku => 15,
    tina  => 4,
    rita  => 18,
    tinku => 18,
    tinku => 17,
    tinku => 16,
    rita  => 19
);

And I want to change it to:

%hash =  ( tinku => [ 15, 16, 17, 18 ], rita => [ 18, 19 ], tina => 4 );

Upvotes: 1

Answers (5)

G. Cito

Reputation: 6378

The techniques and patterns covered in the other responses here are tried and true idioms that are essential for getting the most out of Perl, for understanding existing code, and for working with the large installed base of older perl compilers. Just for fun I thought I'd mention a couple of other approaches:

There's a fairly readable new syntax in perl-5.22 that is an alternative to the more classic approach taken by @fugu. For something a bit more funky I'll mention @miyagawa's Hash::MultiValue. Perl 6 also has a nice way to convert lists of key/value pairs with potentially non-unique keys into hashes containing keys with multiple values.

As the other responses here point out, the "key" to all of this is:

For a hash key to refer to multiple values, the value needs to be not just a list or array but a reference to one, such as an anonymous array [ ].
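
A minimal sketch of that idea, using only core Perl. The hash name %scores and the sample data are just for illustration:

```perl
use strict;
use warnings;

# A hash value is always a single scalar, so "multiple values"
# means storing one array reference per key.
my %scores = (
    tinku => [ 15, 18 ],    # anonymous array
    tina  => [ 4 ],
);

# push autovivifies the array reference for a key not seen before
push @{ $scores{rita} }, 18;
push @{ $scores{rita} }, 19;

print "$_: @{ $scores{$_} }\n" for sort keys %scores;
```

Every approach in this thread is a variation on that push into a dereferenced hash value.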

Using new syntax available with `perl-5.22`

Fugu's response shows the standard Perl idiom. Iterating through @names using for 0 .. $#names ensures that overlapping keys are not "lost" and instead point at an anonymous array of multiple values. With perl-5.22 we can use the pairs() function from List::Util (a core module) and postfix dereferencing to add key/value pairs to a hash and account for overlapping or duplicate keys in a slightly different way:

use experimental qw(postderef);
use List::Util qw/pairs/;

my %hash;    
my $a_ref = [ qw/tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19/ ];
push $hash{$_->key}->@* , $_->value for pairs @$a_ref;

use DDP;
p %hash;

As of version 1.39 List::Util::pairs() returns ARRAY references as blessed objects accessible via ->key and ->value methods. The example uses LEONT's experimental.pm pragma and DDP to make things a bit more compact.

Output:

{
    rita    [
        [0] 18,
        [1] 19
    ],
    tina    [
        [0] 4
    ],
    tinku   [
        [0] 15,
        [1] 18,
        [2] 17,
        [3] 16
    ]
}

As to which is more "readable": it's hard to beat the easily "grokable" standard approach, but with the new syntax available in the latest versions of perl5 we can explore the potential of new idioms. I am really starting to like postfix dereferencing. TIMTOWTDI and beyond!
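
To make the point about pair objects a little more concrete, here is a small sketch (assuming List::Util 1.39 or later; the list contents are only sample data). It suggests that the ->key and ->value methods and the plain [0] and [1] indexes reach the same elements, which is why both styles show up in answers to this question:

```perl
use strict;
use warnings;
use List::Util 1.39 qw(pairs);

my @list = qw(tinku 15 tina 4 tinku 18);

for my $pair ( pairs @list ) {
    # each pair is a blessed two-element array reference, so the
    # method calls and the plain indexes read the same data
    printf "%s => %s (same as %s => %s)\n",
        $pair->key, $pair->value, $pair->[0], $pair->[1];
}
```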

@miyagawa's `Hash::MultiValue`

With this module you can create a Hash::MultiValue object (with lots of methods to access it in various ways) and a plain hash reference to conveniently work with multiple values per key.

#!/usr/bin/env perl -l
use Hash::MultiValue;
use strict;
use warnings;

my $mvhash = Hash::MultiValue->new(tinku =>15, tina =>4, rita =>18,
                tinku =>18, tinku =>17, tinku =>16, rita =>19);

print "\ntinku's values:\n", join " ", $mvhash->get_all('tinku');

print "\nflattened mvhash:\n", join " ", $mvhash->flatten ;

print "\n ... using mvhash as a hashref:" ;
print join " ", $mvhash->get_all($_) for keys %$mvhash ;

print "\n", '... as a "mixed" hashref with each():';
my $mvhash_ref = $mvhash->mixed ;

while ( my ($k, $v) = each %$mvhash_ref ) {   # each() on a bare hashref was removed in perl 5.24
  print "$k => " , ref $v eq "ARRAY" ? "@{$v}" : "$v" ; 
}

Output:

tinku's values:
15 18 17 16

flattened mvhash:
tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19

... using mvhash as a hashref:
15 18 17 16
18 19
4

... as a "mixed" hashref with each():
tinku => 15 18 17 16
rita => 18 19
tina => 4

Once your hash is available as a Hash::MultiValue object you can manipulate it in various ways to quickly create temporary copies and hash references. Just assign them to a scalar and dump them (or use DDP) to get an idea of how it works:

use DDP; 
my $hmulti = $mvhash->multi; p $hmulti ;
my $hmixed = $mvhash->mixed; p $hmixed ;

There are some restrictions on using regular hash operations with a Hash::MultiValue object (and things like dd \$mvhash are not going to show you the whole hash; you need to do dd $mvhash->multi). However, in some situations there is an advantage to working with multi-value hashes in this way (i.e. more readable and/or possibly less code needed for some functions).

You still need to recognize when/where Hash::MultiValue is useful so it's not unambiguously "easier" or "cleaner" - but it's another useful addition to your box of perl tools.

Perl 6 - just for comparison

Perl6 can be a bit more compact for grabbing key/value pairs from a list because you can use "multiple parameters" in a for statement, traversing a list by groups of elements then using push to arrange them into a hash. You can do this in a way that "automagically" accounts for overlapping keys. cf. this short perl6 snippet:

my %h ;
for <tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19> -> $k, $v { 
    %h.push($k => $v) ;
}
%h.perl.say ;

Edit: The friendly folks on #perl6 suggest an even more succinct "method":

my %h.push: <tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19>.pairup ;
%h.perl.say ;

Output:

{:rita(["18", "19"]), :tina("4"), :tinku(["15", "18", "17", "16"])}<>


It's not just continued development of perl the compiler that makes it possible to write Perl code in new and interesting ways. Thanks to @miyagawa, and to Paul Evans for his stewardship of Scalar-List-Utils, you can do cool things with Hash::MultiValue even if your version of perl is as old as 5.8, and you can try the functions available in the latest versions of List::Util even if your perl is barely from this millennium (List::Util works with perl-5.6, which ushered in the 21st century in March 2000).

Upvotes: 5

mpapec

Reputation: 50637

Since a hash can only have unique keys, don't assign the list to a hash; instead, process it with pairs() from List::Util:

use List::Util 'pairs';

my %hash;
push @{ $hash{$_->[0]} }, $_->[1]
 for pairs (tinku =>15,tina =>4, rita =>18, tinku =>18, 
           tinku =>17, tinku =>16, rita =>19);

use Data::Dumper; print Dumper \%hash;

Output:

$VAR1 = {
      'tinku' => [
                   15,
                   18,
                   17,
                   16
                 ],
      'rita' => [
                  18,
                  19
                ],
      'tina' => [
                  4
                ]
    };

Upvotes: 2

fugu

Reputation: 6578

You're asking for the impossible! Hashes can only have unique keys, so in your example you will produce a hash which takes each unique name as its key, and the last value for each key as its value:

#!/usr/bin/perl
use warnings;
use strict; 
use Data::Dumper;

my %hash = (tinku =>15,tina =>4, rita =>18, 
           tinku =>18, tinku =>17, tinku =>16, rita =>19);

print Dumper \%hash;

$VAR1 = {
          'rita' => 19,
          'tina' => 4,
          'tinku' => 16
        };

To make a hash of arrays you could try something like this:

my %hash;

my @names = qw(tinku tina rita tinku tinku tinku rita);
my @nums = qw(15 4 18 18 17 16 19);


push @{ $hash{ $names[$_] } }, $nums[$_] for 0 .. $#names;


print Dumper \%hash;

$VAR1 = {
          'rita' => [
                      '18',
                      '19'
                    ],
          'tina' => [
                      '4'
                    ],
          'tinku' => [
                       '15',
                       '18',
                       '17',
                       '16'
                     ]
        };
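
The output asked for in the question also sorts each list and keeps tina as a plain scalar rather than a one-element array. A possible follow-up step, sketched here under the assumption that the hash of arrays has already been built as above (the data is repeated so the snippet runs on its own):

```perl
use strict;
use warnings;

# Start from the hash of arrays built above
my %hash = (
    tinku => [ 15, 18, 17, 16 ],
    rita  => [ 18, 19 ],
    tina  => [ 4 ],
);

for my $name ( keys %hash ) {
    my @sorted = sort { $a <=> $b } @{ $hash{$name} };
    # keep an array reference for several values, a plain scalar for one
    $hash{$name} = @sorted > 1 ? \@sorted : $sorted[0];
}

for my $name ( sort keys %hash ) {
    my $v = $hash{$name};
    print "$name => ", ( ref($v) ? "[@$v]" : $v ), "\n";
}
```

Mixing scalars and array references means every later read has to check ref() first, so leaving all values as arrays is often the simpler design.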

Upvotes: 3

Dallaylaen

Reputation: 5308

my %hash = (tinku =>15,tina =>4, rita =>18, 
    tinku =>18, tinku =>17, tinku =>16, rita =>19);

This assignment keeps only the last value for each key (i.e. tinku => 16, rita => 19, tina => 4) and discards the earlier ones. This is deliberate: it allows overriding values in hash assignments. E.g.

sub some_function {
     my %args = (%sane_defaults, @_);
};

Also, (foo => (1, 2, 3)) would create the hash (foo => 1, 2 => 3), which is not what you would expect.
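
A short sketch of that flattening, next to the square-bracket form that keeps the values together (the names %flat and %nested are only for illustration):

```perl
use strict;
use warnings;

# Parentheses do not nest in list context: the inner list is flattened
my %flat = ( foo => ( 1, 2, 3 ) );
# %flat is now ( foo => 1, 2 => 3 )

# Square brackets build an array reference, which is a single scalar value
my %nested = ( foo => [ 1, 2, 3 ] );

print join( ", ", map { "$_ => $flat{$_}" } sort keys %flat ), "\n";
print "foo => [@{ $nested{foo} }]\n";
```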

A possible solution could be:

use strict;
use warnings;
use Data::Dumper;

my @array = (tinku =>15,tina =>4, rita =>18, tinku =>18, 
     tinku =>17, tinku =>16, rita =>19);
my %hash = hash_of_arrays( @array );
print Dumper(\%hash);

sub hash_of_arrays {
     die "Odd number of elements in hash (of arrays) assignment"
          if @_ % 2;
     # I never understood why this is a *warning* :-)

     # populate hash by hand
     my %hash; 
     while (@_) {
          my $key = shift;
          my $value = shift;
          push @{ $hash{$key} }, $value;
          # here hash values automatically become 
          # empty arrayrefs if not defined, thanks Larry
     };
     return %hash; 
     # *technically*, this returns a flat *list*,
     # which the caller's assignment turns back into a hash
};

Upvotes: 5

Chankey Pathak

Reputation: 21666

You can't have that hash in the first place. A hash in Perl must have unique keys.

Upvotes: 2
