Sebastian Hoffmann

Reputation: 105

Problems with sorting a hash of hashes by value in Perl

I'm rather inexperienced with hashes of hashes - so I hope someone can help a newbie... I have the following multi-level hash:

$OCRsimilar{$ifocus}{$theWord}{"form"} = $theWord;
$OCRsimilar{$ifocus}{$theWord}{"score"} = $OCRscore;
$OCRsimilar{$ifocus}{$theWord}{"distance"} = $distance;
$OCRsimilar{$ifocus}{$theWord}{"similarity"} = $similarity;
$OCRsimilar{$ifocus}{$theWord}{"length"} = $ilength;
$OCRsimilar{$ifocus}{$theWord}{"frequency"} = $OCRHashDict{$ikey}{$theWord};

Later, I need to sort each second-level element ($theWord) according to the score value. I've tried various things, but have failed so far. The problem seems to be that the sorting introduces new empty elements in the hash that mess things up. What I have done (for example - I'm sure this is far from ideal):

my @flat = ();
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
    push @flat, [$key1, $OCRsimilar{$ifocus}{$key1}{'score'}];
}

for my $entry (sort { $b->[1] <=> $a->[1] } @flat) {
    print STDERR "@$entry[0]\t@$entry[1]\n";
}

If I check things with Data::Dumper, the hash contains for example this:

  'uroadcast' => {
                 'HASH(0x7f9739202b08)' => {},
                 'broadcast' => {
                                'frequency' => '44',
                                'length' => 9,
                                'score' => '26.4893274374278',
                                'form' => 'broadcast',
                                'distance' => 1,
                                'similarity' => 1
                              }
               }

If I don't do the sorting, the hash is fine. What's going on? Thanks in advance for any kind of pointers...!

Upvotes: 0

Views: 113

Answers (2)

Moh

Reputation: 304

What seems suspicious to me is this construct:

foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {

Try dereferencing the hash, so it becomes:

foreach my $key1 (keys %{ $OCRsimilar{$ifocus} }) {

Otherwise, you seem to be creating an anonymous hash and taking the keys of it, equivalent to this code:

foreach my $key1 (keys { $OCRsimilar{$ifocus} => undef }) {

Thus, I think $key1 would be the stringified form of $OCRsimilar{$ifocus} (something like HASH(0x7f9739202b08)) inside the loop. Then, I think Perl auto-vivifies when it encounters $OCRsimilar{$ifocus}{$key1}, adding that stringified reference as a new key of the very hash it points to.
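To make the difference concrete, here is a minimal sketch that contrasts taking the keys of a freshly built anonymous hash with taking the keys of the dereferenced inner hash. The data and the names $inner, $anon, @wrong and @right are made up for illustration and are not from the question. The anonymous hash is built in a separate assignment because, as far as I know, newer Perls refuse to compile keys applied directly to a hash reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the question's %OCRsimilar and $ifocus.
my %OCRsimilar = ( focus => { broadcast => { score => 26.49 } } );
my $inner = $OCRsimilar{focus};

# Braces around the reference build a NEW anonymous hash. The reference
# is stringified to something like "HASH(0x...)" and used as its only key.
my $anon  = { $inner => undef };
my @wrong = keys %$anon;

# %{ ... } dereferences the existing inner hash and yields its real keys.
my @right = keys %{ $inner };

print "wrong: @wrong\n";
print "right: @right\n";
```

The first list holds a single string of the form HASH(0x...), the second holds the actual word keys.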

If you enable use warnings;, the program ought to complain: Odd number of elements in anonymous hash.

Still, I don't understand why Perl doesn't do further auto-vivification and add 'score' as the key, showing something like 'HASH(0x7f9739202b08)' => { 'score' => undef } in the Data::Dumper output.
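On the missing 'score' key: as far as I understand it, merely reading a nested element only auto-vivifies the intermediate levels needed to reach it, not the final element itself. That would explain the empty {} in the dump. A small sketch, with a hypothetical hash %h and a made-up key 'bogus' standing in for the stringified reference:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data; 'bogus' plays the role of the stray HASH(0x...) key.
my %h = ( focus => { broadcast => { score => 26.49 } } );

# Reading through a missing key creates the intermediate hash
# $h{focus}{bogus}, but does not create a 'score' entry inside it.
my $score = $h{focus}{bogus}{score};    # $score is undef

my $bogus_exists = exists $h{focus}{bogus};
my $score_exists = exists $h{focus}{bogus}{score};
my $bogus_size   = keys %{ $h{focus}{bogus} };

print "bogus exists: ", ( $bogus_exists ? "yes" : "no" ), "\n";
print "score exists: ", ( $score_exists ? "yes" : "no" ), "\n";
print "keys under bogus: $bogus_size\n";
```

Here the 'bogus' entry exists after the read, while it contains no keys at all.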

Upvotes: 0

choroba

Reputation: 241838

Just tell sort what to sort on. No other tricks are needed.

#!/usr/bin/perl
use warnings;
use strict;

my %OCRsimilar = (
                  focus => {
                            word => {
                                     form       => 'word',
                                     score      => .2,
                                     distance   => 1,
                                     similarity => 1,
                                     length     => 4,
                                     frequency  => 22,
                                    },
                            another => {
                                        form       => 'another',
                                        score      => .01,
                                        distance   => 1,
                                        similarity => 1,
                                        length     => 7,
                                        frequency  => 3,
                                       },
                           });

for my $word (sort { $OCRsimilar{focus}{$a}{score} <=> $OCRsimilar{focus}{$b}{score} }
                   keys %{ $OCRsimilar{focus} }
             ) {
    print "$word: $OCRsimilar{focus}{$word}{score}\n";
}

Pointers: perlreftut, perlref, sort.
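The question's own attempt sorted from highest to lowest score. Assuming that order is what is wanted, a variant of the code above could swap $a and $b in the comparison. This is only a sketch: the trimmed-down data and the names $words and @by_score are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Trimmed-down, made-up data in the same shape as the example above.
my %OCRsimilar = (
    focus => {
        word    => { form => 'word',    score => 0.2 },
        another => { form => 'another', score => 0.01 },
    },
);

# Keep a reference to the inner hash so the comparison stays short.
my $words = $OCRsimilar{focus};

# Swapping $a and $b reverses the order: highest score first.
my @by_score = sort { $words->{$b}{score} <=> $words->{$a}{score} }
               keys %$words;

print "$_\t$words->{$_}{score}\n" for @by_score;
```

Because the keys come from %$words rather than from a newly built anonymous hash, no stray entries are added to the structure.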

Upvotes: 1
