syker

Reputation: 11272

Sorting an Array Reference to Hashes

After executing these lines in Perl:

my $data = `curl '$url'`;
my $pets = XMLin($data)->{pets};

I have an array reference that contains references to hashes:

$VAR1 = [
      {
        'title' => 'cat',
        'count' => '210'
      },
      {
        'title' => 'dog',
        'count' => '210'
      }
]

In Perl, how do I sort the hashes first by count and secondarily by title, and then print the count followed by the title to STDOUT, one record per line?

Upvotes: 9

Views: 2096

Answers (1)

Greg Bacon

Reputation: 139501

Assuming you want counts in descending order and titles ascending:

print map join(" ", @$_{qw/ count title /}) . "\n",
      sort { $b->{count} <=> $a->{count}
                         ||
             $a->{title} cmp $b->{title} }
      @$pets;

That's compact code written in a functional style. To help understand it, let's look at equivalent code in a more familiar, imperative style.

Perl's sort operator takes an optional SUBNAME parameter that allows you to factor out your comparison and give it a name that describes what it does. When I do this, I like to begin the sub's name with by_ to make sort by_... read more naturally.

To start, you might have written

sub by_count_then_title {
  $b->{count} <=> $a->{count}
              ||
  $a->{title} cmp $b->{title}
}

my @sorted = sort by_count_then_title @$pets;

Note that no comma follows the SUBNAME in this form!
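
To watch the named comparison run end to end without the XML feed, here is a minimal sketch. The 'ant' and 'bird' records and their counts are made up for illustration; only 'cat' and 'dog' come from the question.

```perl
use strict;
use warnings;

# Hypothetical records: 'cat' and 'dog' come from the question;
# 'ant' and 'bird' and their counts are invented to exercise both keys.
my $pets = [
  { title => 'dog',  count => '210'  },
  { title => 'bird', count => '35'   },
  { title => 'cat',  count => '210'  },
  { title => 'ant',  count => '1000' },
];

sub by_count_then_title {
  $b->{count} <=> $a->{count}    # numeric comparison, descending
              ||
  $a->{title} cmp $b->{title}    # string comparison, ascending
}

# SUBNAME goes directly before the list, with no comma in between.
my @sorted = sort by_count_then_title @$pets;

my @titles = map $_->{title}, @sorted;
print "@titles\n";    # ant cat dog bird
```

Because <=> compares numerically, 1000 sorts ahead of 210 even though the counts are stored as strings.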

To address another commenter's question, you could use or rather than || in by_count_then_title if you find it more readable. Both <=> and cmp have higher precedence (which you might think of as binding more tightly) than || and or, so it's strictly a matter of style.
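
As a quick check of that claim, the sketch below defines the comparison both ways and sorts the same list with each. The name by_count_then_title_or and the sample records are made up for this example.

```perl
use strict;
use warnings;

sub by_count_then_title {
  $b->{count} <=> $a->{count}
              ||
  $a->{title} cmp $b->{title}
}

# Same comparison with low-precedence 'or'. The name
# by_count_then_title_or is hypothetical, chosen for this sketch.
sub by_count_then_title_or {
  $b->{count} <=> $a->{count}
              or
  $a->{title} cmp $b->{title}
}

# Hypothetical records to exercise both keys.
my @records = (
  { title => 'dog', count => 210 },
  { title => 'cat', count => 210 },
  { title => 'emu', count => 300 },
);

my @with_pipes = map $_->{title}, sort by_count_then_title    @records;
my @with_or    = map $_->{title}, sort by_count_then_title_or @records;

print "@with_pipes\n";    # emu cat dog
print "@with_or\n";       # emu cat dog
```

One caveat: this equivalence relies on the sub returning its last expression implicitly. With an explicit return, `return A or B` parses as `(return A) or B`, so use || or add parentheses in that case.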

To print the sorted array, a more familiar choice might be

foreach my $p (@sorted) {
  print "$p->{count} $p->{title}\n";
}

Perl uses $_ if you don't specify the variable that gets each value, so the following has the same meaning:

for (@sorted) {
  print "$_->{count} $_->{title}\n";
}

The for and foreach keywords are synonyms, but I find that the uses above, i.e., foreach if I'm going to name a variable or for otherwise, read most naturally.

Using map, a close cousin of foreach, instead isn't much different:

map print("$_->{count} $_->{title}\n"), @sorted;

You could also promote print through the map:

print map "$_->{count} $_->{title}\n",
      @sorted;

Finally, to avoid repetition of $_->{...}, the hash slice @$_{"count", "title"} gives us the values associated with count and title in the loop's current record. Having the values, we need to join them with a single space and append a newline to the result, so

print map join(" ", @$_{qw/ count title /}) . "\n",
      @sorted;

Remember that qw// is shorthand for writing a list of strings. As this example shows, read a map expression back-to-front (or bottom-to-top the way I indented it): first sort the records, then format them, then print them.
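
To make the back-to-front reading concrete, here is a sketch that splits the chain into its three stages, one statement each. The sample data is hypothetical ('emu' and its count are invented).

```perl
use strict;
use warnings;

# Hypothetical sample data ('emu' and its count are invented).
my @pets = (
  { title => 'dog', count => '210' },
  { title => 'emu', count => '300' },
  { title => 'cat', count => '210' },
);

# Stage 1 (bottom of the chain): sort the records.
my @sorted = sort { $b->{count} <=> $a->{count}
                               ||
                    $a->{title} cmp $b->{title} } @pets;

# Stage 2 (middle): format each record. The hash slice @$_{...} pulls the
# values for 'count' and 'title', in that order, out of the hashref in $_.
my @lines = map join(" ", @$_{qw/ count title /}) . "\n", @sorted;

# Stage 3 (top): print the formatted lines.
print @lines;
```

Chaining the three statements together, last stage first, gives back the single print map ... sort ... expression.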

You could eliminate the temporary @sorted while still calling the named comparison:

print map join(" ", @$_{qw/ count title /}) . "\n",
      sort by_count_then_title
      @$pets;

If the application of join is just too verbose for your taste, then interpolate the hash slice directly into the string:

print map "@$_{qw/ count title /}\n",
      sort by_count_then_title
      @$pets;
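
The interpolated version works because Perl joins an array or slice inside a double-quoted string with the list separator variable $", which defaults to a single space. A minimal sketch, with a made-up record and a tab chosen as an arbitrary alternative separator:

```perl
use strict;
use warnings;

my $record = { title => 'cat', count => '210' };

# Default: $" is a single space, so the slice values are space-separated.
my $spaced = "@$record{qw/ count title /}";

# Temporarily change the separator; local restores it at the end of the block.
my $tabbed;
{
  local $" = "\t";
  $tabbed = "@$record{qw/ count title /}";
}

print "$spaced\n";    # 210 cat
print "$tabbed\n";    # 210<TAB>cat
```

If some other code has changed $" globally, the join version is the safer choice, since it names its separator explicitly.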

Upvotes: 9
