Rijumone

Reputation: 822

Confusing "sort" syntax in perl

I started learning Perl today and found this very interesting tutorial. However, for the life of me I could not wrap my head around this code snippet.

print " $_: $created{$_}" for(sort({$created{$b} <=> $created{$a}} keys %created));

What kind of sort is this? More specifically, what are the variables $a and $b? I tried going through the documentation, but can't say that it helped. Any support would be greatly appreciated.

Upvotes: 1

Views: 532

Answers (3)

elcaro

Reputation: 2297

There are a few things going on here. First, I'll address the postfix loop:

say $_ for ('foo', 'bar', 'baz');

This is semantically the same as:

for ('foo', 'bar', 'baz') { say $_ }

Now on to the sort. By default, Perl's sort compares its elements as strings, so numbers come out in alphabetical rather than numeric order:

sort (20, 3, 100);  # RESULT: (100, 20, 3)

Perl provides a way to explicitly compare alphabetically (cmp) or numerically (<=>).

20 cmp 3;  # RESULT: -1 (means: Less)
20 <=> 3;  # RESULT:  1 (means: More)
20 <=> 20; # RESULT:  0 (means: Same)

So you can use these operators inside a sort block, but what do you put on either side of the operators? This is where the special $a and $b variables come in: sort sets them to the two elements it is currently comparing.

sort { $a <=> $b } (20, 3, 100);  # RESULT: (3, 20, 100)

This is also why you should never manually assign values to $a or $b, or declare them yourself with my: sort sets these variables for each comparison, and if you interfere with them, sort will not function correctly!
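To see the two comparison operators side by side, here is a minimal sketch using the same numbers as above. The variable names are only illustrative.

```perl
use strict;
use warnings;

# cmp compares as strings, character by character: "2" sorts before "3".
my $as_strings = 20 cmp 3;
# <=> compares as numbers: 20 is greater than 3.
my $as_numbers = 20 <=> 3;

# The same list, ordered by each operator in turn.
my @by_string = sort { $a cmp $b } ( 20, 3, 100 );
my @by_number = sort { $a <=> $b } ( 20, 3, 100 );

print "cmp: $as_strings, <=>: $as_numbers\n";    # cmp: -1, <=>: 1
print "by string: @by_string\n";                 # by string: 100 20 3
print "by number: @by_number\n";                 # by number: 3 20 100
```

Note that $a and $b are used without any declaration, even under use strict; sort supplies them.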

Putting it all together, your sort orders a hash's keys numerically by their corresponding values:

use feature 'say';    # say is not enabled by default

my %hash = ( twenty => 20, three => 3, onehundred => 100 );

for ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) {
    say "$_: $hash{$_}";    # $hash{...} looks up a single value
}

Outputs

three: 3
twenty: 20
onehundred: 100

You can also do { $b <=> $a } to sort in reverse.
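Applied to a hash, reversing the operands gives the "highest value first" order from the question. The sketch below uses made-up data and adds an optional tie-breaker, which is my own addition and not part of the original snippet: keys with equal values fall back to alphabetical order.

```perl
use strict;
use warnings;

# Hypothetical data: names mapped to counts.
my %created = ( alice => 3, bob => 10, carol => 3 );

# Highest value first. When <=> returns 0 (equal values),
# "or" falls through to the string comparison of the keys.
my @ordered = sort { $created{$b} <=> $created{$a} or $a cmp $b } keys %created;

print "$_: $created{$_}\n" for @ordered;
# bob: 10
# alice: 3
# carol: 3
```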

If this all looks arcane and confusing, it kinda is. I'd recommend using a module like Sort::Key for most of your sorting needs... or maybe Sort::Naturally if that's more what you're after.

Upvotes: 9

simbabque

Reputation: 54323

print " $_: $created{$_}" 
    for ( sort { $created{$b} <=> $created{$a} } keys %created );

This uses the postfix form of for. In Perl, for and foreach are the same. Both can be used with the C-style for syntax as well as the shorter foreach-style syntax.

We can rewrite this loop to the following more verbose version.

foreach ( sort { $created{$b} <=> $created{$a} } keys %created ) {
    print " $_: $created{$_}";
}

Next, let's look at $_. It's called the topic and is used as the default input for a lot of things in Perl if no other variable is given. In a foreach loop, if no iteration variable is used, every element of the list in the parentheses ends up in $_. Again, this could be rewritten for clarity.

foreach my $elem ( sort { $created{$b} <=> $created{$a} } keys %created ) {
    print " $elem: $created{$elem}";
}

Now let's look at the sort. The two variables $a and $b are special package (global) variables that are always there in Perl, so you never have to declare them, even under use strict. They are reserved for the block that follows the sort statement.

sort { BLOCK } LIST

$a and $b contain the two values from the list that are being compared in the block. The block behaves like an anonymous function (called a lambda in other languages), and there is no comma after it. Alternatively, you can put the name of a named sub there as a bareword without quotes. Again, there is no comma.

sub bubbly_sort { $a <=> $b }
sort bubbly_sort 3, 2, 1;
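As a runnable sketch of the named-sub form, the example below also shows a code reference held in a scalar, which sort accepts in the same position. The names by_number and $by_number_desc are hypothetical.

```perl
use strict;
use warnings;

# A named comparison sub: sort sets $a and $b before each call.
sub by_number { $a <=> $b }

# A code reference stored in a scalar works in the same position.
my $by_number_desc = sub { $b <=> $a };

my @input = ( 20, 3, 100 );

# No comma between the sub name (or scalar) and the list.
my @asc  = sort by_number @input;
my @desc = sort $by_number_desc @input;

print "ascending:  @asc\n";     # ascending:  3 20 100
print "descending: @desc\n";    # descending: 100 20 3
```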

The <=> operator does numerical comparison. It compares both values and returns -1, 0, or 1.

sort { $b <=> $a } LIST

This will sort the values in the LIST numerically descending, so the highest value comes first. $a <=> $b would be ascending.

sort { $created{$b} <=> $created{$a} } keys %created

The keys function gives you the keys of the hash %created in no particular order. The block in your sort compares the values behind those keys in the %created hash and sorts the keys in descending order of those values. So you get out a list of sorted keys.

The rest of the code just prints out a key/value list of that sorted list of keys.

It's notable that the output is all on one line, because there is no newline after the print.
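If you would rather have one entry per line, a small variation is to add an explicit "\n" to each string. This is a sketch with made-up data standing in for %created, and the @lines array is only there to make the result easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the %created hash.
my %created = ( alice => 3, bob => 10, carol => 7 );

my @lines;
for my $key ( sort { $created{$b} <=> $created{$a} } keys %created ) {
    push @lines, "$key: $created{$key}\n";    # explicit newline per entry
}

print @lines;
# bob: 10
# carol: 7
# alice: 3
```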


The condensed version you posted can be more readable than the long version I showed in this answer. But it stands or falls with proper indentation and use of parentheses.

print " $_: $created{$_}" 
    for sort { $created{$b} <=> $created{$a} } keys %created;

versus

print " $_: $created{$_}" for(sort({$created{$b} <=> $created{$a}} keys %created));

versus

foreach ( sort { $created{$b} <=> $created{$a} } keys %created ) {
    print " $_: $created{$_}";
}

In my opinion, the first version reads the most like an English sentence, which is good. Clean, readable code should always be the first priority of a good developer.

Upvotes: 6

Quentin

Reputation: 943207

Just break it down.

This is the sort bit:

sort({$created{$b} <=> $created{$a}} keys %created)

The parentheses aren't relevant here.

sort { $created{$b} <=> $created{$a} } keys %created

So this is a very standard sort.

sort is Perl's built-in sort function.

The first argument is a block. From the documentation for sort:

If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than 0, depending on how the elements of the list are to be ordered. (The <=> and cmp operators are extremely useful in such routines.) SUBNAME may be a scalar variable name (unsubscripted), in which case the value provides the name of (or a reference to) the actual subroutine to use. In place of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.

The second argument is the thing being sorted (keys %created), i.e. a list of the keys in the hash.

In the sub, $a and $b are the elements being sorted. So $created{$b} <=> $created{$a} takes the keys and compares the values associated with them in the hash.

So the sort outputs a list of keys in the hash, sorted by the values associated with them.

That then gets passed to a postfix for, so for each of the keys it does something (in the order of the values).

The something is print " $_: $created{$_}", so it prints the key, then a colon, then the value.
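The breakdown above can be written out one stage at a time. This is only a sketch: the data is made up, and the intermediate arrays @keys and @sorted are hypothetical names introduced to make each step visible.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the %created hash from the question.
my %created = ( alice => 3, bob => 10, carol => 7 );

# Step 1: keys returns the hash keys in no guaranteed order.
my @keys = keys %created;

# Step 2: sort compares the values behind each pair of keys, highest first.
my @sorted = sort { $created{$b} <=> $created{$a} } @keys;

# Step 3: the postfix for runs print once per key, with the key in $_.
print " $_: $created{$_}" for @sorted;
print "\n";    # prints " bob: 10 carol: 7 alice: 3" on a single line
```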

Upvotes: 4
