chrys

Reputation: 85

Sorting mixed array in perl

I have a few keys of a hash which look like the following:

Test21
Test1
Test4
Test2
Test13
TestA
TestB

I tried several approaches to sort them, using either the built-in sort function or extra subroutines, but I cannot seem to get it right.

My desired output would be:

Test1
Test2
Test4
Test13
Test21
TestA
TestB

One of my approaches looked like this:

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper qw(Dumper);


my %hash = (Test1 => "Hello", Test21 => "Somedata", Test4 => "SomeMoreData", Test2 => "EvenMore", Test13 => "AlotMore", TestA => "Nope", TestB => "EvenMoreNope");

foreach my $keys(sort byNumberandAlpha keys %hash){
    print "$keys\n";
}


sub byNumberandAlpha{

    my @temp_a = split("Test",$hash{$a});
    my $element_a = $temp_a[1];

    my @temp_b = split("Test",$hash{$b});
    my $element_b = $temp_b[1];

    if ( $element_a  =~ /[0-9]/ && $element_b =~ /[0-9]/ ) {

        $a <=> $b;

    }else{

        $a cmp $b;
    }
}

OUTPUT:

Use of uninitialized value $element_a in pattern match (m//) at ExpirimentalSorting.pl line 23.
Test1
Test13
Test2
Test21
Test4
TestA
TestB

Any help on getting this figured out is very much appreciated.

Upvotes: 0

Views: 255

Answers (1)

Sobrique

Reputation: 53478

The thing with sort is you can sort by anything you like, you just need to ensure the comparison returns the right value: negative if $a should come first, zero if the two are equal, positive if $b should come first.
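As a minimal sketch of that idea (the sample list here is made up for illustration, it is not from the question), the same values come out in a different order depending on whether the comparison block uses <=> or cmp:

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so numeric and string order differ.
my @values = (10, 9, 2, 33);

# <=> compares numerically and returns -1, 0 or 1.
my @numeric = sort { $a <=> $b } @values;

# cmp compares as strings, character by character.
my @stringy = sort { $a cmp $b } @values;

print "@numeric\n";   # 2 9 10 33
print "@stringy\n";   # 10 2 33 9
```

The string sort puts 10 before 2 because "1" sorts before "2", which is the same effect as Test13 landing before Test2 in the question's output.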

So in your case - it appears you're sorting on 'the bit which isn't test' and comparing numerically first, and alphabetically second.

What you're doing, though, is looking up the hash values (not the keys) with:

my @temp_a = split("Test",$hash{$a});

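And that ... doesn't actually work, because $hash{$a} is the value stored under the key (e.g. "Hello"), and none of your values include the word 'Test'. The split therefore returns a single element, $temp_a[1] is undefined (hence the warning), the digit test fails, and every comparison falls through to $a cmp $b, which is a plain string sort.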

So I think you're misunderstanding something profound.

I think you want:

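sub my_sort {
   # Capture whatever follows 'Test' in each key (the keys, not the hash values).
   my ($a1) = $a =~ m/Test(\w+)/;
   my ($b1) = $b =~ m/Test(\w+)/;

   # Compare numerically only when both suffixes consist purely of digits.
   if ( $a1 =~ /^\d+$/ and $b1 =~ /^\d+$/ ) {
      return $a1 <=> $b1;
   }
   else {
      # Otherwise compare as strings; digits sort before letters in ASCII.
      return $a1 cmp $b1;
   }
}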

However, you may find it simpler still to use Sort::Naturally

use Sort::Naturally;   # exports nsort

foreach my $keys ( nsort keys %hash ) {
   print "$keys\n";
}

(Although that does sort TestA above Test1).

You could do some magic using dualvar but that's a bit of a can of worms. For the sake of curiosity though:

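use Scalar::Util qw ( dualvar );
sub my_sort {
   # Give each key a numeric value (its digits, or 999999 if it has none)
   # while keeping its string value. The /r modifier needs Perl 5.14 or later.
   $_ = dualvar ( s/\D+//r || 999999, $_ ) for $a, $b;
   # Numeric comparison first; string comparison breaks ties.
   return ( $a <=> $b
         || $a cmp $b );
}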

This sorts the way you asked (provided the numbers don't exceed 999999) by giving your 'text only' strings a large numeric value while leaving their string value untouched.

So Test1 becomes a dualvar containing ( 1, "Test1" ), which sorts the way you'd expect. TestA becomes dualvar ( 999999, "TestA" ), which sorts behind anything in a 'normal' numeric range. When two keys have the same numeric value, as TestA and TestB do, the numeric comparison returns 0 and 'falls through' to the string comparison.

If you do the same with 0 (e.g. $_ = dualvar ( s/\D+//r || 0, $_ ) for $a, $b;) then TestA and TestB again sort to the top.
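To make the dualvar behaviour concrete, here is a small self-contained sketch using the keys from the question. It is a variation rather than the exact code above: it builds the dualvars once in a map instead of reassigning $a and $b inside the sort sub, and it uses a defined check (my assumption, not part of the original) so that a key such as Test0 would keep its numeric value of 0.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# A dualvar carries a separate numeric and string value.
my $x = dualvar( 999999, "TestA" );
my $as_number = $x + 0;    # numeric context uses 999999
my $as_string = "$x";      # string context uses "TestA"

my @keys = qw(Test21 Test1 Test4 Test2 Test13 TestA TestB);

# Attach the digits (or 999999 when there are none) as the numeric value,
# then sort numerically with a string comparison as the tie-breaker.
my @sorted =
   sort { $a <=> $b || $a cmp $b }
   map  { my ($n) = /(\d+)/; dualvar( defined $n ? $n : 999999, $_ ) } @keys;

print "$as_number $as_string\n";
print "$_\n" for @sorted;
```

Printing the sorted elements uses their string values, so the output lists Test1, Test2, Test4, Test13, Test21, TestA, TestB.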

Upvotes: 3
