Reputation: 893
My aim is to write a subroutine which takes in
Just to be clear - the keys may be anything. My example is just for reference.
Given an array containing a list of keys in their required sort order
my @aSortOrder = ( 'DELTA1_2', 'SET1', 'SET2' );
My idea is to form a string
$a->{DELTA1_2} <=> $b->{DELTA1_2} or $a->{SET1} <=> $b->{SET1} or $a->{SET2} <=> $b->{SET2}
and then execute it with eval
.
Here's my code
my $paRecords = [
{ 'SET1' => 48265, 'DELTA1_2' => -1, 'SET2' => 48264 },
{ 'SET1' => 8328, 'DELTA1_2' => -29, 'SET2' => 8299 },
{ 'SET1' => 20, 'DELTA1_2' => 0, 'SET2' => 0 },
{ 'SET1' => 10, 'DELTA1_2' => 0, 'SET2' => 0 }
];
my @aSortOrder = ( 'DELTA1_2', 'SET1', 'SET2' );
my $pStr = '';
foreach ( @aSortOrder ) {
$pStr = $pStr . ' or $a->{' . $_ . '} <=> $b->{' . $_ . '}';
}
$pStr =~ s/^\s*or\s*//;
my @aSorted = sort { eval "$pStr"; } @$paRecords;
print Dumper \@aSorted;
$VAR1 = [
{
'SET1' => 8328,
'SET2' => 8299,
'DELTA1_2' => -29
},
{
'SET1' => 48265,
'SET2' => 48264,
'DELTA1_2' => -1
},
{
'SET2' => 0,
'DELTA1_2' => 0,
'SET1' => 10
},
{
'SET2' => 0,
'DELTA1_2' => 0,
'SET1' => 20
}
];
I guess that this is far from the ideal approach to solving the problem, so any pointer on how this problem could be better solved would be a great help.
Upvotes: 0
Views: 83
Reputation: 126722
The block passed to sort
may contain any amount of code. It is required only to evaluate to a negative number, zero, or a positive number according to whether $a
should be considered to be less than, equal to, or great than $b
I agree with your decision to bundle this into a subroutine, so I have written sort_hashes_by_keys
, which expects a reference to an array of hashes to be sorted, and a reference to an array of key strings. It returns a list of hashes sorted according to the list of keys
use strict;
use warnings 'all';
use Data::Dump 'dd';
my $records = [
{ SET1 => 48265, DELTA1_2 => -1, SET2 => 48264 },
{ SET1 => 8328, DELTA1_2 => -29, SET2 => 8299 },
{ SET1 => 20, DELTA1_2 => 0, SET2 => 0 },
{ SET1 => 10, DELTA1_2 => 0, SET2 => 0 }
];
my @sort_order = qw/ DELTA1_2 SET1 SET2 /;
my @sorted = sort_hashes_by_keys( $records, \@sort_order );
dd \@sorted;
sub sort_hashes_by_keys {
my ( $hashes, $order ) = @_;
sort {
my $cmp = 0;
for my $key ( @$order ) {
last if $cmp = $a->{$key} <=> $b->{$key};
}
$cmp;
} @$hashes;
}
[
{ DELTA1_2 => -29, SET1 => 8328, SET2 => 8299 },
{ DELTA1_2 => -1, SET1 => 48265, SET2 => 48264 },
{ DELTA1_2 => 0, SET1 => 10, SET2 => 0 },
{ DELTA1_2 => 0, SET1 => 20, SET2 => 0 },
]
Note that I strongly advise against both hungarian notation and camel case when naming your variables. Perl is not strictly typed, and it has sigils like $
, @
and %
which indicate the type of every variable, so hungarian notation is superfluous at best, and also adds distracting and irrelevant noise. Also, by convention, capital letters are reserved for module names and global variables, so local identifiers should be in "snake case", i.e. lower-case letters and underscores. Many non-English speakers also find camel case difficult to parse
Upvotes: 1
Reputation: 385657
Just create a sub that does the comparison.
sub custom_cmp {
my $keys = shift;
for my $key (@$keys) {
my $cmp = $_[0]{$key} <=> $_[1]{$key};
return $cmp if $cmp;
}
return 0;
}
my @aSorted = sort { custom_cmp(\@aSortOrder, $a, $b) } @$paRecords;
The above makes two sub calls for each comparison. If we generate the compare function, we can reduce that to one.
sub make_custom_cmp {
my @keys = @_;
return sub($$) {
for my $key (@keys) {
my $cmp = $_[0]{$key} <=> $_[1]{$key};
return $cmp if $cmp;
}
return 0;
};
}
my $cmp = make_custom_cmp(@aSortOrder);
my @aSorted = sort $cmp @$paRecords;
We could go one further and flatten the loop through code generation. This is what a "proper" eval-based solution would look like. However, this level of optimization is hardly needed.
sub make_custom_cmp {
my @keys = @_;
my @cmps;
for $i (0..$#keys) {
push @cmps, "\$_[0]{\$keys[$i]} <=> \$_[1]{\$keys[$i]}"
}
return eval("sub($$) { ".( join(" || ", @cmps) )."}");
}
my $cmp = make_custom_cmp(@aSortOrder);
my @aSorted = sort $cmp @$paRecords;
In fact, the following is probably the most performant solution:
my @aSorted =
map $paRecords->[ unpack('N', substr($_, -4))-0x7FFFFFFF ],
sort
map pack('N*', map $_+0x7FFFFFFF, @{ $paRecords->[$_] }{@aSortOrder}, $_),
0..$#$paRecords;
Upvotes: 3
Reputation: 53478
Well, you're quite right - using eval
like that is a road to future pain.
The joy of 'sort' is that you can define a sort subroutine, that implicitly defines $a
and $b
and you can use whatever logic you desire to decide if it's a positive, negative or 'zero' comparison (equal). (e.g. like <=>
or cmp
do).
The trick here is - 'true' is anything non zero, so <=>
you can test for 'true' to see if there's a comparison to be made ( 4 <=> 4
is 'false')
So if you're just working numerically (you'd need to test for 'alphanumeric' and use cmp
in some cases there, but doesn't seem to apply to your data):
#!/usr/bin/env perl
use strict;
use warnings;
my $paRecords = [
{ 'SET1' => 48265, 'DELTA1_2' => -1, 'SET2' => 48264 },
{ 'SET1' => 8328, 'DELTA1_2' => -29, 'SET2' => 8299 },
{ 'SET1' => 20, 'DELTA1_2' => 0, 'SET2' => 0 },
{ 'SET1' => 10, 'DELTA1_2' => 0, 'SET2' => 0 }
];
#qw is 'quote-words' and just lets you space delimit terms.
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );
#note - needs to come after definition of `@order` but it can be re-written later as long as it's in scope.
#you can pass an order explicitly into the subroutine if you want though.
sub order_by {
for my $key (@order) {
#compare key
my $result = $a->{$key} <=> $b->{$key};
#return it and exit the loop if they aren't equal, otherwise
#continue iterating sort terms.
return $result if $result;
}
return 0; #all keys were similar, therefore return zero.
}
print join (",", @order), "\n";
foreach my $record ( sort {order_by} @$paRecords ) {
#use hash slice to order output in 'sort order'.
#optional, but hopefully clarifies what's going on.
print join (",", @{$record}{@order}), "\n";
}
This, given your data outputs:
DELTA1_2,SET1,SET2
-29,8328,8299
-1,48265,48264
0,10,0
0,20,0
Note, I've opted to use hash slice for your output, because otherwise hashes are unordered, and so your Dumper
output will be inconsistent (randomly ordered fields).
If you need to be a little more dynamic about your ordering, you can pass it into the sort-sub:
#!/usr/bin/env perl
use strict;
use warnings;
sub order_by {
for my $key (@_) {
#compare key
my $result = $a->{$key} <=> $b->{$key};
#return it and exit the loop if they aren't equal, otherwise
#continue iterating sort terms.
return $result if $result;
}
return 0; #all keys were similar, therefore return zero.
}
my $paRecords = [
{ 'SET1' => 48265, 'DELTA1_2' => -1, 'SET2' => 48264 },
{ 'SET1' => 8328, 'DELTA1_2' => -29, 'SET2' => 8299 },
{ 'SET1' => 20, 'DELTA1_2' => 0, 'SET2' => 0 },
{ 'SET1' => 10, 'DELTA1_2' => 0, 'SET2' => 0 }
];
#qw is 'quote-words' and just lets you space delimit terms.
#it's semantically the same as ( 'DELTA1_2', 'SET1', 'SET2' );
my @order = qw ( DELTA1_2 SET1 SET2 );
print join( ",", @order ), "\n";
foreach my $record ( sort {order_by ( @order ) } @$paRecords ) {
#use hash slice to order output in 'sort order'.
#optional, but hopefully clarifies what's going on.
print join( ",", @{$record}{@order} ), "\n";
}
Upvotes: 0