Reputation: 730
I have the following sequence of data in Perl:
143:0.0209090909090909
270:0.0909090909090909
32:0.0779090909090909
326:0.3009090909090909
How can I sort these lines numerically by the number before the colon, so that I get the following output?
32:0.0779090909090909
143:0.0209090909090909
270:0.0909090909090909
326:0.3009090909090909
Upvotes: 2
Views: 216
Reputation: 118156
Given the variety of answers, I figured some benchmarks might be appropriate. Please double-check the benchmarking code before trusting these numbers; I whipped the script up in a hurry.
#!/usr/bin/env perl
use 5.012;
use strict;
use warnings;
use Benchmark qw( cmpthese );

use constant DATA_SIZE => 1000;

cmpthese( -1, {
    right_thing   => sub { do_the_right_thing ( make_data(rt => DATA_SIZE) ) },
    re_extract    => sub { re_extract ( make_data(re => DATA_SIZE) ) },
    split_extract => sub { split_extract ( make_data(se => DATA_SIZE) ) },
    schxfrom_re   => sub { schxform_re ( make_data(sx => DATA_SIZE) ) },
    nop           => sub { nop ( make_data(nl => DATA_SIZE) ) },
});

# Rely on Perl's string-to-number conversion of the leading digits.
sub do_the_right_thing {
    my ($DATA) = @_;
    no warnings 'numeric';
    [ sort { $a <=> $b } @$DATA ];
}

# Extract the key with a regex on every comparison.
sub re_extract {
    my ($DATA) = @_;
    my $re = qr/^([0-9]+):/;
    [ sort { ($a =~ $re)[0] <=> ($b =~ $re)[0] } @$DATA ];
}

# Extract the key with split on every comparison.
sub split_extract {
    my ($DATA) = @_;
    [
        sort {
            # split returns both fields in list context, so keep only
            # element [0] (the key) of each operand.
            my ($x, $y) = map { (split /:/, $_, 2)[0] } $a, $b;
            $x <=> $y
        } @$DATA
    ];
}

# Schwartzian transform: extract each key once, sort, then unwrap.
sub schxform_re {
    my ($DATA) = @_;
    [
        map $_->[0],
        sort { $a->[1] <=> $b->[1] }
        map { [ $_, m/^([0-9]+):/ ] } @$DATA
    ];
}

# Baseline: copy the data without sorting.
sub nop {
    my ($DATA) = @_;
    [ @$DATA ];
}

# Build (and cache per key) $n random "integer:float" strings.
sub make_data {
    state %cache;
    my ($k, $n) = @_;
    unless (exists $cache{$k}) {
        $cache{ $k } = [
            map
                sprintf('%d:%f', int(rand 10_000), rand),
                1 .. $n
        ];
    }
    return $cache{ $k };
}
                Rate re_extract schxfrom_re split_extract right_thing  nop
re_extract    32.1/s         --        -85%          -92%        -98% -99%
schxfrom_re    213/s       565%          --          -46%        -87% -94%
split_extract  392/s      1121%         84%            --        -76% -89%
right_thing   1614/s      4933%        657%          312%          -- -53%
nop           3459/s     10685%       1522%          783%        114%   --
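Since the benchmark only measures speed, a quick sanity check that the strategies agree on the resulting order may be worth running first. The following is a minimal sketch, not part of the original script: the fixed sample in `@data` and the `%result` and `$all_agree` names are assumptions made for illustration, and the split variant is written so that it takes only the first field of each element.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical fixed sample with unique keys; the benchmark uses random data.
my @data = qw(143:0.02 270:0.09 32:0.07 326:0.30 7:0.50 1000:0.10);

my $re = qr/^([0-9]+):/;

# Run each sorting strategy once on the same input.
my %result = (
    right_thing => do {
        no warnings 'numeric';
        [ sort { $a <=> $b } @data ];
    },
    re_extract => [ sort { ($a =~ $re)[0] <=> ($b =~ $re)[0] } @data ],
    split_extract => [
        sort { (split /:/, $a, 2)[0] <=> (split /:/, $b, 2)[0] } @data
    ],
    schxform_re => [
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, /$re/ ] } @data
    ],
);

# Compare every result against the first strategy's output.
my $expected  = join ',', @{ $result{right_thing} };
my $all_agree = 1;
for my $name (sort keys %result) {
    my $got = join ',', @{ $result{$name} };
    my $ok  = $got eq $expected;
    $all_agree &&= $ok;
    printf "%-14s %s\n", $name, $ok ? 'matches' : "DIFFERS: $got";
}
```

If any line reports DIFFERS, the timing for that strategy is measuring something other than the intended sort.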
Upvotes: 0
Reputation: 7579
What, no Schwartzian transform yet?
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @data = qw(143:0.0209090909090909
270:0.0909090909090909
32:0.0779090909090909
326:0.3009090909090909);
my @sorted = map $_->[0], sort {$a->[1] <=> $b->[1]} map {[$_, m/^(.+):/]} @data;
print Dumper \@sorted;
Output:
$VAR1 = [
'32:0.0779090909090909',
'143:0.0209090909090909',
'270:0.0909090909090909',
'326:0.3009090909090909'
];
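For readers unfamiliar with the idiom, the one-line map/sort/map chain can be unrolled into three named steps. This is only an illustrative sketch: the intermediate names `@decorated` and `@ordered` are made up here, and the key regex uses `\d+` on the assumption that keys are plain decimal integers.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @data = qw(
    143:0.0209090909090909
    270:0.0909090909090909
    32:0.0779090909090909
    326:0.3009090909090909
);

# Step 1: decorate. Pair each string with its extracted key: [ original, key ].
my @decorated = map { [ $_, /^(\d+):/ ] } @data;

# Step 2: sort the pairs numerically by the precomputed key.
my @ordered = sort { $a->[1] <=> $b->[1] } @decorated;

# Step 3: undecorate. Keep only the original strings.
my @sorted = map { $_->[0] } @ordered;

print "$_\n" for @sorted;
```

The point of the extra bookkeeping is that each key is extracted once per element rather than once per comparison.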
Upvotes: 1
Reputation: 74272
The built-in sort function can be used:
#!/usr/bin/env perl
use strict;
use warnings;
my @data = qw(
143:0.0209090909090909
270:0.0909090909090909
32:0.0779090909090909
326:0.3009090909090909
);
my $match = qr/^(\d+):/;
@data = sort { ( $a =~ $match )[0] <=> ( $b =~ $match )[0] } @data;
print join( "\n", @data ), "\n";
Output:
32:0.0779090909090909
143:0.0209090909090909
270:0.0909090909090909
326:0.3009090909090909
Upvotes: 2
Reputation: 3744
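It does not matter that there are colons there.
When Perl converts a string to a number, it uses the leading numeric part and ignores everything from the first non-numeric character onward, so a plain numeric comparison will just do The Right Thing (the conversion triggers a 'numeric' warning, which is silenced here):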
#!/usr/bin/perl
use warnings;
use strict;
my @nums = qw(
143:0.0209090909090909
270:0.0909090909090909
32:0.0779090909090909
326:0.3009090909090909
);
{ no warnings 'numeric';
@nums = sort {$a <=> $b} @nums;
}
print "$_\n" for @nums;
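To see the conversion in isolation, here is a small sketch that forces two of the strings into numeric context. The `@samples` and `@keys` names are just for illustration, and the approach assumes the keys are plain decimal integers; a key such as `1e3` would be read as 1000.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @samples = ('143:0.0209090909090909', '32:0.0779090909090909');

my @keys;
{
    # Without this, each conversion emits an "isn't numeric" warning.
    no warnings 'numeric';
    # Adding 0 forces numeric context; parsing stops at the colon.
    @keys = map { $_ + 0 } @samples;
}

print "$_\n" for @keys;   # prints 143, then 32
```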
Upvotes: 7
Reputation: 393684
I'd simply use
sort -n < input.txt
Otherwise:
use strict;
use warnings;
my @lines = (<>);
print for sort {
my @aa = split(/:/, $a);
my @bb = split(/:/, $b);
1*$aa[0] <=> 1*$bb[0]
} @lines;
Upvotes: 1
Reputation: 9188
Something along the lines of:
my @sorted = sort { my ($a1) = split(/:/,$a);
my ($b1) = split(/:/,$b);
$a1 <=> $b1 } @data ;
$a1 and $b1 hold the first field of each of the two elements being compared, i.e. the text before the colon.
Upvotes: 1