Reputation: 125
I have just read the post "Sorting alphanumeric hash keys in Perl?", but I am just starting with Perl and I don't understand it very clearly.
So I have a hash like this one:
my %hash = (
    "chr1" => 1,
    "chr2" => 3,
    "chr19" => 14,
    "chr22" => 1,
    "X" => 2,
);
I'm trying to obtain output like this:
chr1
chr2
chr19
chr22
But I'm obtaining output like this:
chr1
chr19
chr2
chr22
I have written this code, but it produces the wrong output shown above:
foreach my $chr (sort {$a cmp $b} keys(%hash)) {
my $total= $hash{$chr};
my $differentpercent= ($differenthash{$chr} / $total)*100;
my $round=(int($differentpercent*1000))/1000;
print "$chr\t$hash{$chr}\t$differenthash{$chr}\t$round\n";
}
It prints:
chr1 342421 7449 2.175
chr10 227648 5327 2.34
chr11 220415 4468 2.027
chr12 213263 4578 2.146
chr13 172379 3518 2.04
chr14 143534 2883 2.008
chr15 126441 2588 2.046
chr16 138239 3596 2.601
chr17 122137 3232 2.646
chr18 130275 3252 2.496
chr19 99876 2836 2.839
chr2 366815 8123 2.214
How can I fix this?
Upvotes: 3
Views: 7436
Reputation: 11
This is how I've been doing it for the longest time. I'm borrowing the code from Borodin's post for reference. Borodin's sort code is simple to follow if you understand regular expressions. I prefer putting complicated sorts into a sub because they get messy otherwise. Anyway, here you go:
my %hash = (
"chr1" => 1,
"chr2" => 3,
"chr19" => 14,
"chr22" => 1,
"X" => 2,
);
foreach my $key (sort {&sortalphanum} keys %hash)
{
print " $key = $hash{$key}\n";
}
sub sortalphanum
{
my @aa = $a =~ /^([A-Za-z]+)(\d*)/;
my @bb = $b =~ /^([A-Za-z]+)(\d*)/;
lc $aa[0] cmp lc $bb[0] or $aa[1] <=> $bb[1];
}
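As a variation, the sub name can be handed to sort directly instead of wrapping the call in a block. The following is only a sketch: by_alphanum is a made-up name, and treating a missing number as 0 is my assumption, not something the question specifies.

```perl
use strict;
use warnings;

my %hash = (chr1 => 1, chr2 => 3, chr19 => 14, chr22 => 1, X => 2);

# by_alphanum is a hypothetical name; sort sets $a and $b before each call.
sub by_alphanum {
    # Both groups may match the empty string, so the captures are always defined.
    my ($a_name, $a_num) = $a =~ /^([A-Za-z]*)(\d*)/;
    my ($b_name, $b_num) = $b =~ /^([A-Za-z]*)(\d*)/;

    # Compare the letters first, ignoring case.
    my $by_name = lc($a_name) cmp lc($b_name);
    return $by_name if $by_name;

    # Same letters: compare the numbers, treating a missing number as 0 (assumption).
    $a_num = 0 unless length $a_num;
    $b_num = 0 unless length $b_num;
    return $a_num <=> $b_num;
}

my @sorted = sort by_alphanum keys %hash;
print "$_\n" for @sorted;
```

With the hash above this should print chr1, chr2, chr19, chr22 and then X, because "chr" sorts before "x" once case is ignored.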
Upvotes: 1
Reputation: 126732
Update: note @Miller's comment below on some shortcomings of the Sort::Naturally module.
What you are asking for is a relatively complicated sort that splits each string into alphabetical and numeric fields, and then sorts the letters lexically and the numbers by value.
The module Sort::Naturally will do what you ask, or you can write something like this. You appear to have ignored the X key, so I have sorted it to the end using a case-independent sort.
use strict;
use warnings;
my %hash = map { $_ => 1 } qw(
chr22 chr20 chr19 chr13 chr21 chr16 chr12 chr10 chr18
chr17 chrY chr5 chrX chr8 chr14 chr6 chr3 chr9
chr1 chrM chr11 chr2 chr7 chr4 chr15
);
my @sorted_keys = sort {
my @aa = $a =~ /^([A-Za-z]+)(\d*)/;
my @bb = $b =~ /^([A-Za-z]+)(\d*)/;
lc $aa[0] cmp lc $bb[0] or $aa[1] <=> $bb[1];
} keys %hash;
print "$_\n" for @sorted_keys;
Output:
chr1
chr2
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr20
chr21
chr22
chrM
chrX
chrY
Using the Sort::Naturally module (you will probably have to install it) you could write this instead.
use strict;
use warnings;
use Sort::Naturally;
my %hash = map { $_ => 1 } qw(
chr22 chr20 chr19 chr13 chr21 chr16 chr12 chr10 chr18
chr17 chrY chr5 chrX chr8 chr14 chr6 chr3 chr9
chr1 chrM chr11 chr2 chr7 chr4 chr15
);
my @sorted_keys = nsort keys %hash;
print "$_\n" for @sorted_keys;
The output is identical to the above.
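If installing a module is not an option and the keys can have more than one letter/number field, the same idea can be extended to compare the strings chunk by chunk. This is a rough sketch, not a replacement for Sort::Naturally: natural_cmp is a hypothetical helper, the key chr1_random is invented for illustration, and comparing a digit chunk against a text chunk as text is an arbitrary choice on my part.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two strings chunk by chunk, where a chunk is
# either a run of digits or a run of non-digits.
sub natural_cmp {
    my ($x, $y) = @_;
    my @xs = $x =~ /(\d+|\D+)/g;
    my @ys = $y =~ /(\d+|\D+)/g;

    while (@xs && @ys) {
        my ($p, $q) = (shift @xs, shift @ys);
        my $r = ($p =~ /^\d/ && $q =~ /^\d/)
              ? $p <=> $q             # both chunks are numbers: compare by value
              : lc($p) cmp lc($q);    # otherwise compare as case-insensitive text
        return $r if $r;
    }

    # All shared chunks are equal: the string with chunks left over sorts last.
    return @xs <=> @ys;
}

# chr1_random is an invented key, used only to show a multi-field name.
my @keys   = qw(chr10 chr2 chrX chr1 chr1_random chr22 chrM);
my @sorted = sort { natural_cmp($a, $b) } @keys;
print "$_\n" for @sorted;
```

For these keys the order should come out as chr1, chr1_random, chr2, chr10, chr22, chrM, chrX.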
Upvotes: 6
Reputation: 61510
This can also be solved with a common Perl idiom called map-sort-map:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash = (
"chr1" => 1,
"chr2" => 3,
"chr19" => 14,
"chr22" => 1,
);
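# Use ?: to extract the number: "(/chr(\d+)/) || 0" evaluates the match in
# scalar context, which yields 1 for every matching key instead of the number.
my @sorted = map { $_->[0] }
             sort { $a->[1] <=> $b->[1] }
             map { [$_, /chr(\d+)/ ? $1 : 0] } keys %hash;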
print Dumper \@sorted;
__END__
$VAR1 = [
'chr1',
'chr2',
'chr19',
'chr22'
];
Note: Unlike @Borodin, I chose to sort X to the front (a key without a number gets 0). The question didn't specify where it should go, so I just picked an end.
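If you would rather have keys without a number at the end, the map-sort-map idiom (also known as the Schwartzian transform) can carry more than one sort field. A minimal sketch follows; sort_fields is a hypothetical helper, and the tie-break on the name is my own addition.

```perl
use strict;
use warnings;

my %hash = (chr1 => 1, chr2 => 3, chr19 => 14, chr22 => 1, X => 2);

# Hypothetical helper: returns (0, number) for keys that contain digits
# and (1, 0) for keys that do not, so the latter sort after the former.
sub sort_fields {
    my ($key) = @_;
    return $key =~ /(\d+)/ ? (0, $1) : (1, 0);
}

my @sorted =
    map  { $_->[0] }                     # 3. keep only the original key
    sort {
           $a->[1] <=> $b->[1]           # 2a. numbered keys first
        or $a->[2] <=> $b->[2]           # 2b. then by number
        or $a->[0] cmp $b->[0]           # 2c. then by name
    }
    map  { [ $_, sort_fields($_) ] }     # 1. compute the sort fields once per key
    keys %hash;

print "$_\n" for @sorted;
```

The point of the idiom is that the regex runs once per key rather than once per comparison, which matters more as the hash grows.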
Upvotes: 3
Reputation: 15121
You could try this:
#!/usr/bin/perl
use warnings;
use strict;
my %records;
while (<DATA>) {
my ($key, undef) = split;
$records{$key} = $_;
}
my @keys = sort {
my ($aa) = $a =~ /(\d+)/;
my ($bb) = $b =~ /(\d+)/;
$aa <=> $bb;
} keys %records;
foreach my $key (@keys) {
# print, not printf: the record must not be interpreted as a format string
print $records{$key};
}
__DATA__
chr1 342421 7449 2.175
chr10 227648 5327 2.34
chr11 220415 4468 2.027
chr12 213263 4578 2.146
chr13 172379 3518 2.04
chr14 143534 2883 2.008
chr15 126441 2588 2.046
chr16 138239 3596 2.601
chr17 122137 3232 2.646
chr18 130275 3252 2.496
chr19 99876 2836 2.839
chr2 366815 8123 2.214
Output:
$ perl t01.pl
chr1 342421 7449 2.175
chr2 366815 8123 2.214
chr10 227648 5327 2.34
chr11 220415 4468 2.027
chr12 213263 4578 2.146
chr13 172379 3518 2.04
chr14 143534 2883 2.008
chr15 126441 2588 2.046
chr16 138239 3596 2.601
chr17 122137 3232 2.646
chr18 130275 3252 2.496
chr19 99876 2836 2.839
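The same numeric comparison can be dropped into the loop from the question. This is a sketch only: the counts are copied from three rows of the question's output, chr_number is a hypothetical helper, and printf "%.3f" rounds where the original int(...) truncates, so the last digit may occasionally differ.

```perl
use strict;
use warnings;

# Sample counts copied from three rows of the output shown in the question.
my %hash          = (chr1 => 342421, chr10 => 227648, chr2 => 366815);
my %differenthash = (chr1 => 7449,   chr10 => 5327,   chr2 => 8123);

# Hypothetical helper: the first run of digits in a key, or 0 if there is none.
sub chr_number {
    my ($key) = @_;
    return $key =~ /(\d+)/ ? $1 : 0;
}

# Sort by the embedded number; fall back to a string comparison on ties.
my @ordered = sort { chr_number($a) <=> chr_number($b) or $a cmp $b } keys %hash;

foreach my $chr (@ordered) {
    my $percent = $differenthash{$chr} / $hash{$chr} * 100;
    printf "%s\t%d\t%d\t%.3f\n", $chr, $hash{$chr}, $differenthash{$chr}, $percent;
}
```

Keys with no digits at all (such as X) get 0 here and therefore come first; change the fallback value in chr_number if you want them elsewhere.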
Upvotes: 0