Reputation: 734
I need to sort hash key using perl also i need to allow duplicate in key. So that i planned to check exists
method in perl
if it is exists then i increment a last digit then i will store into hash.
I tried the following code:
use strict;
use warnings;
use iPerl::Basic qw(_save_file _open_file);
my $xml = $ARGV[0];
my ($xmlcnt,$backcnt,$refcnt,$name,$year) = "";
my %sort = ();
if(($#ARGV != 0) or(not -f "$xml") or($xml!~ m{\.xml$}i)){
print_exit("\t\tSYSTAX ERROR: <EXE> <xml File>\n\n")
};
$xmlcnt=_open_file($xml);
$xmlcnt =~ s{<back(?: [^>]+)?>(?:(?!</?back[ >]).)*</back>}{
$backcnt = $&;
while($backcnt =~ m{<ref(?: [^>]+)?>(?:(?!<ref[ >]).)*</ref>}igs){
$refcnt = $&;
$name = $1 if($refcnt =~ m{<person-group(?: [^>]+)?>((?:(?!</?person-group[ >]).)*)</person-group>}is);
$year = $1 if($refcnt =~ m{<year>((?:(?!</?year[ >]).)*)</year>}is);
$name =~ s{</?(?:string-name|surname|given-names)>}{}ig;
my $count = 1;
my $keys="$name $year\E$count";
if(exists ($sort{$keys})){
$keys =~ s{(\d)$}{my $icr=$1;$icr++;qq($icr)}e;
#print"$keys\n";
$sort{$keys}="$refcnt";
}
else
{
$sort{$keys}="$refcnt";
}
print join("\n",keys %sort);
}
qq($backcnt)
}igse;
my @keys = sort {
$sort{$a} <=> $sort{$b}
# or
# "\L$a" cmp "\L$b"
} keys %sort;
# print join("\n",@keys);
sub print_exit {
my $msg = shift;
#print "\n$msg";
exit;
}
Please can anyone tell me what went wrong here?
input:
thieooieroh
apple
apple
highefhfe
bufghifeh
output:
apple
apple
bufghifeh
highefhfe
thieooieroh
Thanks in advance.
Upvotes: 2
Views: 167
Reputation: 46245
From a very brief look at your code, it appears that you want to store refcounts as the values in your hash, with the ability to have multiple counts for a single key. This is easily doable by using a hash of arrays (commonly abbreviated to HoA). Each key must, by definition, be unique, but the associated value can be a reference, allowing you to store multiple items under that key, or to build even more complex data structures.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
my %hash;
while (my $line = <DATA>) {
chomp $line;
my ($key, $count) = split ',', $line;
push @{$hash{$key}}, $count;
}
for my $key (sort keys %hash) {
my $values = $hash{$key};
for (@$values) {
say "$key ($_)";
}
}
__DATA__
thieooieroh,1
apple,2
apple,3
highefhfe,4
bufghifeh,5
Output:
apple (2)
apple (3)
bufghifeh (5)
highefhfe (4)
thieooieroh (1)
If you're not actually concerned with storing multiple data items with each key, but only with the number of times each key appears, it's even simpler. Change the two loops in the above code to:
while (my $line = <DATA>) {
chomp $line;
$hash{$line}++;
}
for my $key (sort keys %hash) {
say $key for 1 .. $hash{$key};
}
and you get the output
apple
apple
bufghifeh
highefhfe
thieooieroh
As for the rest of your posted code, don't try to parse XML with regexes. Arbitrary XML cannot be parsed beyond a very crude first approximation by regular expressions because XML is not structurally "regular". There are many fine XML-parsing modules on CPAN which will parse your XML correctly for you, while also requiring far less effort from you than trying to write your own parser. Use one of them. Not regexes.
Upvotes: 2