Reputation: 123
I am new to Perl world, and I have a script that compares two arrays.
I use List::MoreUtils
(each_arrayref
) to do the comparison.
I have two questions:
1) Is there a way to compare two chunks of arrays (like natatime but for two arrayrefs) instead of comparing single element at a time as in each_arrayref?
The elements should be from the same index from each array.
The data structure is something like this:
{
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
}
This is what I have got so far.
my @lists = keys %{$hash};
for (my $i = 0; $i <= @lists; $i++) {
my $list_one = $lists[$i];
my $one = $hash->{$list_one};
for (my $j = 0 ; $j <= @lists ; $j++) {
my $list_two = $lists[$j];
my $two = $hash->{$list_two};
my ($overlapping, $mismatch, $identity);
my $match = 0;
my $non_match = 0;
my $count_ac_calls = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if ((defined $call_one) && (defined $call_two)) {
if ($call_one eq $call_two) {
$match++;
}
if ($call_one ne $call_two) {
$non_match++;
}
}
} #end of while loop $each_array->()
print "$list_one,$list_two,$match,$non_match";
} #end of for j loop
} #end of for i loop
I would like to compare atr->ats, atr->att, ats->att. But with my current code, I get repetitions of comparison like ats->atr att->atr,att->ats.
2) How can I avoid those?
Upvotes: 2
Views: 2197
Reputation: 6204
One option is to use Array::Compare to return the number of different array elements. Also, Math::Combinatorics is used to obtain only unique comparisons.
use strict;
use warnings;
use Array::Compare;
use Math::Combinatorics;
my %hash = (
'atr' => [ 'a', 'b', 'c', 'd' ],
'ats' => [ 'a', 'b', 'c', 'd' ],
'att' => [ 'a', 'c', 'c', 'd' ],
);
my $comp = Array::Compare->new( DefFull => 1 );
my $combinat = Math::Combinatorics->new(
count => 2,
data => [ keys %hash ],
);
while ( my ($key1, $key2) = $combinat->next_combination ) {
my $diff = $comp->compare( \@{ $hash{$key1} }, \@{ $hash{$key2} } );
print "$key1,$key2," . ( @{ $hash{$key1} } - $diff ) . ",$diff\n";
}
Output:
ats,att,3,1
ats,atr,4,0
att,atr,3,1
Upvotes: 1
Reputation: 126722
I'm not clear what your first question means. Do you want an iterator that, say, returns (('a','b','c'),('a','b','c'))
instead of
('a','a')
? If so then there isn't one available in a library, but it wouldn't be hard to write your own.
As for the second, the usual way to avoid items being compared with themselves is to change the inner loop to start after the current value of the first. Like so
for my $i (0..$#lists) {
for my $j ($i+1..$#lists) {
}
}
This works because A eq B
is generally the same as B eq A
, so there is no point in comparing an entry with one earlier in the list because the inverse comparison has already been made.
Note that it is much better Perl to write for
loops this way than the messy C-style syntax. You also have a couple of bugs in
for (my $i = 0 ; $i <= @lists ; $i++) { ... }
because the maximum index of @lists
is one less than the scalar value of @lists
- usually coded as $#lists
. The same problem exists in your loop for $j
.
Update
Here is a refactoring of your program, written to include the ideas I have described and to be more Perlish. I hope it is useful to you.
use strict;
use warnings;
use List::MoreUtils 'each_arrayref';
my $hash = {
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
};
my @keys = keys %{$hash};
for my $i (0 .. $#keys) {
my $key1 = $keys[$i];
my $list1 = $hash->{$key1};
for my $j ($i+1 .. $#keys) {
my $key2 = $keys[$j];
my $list2 = $hash->{$key2};
my ($match, $non_match) = (0, 0);
my $iter = each_arrayref($list1, $list2);
while (my ($call1, $call2) = $iter->()) {
if (defined $call1 and defined $call2) {
($call1 eq $call2 ? $match : $non_match)++;
}
}
print "$key1, $key2, $match, $non_match\n";
}
}
Upvotes: 4
Reputation: 2960
You're not really taking advantage of the features Perl has to offer. Rather than use an error prone C-style loop, just use for my $var (LIST)
. You can also skip redundant list checking by skipping the self-checks, too. I've taken your script, made some alterations, and I'm sure you'll find it a bit easier to read.
use v5.16;
use warnings;
use List::MoreUtils qw{each_arrayref};
my $hash = {
'atr' => [
'a',
'b',
'c',
'd'
],
'ats'=>[
'a',
'b',
'c',
'd'
],
'att' => [
'a',
'c',
'c',
'd'
],
};
for my $list_one (keys $hash) {
my $one = $hash->{$list_one};
for my $list_two (keys $hash) {
next if $list_one ~~ $list_two;
my $two = $hash->{$list_two};
my ($match, $non_match);
$match = $non_match = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if($call_one && $call_two) {
if($call_one eq $call_two) {
$match++;
}
else {
$non_match++;
}
}
}
print "$list_one,$list_two,$match,$non_match\n";
}
}
You'll want to evaluate one at a time anyway so that you can add in some extra bits like the index location. (Yes, you could use the C-style loop, but that'd be a bit more difficult to read.)
Upvotes: -1