user1958532
user1958532

Reputation: 123

Comparison of two arrays perl

I am new to Perl world, and I have a script that compares two arrays.

I use List::MoreUtils (each_arrayref) to do the comparison.

I have two questions:

1) Is there a way to compare two chunks of arrays (like natatime but for two arrayrefs) instead of comparing single element at a time as in each_arrayref?

The elements should be from the same index from each array.

The data structure is something like this:

{
  atr => [qw/ a b c d /],
  ats => [qw/ a b c d /],
  att => [qw/ a b c d /],
}

This is what I have got so far.

my @lists = keys %{$hash};

for (my $i = 0; $i <= @lists; $i++) {

  my $list_one = $lists[$i];
  my $one = $hash->{$list_one};

  for (my $j = 0 ; $j <= @lists ; $j++) {

    my $list_two = $lists[$j];
    my $two = $hash->{$list_two};

    my ($overlapping, $mismatch, $identity);
    my $match          = 0;
    my $non_match      = 0;
    my $count_ac_calls = 0;
    my $each_array     = each_arrayref($one, $two);

    while (my ($call_one, $call_two) = $each_array->()) {

      if ((defined $call_one) && (defined $call_two)) {
        if ($call_one eq $call_two) {
          $match++;
        }
        if ($call_one ne $call_two) {
          $non_match++;
        }
      }
    }    #end of while loop $each_array->()

    print "$list_one,$list_two,$match,$non_match";

  }    #end of for j loop
}    #end of for i loop

I would like to compare atr->ats, atr->att, ats->att. But with my current code, I get repetitions of comparison like ats->atr att->atr,att->ats.

2) How can I avoid those?

Upvotes: 2

Views: 2197

Answers (3)

Kenosis
Kenosis

Reputation: 6204

One option is to use Array::Compare to return the number of different array elements. Also, Math::Combinatorics is used to obtain only unique comparisons.

use strict;
use warnings;
use Array::Compare;
use Math::Combinatorics;

my %hash = (
    'atr' => [ 'a', 'b', 'c', 'd' ],
    'ats' => [ 'a', 'b', 'c', 'd' ],
    'att' => [ 'a', 'c', 'c', 'd' ],
);

my $comp = Array::Compare->new( DefFull => 1 );
my $combinat = Math::Combinatorics->new(
    count => 2,
    data  => [ keys %hash ],
);

while ( my ($key1, $key2) = $combinat->next_combination ) {
    my $diff = $comp->compare( \@{ $hash{$key1} }, \@{ $hash{$key2} } );
    print "$key1,$key2," . ( @{ $hash{$key1} } - $diff ) . ",$diff\n";
}

Output:

ats,att,3,1
ats,atr,4,0
att,atr,3,1

Upvotes: 1

Borodin
Borodin

Reputation: 126722

I'm not clear what your first question means. Do you want an iterator that, say, returns (('a','b','c'),('a','b','c')) instead of ('a','a')? If so then there isn't one available in a library, but it wouldn't be hard to write your own.

As for the second, the usual way to avoid items being compared with themselves is to change the inner loop to start after the current value of the first. Like so

for my $i (0..$#lists) {

  for my $j ($i+1..$#lists) {

  }

}

This works because A eq B is generally the same as B eq A, so there is no point in comparing an entry with one earlier in the list because the inverse comparison has already been made.

Note that it is much better Perl to write for loops this way than the messy C-style syntax. You also have a couple of bugs in

for (my $i = 0 ; $i <= @lists ; $i++) { ... }

because the maximum index of @lists is one less than the scalar value of @lists - usually coded as $#lists. The same problem exists in your loop for $j.

Update

Here is a refactoring of your program, written to include the ideas I have described and to be more Perlish. I hope it is useful to you.

use strict;
use warnings;

use List::MoreUtils 'each_arrayref';

my $hash = {
  atr => [qw/ a b c d /],
  ats => [qw/ a b c d /],
  att => [qw/ a b c d /],
};

my @keys = keys %{$hash};

for my $i (0 .. $#keys) {

  my $key1 = $keys[$i];
  my $list1 = $hash->{$key1};

  for my $j ($i+1 .. $#keys) {

    my $key2 = $keys[$j];
    my $list2 = $hash->{$key2};

    my ($match, $non_match) = (0, 0);
    my $iter = each_arrayref($list1, $list2);

    while (my ($call1, $call2) = $iter->()) {
      if (defined $call1 and defined $call2) {
        ($call1 eq $call2 ? $match : $non_match)++;
      }
    }

    print "$key1, $key2, $match, $non_match\n";
  }
}

Upvotes: 4

titanofold
titanofold

Reputation: 2960

You're not really taking advantage of the features Perl has to offer. Rather than use an error prone C-style loop, just use for my $var (LIST). You can also skip redundant list checking by skipping the self-checks, too. I've taken your script, made some alterations, and I'm sure you'll find it a bit easier to read.

use v5.16;
use warnings;
use List::MoreUtils qw{each_arrayref};

my $hash = {
  'atr' => [
    'a',
    'b',
    'c',
    'd'
   ],
  'ats'=>[
    'a',
    'b',
    'c',
    'd'
   ],
  'att' => [
    'a',
    'c',
    'c',
    'd'
   ],
};

for my $list_one (keys $hash) {
    my $one = $hash->{$list_one};

    for my $list_two (keys $hash) {
        next if $list_one ~~ $list_two;

        my $two = $hash->{$list_two};

        my ($match, $non_match);
        $match = $non_match = 0;

        my $each_array = each_arrayref($one, $two);
        while (my ($call_one, $call_two) = $each_array->()) {
            if($call_one && $call_two) {
                if($call_one eq $call_two) {
                    $match++;
                }
                else {
                    $non_match++;
                }
            }
        }

        print "$list_one,$list_two,$match,$non_match\n";
    }
}

You'll want to evaluate one at a time anyway so that you can add in some extra bits like the index location. (Yes, you could use the C-style loop, but that'd be a bit more difficult to read.)

Upvotes: -1

Related Questions