Reputation: 55
When looping through two arrays, I am confused about how to move the pointer through one loop but keep it constant in another. So for example:
A T C G T C G A G C G
A C G T C C T G T C G
So A in the first array matches A in the second array so we move on to next elements. But since the T doesn't match the C in the 2nd index, I want the program to compare that T to the next G in array 2 and so on until it finds the matching T.
my ($array1ref, $array2ref) = @_;
my @array1 = @$array1ref;
my @array2= @$array2ref;
my $count = 0;
foreach my $element (@array1) {
foreach my $element2 (@array2) {
if ($element eq $element2) {
$count++;
}else { ???????????
}
Upvotes: 4
Views: 1941
Reputation: 23975
As you may know, your problem is called Sequence Alignment. There are well-developed algorithms to do this efficiently, and one such module Algorithm::NeedlemanWunsch is available on CPAN. Here's how you might apply it to your problem.
#!/usr/bin/perl
use Algorithm::NeedlemanWunsch;
my $arr1 = [qw(A T C G T C G A G C G)];
my $arr2 = [qw(A C G T C C T G T C G)];
my $matcher = Algorithm::NeedlemanWunsch->new(sub {@_==0 ? -1 : $_[0] eq $_[1] ? 1 : -2});
my (@align1, @align2);
my $result = $matcher->align($arr1, $arr2,
{
align => sub {unshift @align1, $arr1->[shift]; unshift @align2, $arr2->[shift]},
shift_a => sub {unshift @align1, $arr1->[shift]; unshift @align2, '.'},
shift_b => sub {unshift @align1, '.'; unshift @align2, $arr1->[shift]},
});
print join("", @align1), "\n";
print join("", @align2), "\n";
That prints out an optimal solution in terms of the costs we specified in the constructor:
ATCGT.C.GAGCG
A.CGTTCGG.TCG
A very different method from the one in your original question, but I think it's worth knowing about.
Upvotes: 0
Reputation: 17936
It seems like you could do this pretty easily with a 'grep', if you're guaranteed that array2 is always as long as or longer than array1. Something like this:
sub align
{
my ($array1, $array2) = @_;
my $index = 0;
return grep
{
$array1->[$index] eq $array2->[$_] ? ++$index : 0
} 0 .. scalar( @$array2 ) - 1;
}
Basically, the grep is saying "Return me the list of increasing indices into array2 that match contiguous elements from array1."
If you run the above with this test code, you can see it returns the expected alignment array:
my @array1 = qw(A T C G T C G A G C G);
my @array2 = qw(A C G T C C T G T C G);
say join ",", align \@array1, \@array2;
This outputs the expected mapping:
0,3,4,7,8,9,10. That list means that @array1[0 .. 6]
correspond to @array2[0,3,4,7,8,9,10]
.
(Note: You need to use Modern::Perl
or similar to use say
.)
Now, you haven't really said what you need the output of the operation to be. I've assumed you wanted this mapping array. If you just need a count of the number of elements skipped in @array2
while aligning it with @array1
, you can still use the grep
above, but instead of the list, just return scalar(@$array2) - $index
at the end.
Upvotes: 0
Reputation: 238076
You could use a while
loop to search for matches. If you find a match, advance in both arrays. If you don't, advance the second array. At the end you could print the remaining unmatched characters from the first array:
# [1, 2, 3] is a reference to an anonymous array (1, 2, 3)
# qw(1, 2, 3) is shorthand quoted-word for ('1', '2', '3')
my $arr1 = [qw(A T C G T C G A G C G)];
my $arr2 = [qw(A C G T C C T G T C G)];
my $idx1 = 0;
my $idx2 = 0;
# Find matched characters
# @$arr_ref is the size of the array referenced by $arr_ref
while ($idx1 < @$arr1 && $idx2 < @$arr2) {
my $char1 = $arr1->[$idx1];
my $char2 = $arr2->[$idx2];
if ($char1 eq $char2) {
# Matched character, advance arr1 and arr2
printf("%s %s -- arr1[%d] matches arr2[%d]\n", $char1, $char2, $idx1, $idx2);
++$idx1;
++$idx2;
} else {
# Unmatched character, advance arr2
printf(". %s -- skipping arr2[%d]\n", $char2, $idx2);
++$idx2;
}
}
# Remaining unmatched characters
while ($idx1 < @$arr1) {
my $char1 = $arr1->[$idx1];
printf("%s . -- arr1[%d] is beyond the end of arr2\n", $char1, $idx1);
$idx1++;
}
The script prints:
A A -- arr1[0] matches arr2[0]
. C -- skipping arr2[1]
. G -- skipping arr2[2]
T T -- arr1[1] matches arr2[3]
C C -- arr1[2] matches arr2[4]
. C -- skipping arr2[5]
. T -- skipping arr2[6]
G G -- arr1[3] matches arr2[7]
T T -- arr1[4] matches arr2[8]
C C -- arr1[5] matches arr2[9]
G G -- arr1[6] matches arr2[10]
A . -- arr1[7] is beyond the end of arr2
G . -- arr1[8] is beyond the end of arr2
C . -- arr1[9] is beyond the end of arr2
G . -- arr1[10] is beyond the end of arr2
Upvotes: 3
Reputation: 385685
Nested loops makes no sense. You don't want to loop over anything more than once.
You didn't specify what you wanted to happen after you resync, so you'll want to start with the following and tailor it to your needs.
my ($array1, $array2) = @_;
my $idx1 = 0;
my $idx2 = 0;
while ($idx1 < @$array1 && $idx2 < @$array2) {
if ($array1->[$idx1] eq $array2->[$idx2]) {
++$idx1;
++$idx2;
} else {
++$idx2;
}
}
...
As is, the above snippet will leave $idx1
at the last index it couldn't (eventually) resync. If instead you want to stop as soon as you first resync, you want
my ($array1, $array2) = @_;
my $idx1 = 0;
my $idx2 = 0;
my $mismatch = 0;
while ($idx1 < @$array1 && $idx2 < @$array2) {
if ($array1->[$idx1] eq $array2->[$idx2]) {
last if $mismatched;
++$idx1;
++$idx2;
} else {
++$mismatched;
++$idx2;
}
}
...
Upvotes: 2
Reputation: 126
Here is one possibility. It will use indexes to go through both lists.
my @array1 = qw(A T C G T C G A G C G);
my @array2 = qw(A C G T C C T G T C G);
my $count = 0;
my $idx1 = 0;
my $idx2 = 0;
while(($idx1 < scalar @array1) && ($idx2 < scalar @array2)) {
if($array1[$idx1] eq $array2[$idx2]) {
print "Match of $array1[$idx1] array1 \@ $idx1 and array2 \@ $idx2\n";
$idx1++;
$idx2++;
$count++;
} else {
$idx2++;
}
}
print "Count = $count\n";
Upvotes: 0
Reputation: 5348
This is to complex a condition for for loops use while loops instead
my ($array1ref, $array2ref) = @_;
my @array1 = @$array1ref;
my @array2= @$array2ref;
my $count = 0;
my ($index, $index2) = (0,0);
#loop while indexs are in arrays
while($index <= @#array1 && $index2 <= @#array2) {
if($array1[$index] eq $array2[$index2]) {
$index++;
$index2++;
} else {
#increment index until we find a match
$index2++ until $array1[$index] eq $array2[$index2];
}
}
Upvotes: 0
Reputation: 57600
The foreach
loops won't cut it: We'll either want to loop while there are available elements in both arrays, or iterate through all indices, which we can increment as we like:
EL1: while (defined(my $el1 = shift @array1) and @array2) {
EL2: while(defined(my $el2 = shift @array2)) {
++$count and next EL1 if $el1 eq $el2; # break out of inner loop
}
}
or
my $j = 0; # index of @array2
for (my $i = 0; $i <= $#array1; $i++) {
$j++ until $j > $#array or $array1[$i] eq $array2[$j];
last if $j > $#array;
$count++;
}
or any combination.
Upvotes: 0