Reputation: 1
I am having trouble specifying the correct algorithm. I am iterating over an input file with loops, and the issue is in the last loop.
#!/usr/bin/perl
# Lab #4
# Judd Bittman
# http://www-users.cselabs.umn.edu/classes/Spring-2011/csci3003/index.php?page=labs
# this site has what needs to be in the lab
# lab4 is the lab instructions
# yeast protein is the part that is being read
use warnings;
use strict;
my $file = "<YeastProteins.txt";
open(my $proteins, $file);
my @identifier;
my @qualifier;
my @molecularweight;
my @pi;
while (my $line1 = <$proteins>) {
    #print $line1;
    chomp($line1);
    my @line = split(/\t/, $line1);
    push(@identifier, $line[0]);
    push(@qualifier, $line[1]);
    push(@molecularweight, $line[2]);
    push(@pi, $line[3]);
}
my $extreme = 0;
my $ex_index = 0;
for (my $index = 1; $index < 6805; $index++) {
    if (   defined($identifier[$index])
        && defined($qualifier[$index])
        && defined($molecularweight[$index])
        && defined($pi[$index])) {
        # print"$identifier[$index]\t:\t$qualifier[$index]:\t$molecularweight[$index]:\n$pi[$index]";
    }
    if (   defined($identifier[$index])
        && defined($qualifier[$index])
        && defined($pi[$index])) {
        if (abs($pi[$index] - 7) > $extreme && $qualifier[$index] eq "Verified")
        {
            $extreme = abs($pi[$index] - 7);
            $ex_index = $identifier[$index];
            print $extreme. " " . $ex_index . "\n";
        }
    }
}
print $extreme;
print "\n";
print $ex_index;
print "\n";
# the part above does part b of the assignment
# YLR204W,its part of the Mitochondrial inner membrane protein as well as a processor.
my $exindex = 0;
my $high = 0;
# two lines above and below is part c
# there is an error and I know there is something wrong
for (my $index = 1; $index < 6805; $index++) {
    if (   defined($qualifier[$index])
        && ($qualifier[$index]) eq "Verified"
        && defined($molecularweight[$index])
        && (abs($molecularweight[$index]) > $high)) {
        $high = (abs($molecularweight[$index]) > $high); # something wrong on this line, I know I wrote something wrong
        $exindex = $identifier[$index];
    }
}
print $high;
print "\n";
print $exindex;
print "\n";
close($proteins);
exit;
On the final loop I want my loop to hold on to the protein that is verified and has the highest molecular mass. This is in the input file. What code can I use to tell the program that I want to hold the highest number and its name? I feel like I am very close but I can't put my finger on it.
Upvotes: 0
Views: 230
Reputation: 4048
Just change:
$high = (abs($molecularweight[$index]) > $high);
To this:
$high = abs($molecularweight[$index]) if (abs($molecularweight[$index]) > $high);
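At the end of the loop, $high will hold the highest molecular weight among the verified entries in the @molecularweight array. (The trailing if repeats the test the enclosing if statement already makes, so it is harmless but redundant there.)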
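A minimal sketch of the corrected loop, run against a few made-up rows instead of YeastProteins.txt. The identifiers and weights are hypothetical; the variable names mirror the question's.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for the columns parsed from the file.
my @identifier      = ('ID_A', 'ID_B', 'ID_C', 'ID_D');
my @qualifier       = ('Verified', 'Dubious', 'Verified', 'Verified');
my @molecularweight = (1200, 9900, 4500, 3100);

my $high    = 0;
my $exindex = '';

for my $index (0 .. $#identifier) {
    # Skip rows that are missing data or are not verified.
    next unless defined $qualifier[$index] && $qualifier[$index] eq 'Verified';
    next unless defined $molecularweight[$index];

    if ($molecularweight[$index] > $high) {
        $high    = $molecularweight[$index];   # store the value, not the comparison result
        $exindex = $identifier[$index];        # remember which protein it belongs to
    }
}

print "$high\n";
print "$exindex\n";
```

With this sample data the loop keeps 4500 and ID_C: the 9900 row is skipped because it is not verified.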
Upvotes: 1
Reputation: 701
You likely want a more complex data structure, such as a nested hash. It's hard to give a solid example without more knowledge of the data, but say your first identifier were abc, the second one def, and so on:
my %protein_entries = (
    abc => {
        qualifier        => 'something',
        molecular_weight => 1234,
        pi               => 'something',
    },
    def => {
        qualifier        => 'something else',
        molecular_weight => 5678,
        pi               => 'something else',
    },
    # …
);
Then, rather than having several different arrays and keeping track of which element belongs to which, you get at an element by its identifier and field name, for example $protein_entries{abc}{molecular_weight}.
If you want the entry with the highest molecular weight, you can sort the identifiers by molecular weight in ascending order, then take the last one:
my $highest = (sort {
    $protein_entries{$a}{molecular_weight}
        <=>
    $protein_entries{$b}{molecular_weight}
} keys %protein_entries)[-1];
Basically, you're having problems with your algorithm because your data isn't structured properly.
In this example, $highest will hold def. Later you can go back and fetch $protein_entries{def}{molecular_weight} or any of the other keys in the anonymous hash referenced by $protein_entries{def}, so all the associated data is easy to recall.
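A small sketch extending the nested-hash idea to the question's "verified only" requirement. The entries, the ghi identifier, and the pi values are made up for illustration; filtering with grep before sorting is one possible approach, not the only one.

```perl
use strict;
use warnings;

# Hypothetical entries; identifiers and values are invented for this example.
my %protein_entries = (
    abc => { qualifier => 'Verified', molecular_weight => 1234, pi => 5.1 },
    def => { qualifier => 'Verified', molecular_weight => 5678, pi => 9.3 },
    ghi => { qualifier => 'Dubious',  molecular_weight => 9999, pi => 7.0 },
);

# Keep only the identifiers whose entry is marked Verified.
my @verified = grep { $protein_entries{$_}{qualifier} eq 'Verified' }
               keys %protein_entries;

# Sort ascending by molecular weight and take the last (heaviest) identifier.
my ($highest) = (sort {
    $protein_entries{$a}{molecular_weight}
        <=>
    $protein_entries{$b}{molecular_weight}
} @verified)[-1];

print "$highest\t$protein_entries{$highest}{molecular_weight}\n";
```

Here $highest ends up as def, since ghi is heavier but not verified. Sorting is convenient for a few thousand entries; a single pass that tracks the running maximum does the same job without sorting.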
Upvotes: 1
Reputation: 342
First, a note about Perl: in general, it's more common to use foreach-style loops than C-style indexed loops. For example:
for my $protein (@proteins) {
    # do something with $protein
}
(Your situation might require it, I just thought I'd mention this)
To address your specific issue though:
$high = (abs($molecularweight[$index])>$high);
$high is being set to the result of the boolean test being performed. Remove the >$high part (which is being tested in your if statement) and you'll likely end up with what you intended.
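To see the difference in isolation, here is a short sketch with a hypothetical weight value. It compares what the original assignment stores with what the intended assignment stores.

```perl
use strict;
use warnings;

my $high   = 0;
my $weight = 4500;    # hypothetical molecular weight

# Original form: assigns the result of the comparison, which is 1 when true.
my $buggy = ($weight > $high);

# Intended form: assigns the weight itself.
my $fixed = $weight;

print "buggy: $buggy\n";
print "fixed: $fixed\n";
```

In the question's loop this means $high becomes 1 after the first verified match and stays 1, so almost every later verified protein passes the test and $exindex ends up as the last one seen rather than the heaviest.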
Upvotes: 3