Reputation: 11
sorry if it seems obvious but Im pretty new at Perl and programming and I've been working over a week and can't get it done. My idea is simple. I've got a .csv where I've got the names in the first column, a number from -1 to 1 in the second and a position on the third. Then another file where I have got the names (line starts with >) and the info with 80 characters per line.
What I want to do is keep the name lines of the first file and grab the 'position' given from -20 to +60. But I cannot get it to work and I've got to the point where don't know where to follow.
use strict; #read file line by line
use warnings;
my $outputfile = "Output1.txt";
my $filename = "InputP.txt";
my $inputfasta = "Inputfasta.txt";
open my $fh, '<', $filename or die "Couldn't open '$filename'";
open my $fh2, '>', $outputfile or die "Couldn't create '$outputfile'";
open my $fh3, '<', $inputfasta or die "Couldn't open '$inputfasta'";
my $Psequence = 0;
my $seqname = 0;
while (my $line = <$fh>) {
chomp $line;
my $length = index ($line, ",");
$seqname = substr ($line, 0, $length);
my $length2 = index ($line, ",", $length);
my $score = substr ($line, $length +1, $length2);
my $length3 = index ($line, ",", $length2);
my $position = substr ($line, $length2 +1, $length3);
#print $fh2 "$seqname"."\t"."$score"."\t"."$position"."\n"; }
my $Rlength2 = index ($score, ",");
my $Rscore = substr ($score, 0, $Rlength2);
#print "$Rscore"."\n";}
while (my $linea = <$fh3>){ #same order.
chomp $linea;
if ($linea=~/^>(.+)/) {
print $fh3 "\n"."$linea"."\n"; }
else { $linea =~ /^\s*(.*)\s*$/;
chomp $linea;
print $fh3 "$linea". "\n"; }
}
if ($Rscore >= 0.5){
$Psequence = substr ($linea, -20, 81);
print "$seqname"."\n"."$Psequence";}
}
Upvotes: 1
Views: 67
Reputation: 241898
Please, learn to indent the code correctly. Then the error will be more obvious:
while (my $linea = <$fh3>){ #same order.
chomp $linea;
if ($linea =~ /^>(.+)/) {
print $fh3 "\n$linea\n";
} else {
# Commented out as it does nothing.
# $linea =~ /^\s*(.*)\s*$/;
# chomp $linea;
print $fh3 "$linea\n";
}
}
if ($Rscore >= 0.5){
$Psequence = substr $linea, -20, 81;
print "$seqname\n$Psequence";
}
$linea
exists only in the while loop, but you try to use it in the following paragraph, too. The variable disappears when the loop ends.
Upvotes: 3
Reputation: 385829
Create a hash from the CSV where the key is the name and the value is the position.
use Text::CSV_XS qw( );
my %pos_by_name;
{
open(my $fh, '<', $input_qfn)
or die("Can't open $input_qfn: $!\n");
my $csv = Text::CSV_XS->new({ auto_diag => 1, binary => 1 });
while (my $row = $csv_in->getline($fh)) {
$pos_by_name{ $row->[0] } = $row->[2];
}
}
Then, it's just a question of extracting the names from the other file, and using the hash to find the associated position.
open(my $fh, '<', $fasta_qfn)
or die("Can't open $fasta_qfn: $!\n");
while (<$fh>) {
chomp;
my ($name) = /^>(.*)/
or next;
my $pos = $pos_by_name{$name};
if (!defined($pos)) {
die("Can't find position for $name\n");
}
... Do something with $name and $pos ...
}
Upvotes: -1