Reputation: 252
I am trying to compare each element of column 1 in one list (screens.txt) with every element of column 1 in another list (new_list.txt) and, whenever there is a match, print the whole row from screens.txt to a separate text file (matched.txt). I managed to select the right columns, but the output contains rows from new_list.txt instead of screens.txt, and only one hit is found, so there also seems to be a problem with a loop.
Format of new_list.txt: first_column -> double tab -> the rest
I am very new to Perl programming. Any help would be very much appreciated!
Here is what I've got so far:
#!usr/bin/perl
use warnings;
$list = "new_list.txt";
$screens = "screens.txt";
$result = "matched.txt";
open (FA, "<$list") or die "Can't read source file $list: $!\n";
open (RES, ">$result") or die "Can't write on file $result: $!\n";
$n = 0;
$column = 10;
while ($line = <FA>) {
    @description = split (' ', $line);
    @ID = split ('\\t', $description[0]);
    #print just first column from the list
    # print "$ID[0]\n";
}
close (FA);
open (FA, "$screens") or die "Can't read source file $screens: $!\n";
while ($file = <FA>) {
    @table = split (' ', $file);
    @accession_no = split ('\ ', $table[0]);
    # print the first column from the list
    # print "$accession_no[0]\n";
}
open (FA, "<$list") or die "Can't read source file $list: $!\n";
while ($line = <FA>) {
    print "$line\n";
    @description = split (' ', $line);
    @ID = split ('\\t', $description[0]);
    if ($accession_no eq $ID[0]) {
        $n = $n+1;
        for ($i = 0; $i < $column; $i++) {
            print RES "$file";
        }
        print "\n";
    }
}
close (FA);
close (RES);
print "Hits found: $n\n";
Here is a sample of new_list.txt:
Q9UKA8 RCAN3_HUMAN 0
Q9UKA8-2 RCAN3_HUMAN 0
Q9UKA8-3 RCAN3_HUMAN 0
Q9UKA8-4 RCAN3_HUMAN 0
Q9UKA8-5 RCAN3_HUMAN 0
Q9GZP0 PDGFD_HUMAN 0
Here is a sample of screens.txt:
Q9GZP0 GDLDLASEST Scaffold attachment factor B2 (SAF-B2) SAFB2
Q9UKA8-5 QKAFNSSSFN Ran GTPase-activating protein 1 (RanGAP1) RANGAP1
I'm interested in checking if Q9GZP0 and Q9UKA8-5 (first column)
from screens.txt are in first column of new_list.txt and if they
are then print the whole line/row from screens.txt.
Thank you in advance!
Upvotes: 0
Views: 75
Reputation: 6798
Minimal code to filter screens.txt using the power of a map block:
#!/usr/bin/perl
use strict;
use warnings;

my $input1 = 'new_list.txt';
my $input2 = 'screens.txt';
my %seen;

# Remember every first-column ID from new_list.txt
open my $fh1, '<', $input1
    or die "Couldn't open $input1: $!";
map { $seen{$1} = $2 if /(\S+)\s+(.*)/ } <$fh1>;
close $fh1;

# Print each screens.txt row whose first column is a known ID
open my $fh2, '<', $input2
    or die "Couldn't open $input2: $!";
map { print if /(\S+)\s+(.*)/ and exists $seen{$1} } <$fh2>;
close $fh2;
Input: new_list.txt
Q9UKA8 RCAN3_HUMAN 0
Q9UKA8-2 RCAN3_HUMAN 0
Q9UKA8-3 RCAN3_HUMAN 0
Q9UKA8-4 RCAN3_HUMAN 0
Q9UKA8-5 RCAN3_HUMAN 0
Q9GZP0 PDGFD_HUMAN 0
Input: screens.txt
Q9GZP0 GDLDLASEST Scaffold attachment factor B2 (SAF-B2) SAFB2
Q9UKA8-5 QKAFNSSSFN Ran GTPase-activating protein 1 (RanGAP1) RANGAP1
Output:
Q9GZP0 GDLDLASEST Scaffold attachment factor B2 (SAF-B2) SAFB2
Q9UKA8-5 QKAFNSSSFN Ran GTPase-activating protein 1 (RanGAP1) RANGAP1
NOTE:
Linux -- Make the program executable with the command chmod u+x program.pl
Windows -- Run the program as perl program.pl
To redirect the output into a file:
Linux -- ./program.pl > matched.txt
Windows -- perl program.pl > matched.txt
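If you would rather have the script write matched.txt itself instead of redirecting, something along these lines might work. It is only a sketch: write_matches is a hypothetical helper name, and I am assuming the first column in both files is delimited by whitespace (tabs or spaces) and that the three file names are passed on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): copy every row of
# $screens whose first column also occurs as a first column in $list
# into $out, and return the number of rows written.
sub write_matches {
    my ($list, $screens, $out) = @_;

    # Collect the first whitespace-delimited field of each list row.
    my %wanted;
    open my $in, '<', $list or die "Can't read $list: $!\n";
    while (my $line = <$in>) {
        my ($id) = split ' ', $line;
        $wanted{$id} = 1 if defined $id;   # skip blank lines
    }
    close $in;

    my $hits = 0;
    open my $scr, '<', $screens or die "Can't read $screens: $!\n";
    open my $res, '>', $out     or die "Can't write $out: $!\n";
    while (my $line = <$scr>) {
        my ($id) = split ' ', $line;
        next unless defined $id && exists $wanted{$id};
        print {$res} $line;                # whole row, newline included
        $hits++;
    }
    close $scr;
    close $res or die "Can't close $out: $!\n";
    return $hits;
}

# Assumed invocation: perl program.pl new_list.txt screens.txt matched.txt
if (!caller && @ARGV == 3) {
    my $n = write_matches(@ARGV);
    print "Hits found: $n\n";
}
```

The hash lookup means each file is read only once, so the order of rows in matched.txt follows screens.txt.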
Upvotes: 1
Reputation: 3222
See if this helps you:
#!/usr/bin/perl
use strict;
use warnings;
my $file1 = "file_pr1.txt";   # list of IDs to look for (e.g. new_list.txt)
my $file2 = "file_pr2.txt";   # rows to filter (e.g. screens.txt)
my $resulted_list = "result_list.txt";
my (@description, @ID, @data);
open (my $FA, "<$file1") or die "Can't read source file $file1: $!\n";
while (my $line = <$FA>) {
    chomp($line);
    @description = split (/\s+/, $line);
    push (@ID, $description[0]);   # keep only the first column
}
close($FA);
my %params = map { $_ => 1 } @ID; # add each element to a hash for fast lookup
open (my $RES, ">$resulted_list") or die "Open as write error : $!\n";
open (my $FB, "<$file2") or die "Can't read source file $file2: $!\n";
while (my $line = <$FB>) {
    chomp($line);
    @data = split (/\s+/, $line);
    print $RES $line."\n" if(exists($params{$data[0]})); # write to result file
}
close($FB);
close($RES);
Upvotes: 0