Elle

Reputation: 97

"readline() on unopened filehandle" error in Perl

I'm having trouble fixing an error in my code. I'm trying to read the input file and pull out only what is between the square brackets []. However, I get the error "readline() on unopened filehandle", and I'm not sure what I'm doing wrong with the file handle in the while () loop.

#!/usr/bin/perl
use warnings;

my $file = '';
my $newfile = '';
open($newfile, '>', 'newmyosin.fasta') or die "Can't create file", $!;
open($file, '<', 'myosin.fasta') or die "Can't open file", $!;

while(<$file>) {
        print;
        chomp;
        if ( $_ =~ /\[(.+)\]/ ) {
                $file = $1;
        }
}

For example, one part of my input file looks like this:

>gi|115527082|ref|NP_005954.3| myosin-1 [Homo sapiens] 
>gi|226694176|sp|P12882.3|MYH1_HUMAN RecName: Full=Myosin-1; AltName: Full=Myosin heavy chain 1; AltName: Full=Myosin heavy chain 2x; Short=MyHC-2x; AltName: Full=Myosin heavy chain IIx/d; Short=MyHC-IIx/d; AltName: Full=Myosin heavy chain, skeletal muscle, adult 1 [Homo sapiens] 
>gi|119610411|gb|EAW90005.1| hCG1986604, isoform CRA_b [Homo sapiens]
MSSDSEMAIFGEAAPFLRKSERERIEAQNKPFDAKTSVFVVDPKESFVKATVQSREGGKVTAKTEAGATVTVKDDQVFPM
NPPKYDKIEDMAMMTHLHEPAVLYNLKERYAAWMIYTYSGLFCVTVNPYKWLPVYNAEVVTAYRGKKRQEAPPHIFSISD
NAYQFMLTDRENQSILITGESGAGKTVNTKRVIQYFATIAVTGEKKKEEVTSGKMQGTLEDQIISANPLLEAFGNAKTVR
NDNSSRFGKFIRIHFGTTGKLASADIETYLLEKSRVTFQLKAERSYHIFYQIMSNKKPDLIEMLLITTNPYDYAFVSQGE
ITVPSIDDQEELMATDSAIEILGFTSDERVSIYKLTGAVMHYGNMKFKQKQREEQAEPDGTEVADKAAYLQNLNSADLLK
ALCYPRVKVGNEYVTKGQTVQQVYNAVGALAKAVYDKMFLWMVTRINQQLDTKQPRQYFIGVLDIAGFEIFDFNSLEQLC
INFTNEKLQQFFNHHMFVLEQEEYKKEGIEWTFIDFGMDLAACIELIEKPMGIFSILEEECMFPKATDTSFKNKLYEQHL
GKSNNFQKPKPAKGKPEAHFSLIHYAGTVDYNIAGWLDKNKDPLNETVVGLYQKSAMKTLALLFVGATGAEAEAGGGKKG
GKKKGSSFQTVSALFRENLNKLMTNLRSTHPHFVRCIIPNETKTPGAMEHELVLHQLRCNGVLEGIRICRKGFPSRILYA
DFKQRYKVLNASAIPEGQFIDSKKASEKLLGSIDIDHTQYKFGHTKVFFKAGLLGLLEEMRDEKLAQLITRTQAMCRGFL
ARVEYQKMVERRESIFCIQYNVRAFMNVKHWPWMKLYFKIKPLLKSAETEKEMANMKEEFEKTKEELAKTEAKRKELEEK
MVTLMQEKNDLQLQVQAEADSLADAEERCDQLIKTKIQLEAKIKEVTERAEDEEEINAELTAKKRKLEDECSELKKDIDD
LELTLAKVEKEKHATENKVKNLTEEMAGLDETIAKLTKEKKALQEAHQQTLDDLQAEEDKVNTLTKAKIKLEQQVDDLEG
SLEQEKKIRMDLERAKRKLEGDLKLAQESTMDIENDKQQLDEKLKKKEFEMSGLQSKIEDEQALGMQLQKKIKELQARIE
ELEEEIEAERASRAKAEKQRSDLSRELEEISERLEEAGGATSAQIEMNKKREAEFQKMRRDLEEATLQHEATAATLRKKH
ADSVAELGEQIDNLQRVKQKLEKEKSEMKMEIDDLASNMETVSKAKGNLEKMCRALEDQLSEIKTKEEEQQRLINDLTAQ
RARLQTESGEYSRQLDEKDTLVSQLSRGKQAFTQQIEELKRQLEEEIKAKSALAHALQSSRHDCDLLREQYEEEQEAKAE

From this, I would like to create a new file, "newmyosin.fasta", containing the organism name within the brackets in each header (e.g. [Homo sapiens]). The Perl code should read the myosin.fasta file, which holds multiple samples like the one above, pick out the name within the brackets [], and write it to a new file (e.g. newmyosin.fasta).

Thanks!

Upvotes: 1

Views: 2976

Answers (2)

Matt Jacob

Reputation: 6553

As I said in my comment, you're re-assigning your filehandle to the capture group in the middle of reading the file. Since you opened a separate file for output, I assume you want to print the matching strings to that file instead.

Having said that, your requirements are pretty vague, your sample input doesn't look accurate, and you didn't provide any sample output, but if I understand your intent correctly, I think this is what you want:

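use strict;
use warnings;

my $file = 'myosin.fasta';
my $tmp = "$file.tmp";

# Write the matches to a temporary file while reading the original
open(my $new, '>', $tmp) or die "Can't open $tmp: $!";
open(my $old, '<', $file) or die "Can't open $file: $!";

while (<$old>) {
    # Capture the text between the first pair of square brackets
    if (/\[([^]]+)\]/) {
        print $new "$1\n";
    }
}

close($old);
close($new) or die "Can't close $tmp: $!";

# Keep the original as a backup, then move the new file into place
rename($file, "$file.bak") or die "Can't rename $file: $!";
rename($tmp, $file) or die "Can't rename $tmp: $!";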

Contents of myosin.fasta after running the script:

Homo sapiens
Homo sapiens
Homo sapiens

Upvotes: 0

TLP

Reputation: 67900

When you do this:

$file = $1;

You overwrite your file handle with the captured string. After that you can no longer read from it, which produces the error mentioned.
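To see the failure in isolation, here is a minimal sketch that reproduces it without touching the disk. The in-memory sample data, the `@warnings` array, and the `$lines_read` counter are assumptions added for illustration; only the loop body mirrors the question's code. Because the question's script does not enable `strict`, the sketch turns off `strict 'refs'` around the loop so that Perl treats the string as a filehandle name and warns instead of dying.

```perl
use strict;
use warnings;

# In-memory stand-in for myosin.fasta (hypothetical sample data).
my $data = ">seq1 myosin-1 [Homo sapiens]\n>seq2 myosin-2 [Mus musculus]\n";
open(my $file, '<', \$data) or die "Can't open in-memory file: $!";

# Collect warnings so they can be inspected after the loop.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $lines_read = 0;
{
    no strict 'refs';    # the question's script does not use strict
    while (<$file>) {
        $lines_read++;
        if (/\[(.+)\]/) {
            $file = $1;  # bug: the handle is replaced by the captured string
        }
    }
}

print "Lines read: $lines_read\n";
print "Warning: $_" for @warnings;
```

The loop stops after the first matching line: the next `<$file>` tries to read from a filehandle named after the captured string, which was never opened, so `readline` warns and returns `undef`.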

You should of course save the match somewhere else, e.g.:

my $match = $1;

And probably also print it (adding a newline, since the capture does not include one):

print $newfile "$match\n";
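Putting these pieces together, a corrected version of the question's script might look like the sketch below. The helper name `extract_organisms` and the `-e` guard are assumptions introduced for illustration; the file names come from the question. The pattern uses `[^\]]+` rather than `.+` so that the match stops at the first closing bracket on a line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy the bracketed text from each matching line of $in_fh to $out_fh,
# one name per line. Returns the number of names written.
# (Hypothetical helper, not part of the original script.)
sub extract_organisms {
    my ($in_fh, $out_fh) = @_;
    my $count = 0;
    while (my $line = <$in_fh>) {
        # [^\]]+ stops at the first closing bracket
        if ($line =~ /\[([^\]]+)\]/) {
            my $match = $1;    # keep the capture in its own variable
            print {$out_fh} "$match\n";
            $count++;
        }
    }
    return $count;
}

# File names are taken from the question; run only when the input exists.
if (-e 'myosin.fasta') {
    open(my $in,  '<', 'myosin.fasta')    or die "Can't open myosin.fasta: $!";
    open(my $out, '>', 'newmyosin.fasta') or die "Can't create newmyosin.fasta: $!";
    my $n = extract_organisms($in, $out);
    close($in);
    close($out) or die "Can't close newmyosin.fasta: $!";
    print "Wrote $n organism name(s)\n";
}
```

Declaring the handles with `my` inside `open` and never assigning to them again avoids the original problem, and `use strict` would have turned the silent reassignment into a fatal error at the next read.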

Upvotes: 2
